Abstract
We consider the allocation of dynamically arriving tasks with varying values and deadlines to resources within a cloud computing system. Reinforcement learning is a promising approach to this problem, but existing work has neglected that task owners may strategically misreport their requirements when doing so benefits them. To address this, we apply mechanism design and propose Mono-RAwR, a novel mechanism based on reinforcement learning that is incentive compatible and individually rational (i.e., it incentivises truthful reporting and participation). We evaluate our mechanism empirically on synthetic data and show that Mono-RAwR outperforms other state-of-the-art incentive compatible mechanisms by up to 70%, achieving about 80% of the performance of a greedy offline heuristic.