Abstract
We consider the allocation of dynamically arriving tasks with varying values and deadlines to resources within a cloud computing system. Reinforcement learning is a promising approach for this, but existing work has neglected that task owners may misreport their requirements strategically when this is to their benefit. To address this, we apply mechanism design and propose a novel mechanism based on reinforcement learning called Mono-RAwR that is incentive compatible and individually rational (i.e., truthful reporting and participation are incentivised). We evaluate our mechanism empirically on synthetic data, and we show that Mono-RAwR outperforms other state-of-the-art incentive compatible mechanisms by up to 70%, achieving about 80% of a greedy offline heuristic.