We consider a setting where coalition members wish to use a pool of heterogeneous resources (e.g., a computational cloud or edge network) to complete their tasks. These tasks have varying resource requirements and different values, and they arrive dynamically over time. Given this, we are interested in designing a resource allocation mechanism that decides how to dispatch these tasks to the resources in order to maximise the total value derived. Existing work has demonstrated that reinforcement learning is a promising approach in these types of settings. However, that work has neglected the possibility that task owners may behave strategically and misreport the characteristics of their tasks when doing so is in their interest. In this paper, we use the framework of mechanism design to address this issue, and we show how to design a mechanism based on reinforcement learning that ensures several desirable properties, including dominant-strategy incentive compatibility and individual rationality.