In military settings, collaborative decision making is often used to solve large and complex problems under time pressure. Here, workers can adopt particularly promising solutions found by colleagues (imitation) or independently explore new solutions (innovation). However, there exists a trade-off between imitation and individual innovation which has a consequential impact on the quality of the final solution found. Therefore, the design of effective collaboration strategies is an important problem when trying to find good solutions to large and complex problems. This paper formulates this strategy design as a reinforcement learning problem and presents preliminary results showing that, over short time periods, reinforcement learning outperforms most handcrafted heuristics that are typically used in these settings.