Model Pruning Enables Efficient Federated Learning on Edge Devices

Abstract Computation and communication resources at the tactical edge are scarce, which makes it challenging to deploy and train models for distributed analytics in coalitions. To enable efficient data analytics, we focus on a technique known as model pruning, which iteratively removes insignificant weights from a deep neural network so that the resulting model is smaller and runs faster, for both training and inference. We consider a scenario in which a military base has more processing power than the edge devices, as well as some data on which it can train an initial version of an analytics model. The model is pruned to a smaller size during training and then sent to edge devices (carried by soldiers), so that at the edge, 1) the model runs faster than the original, allowing the system to provide real-time analytics, and 2) the model can be adapted (further trained) more efficiently using data collected locally at the edge devices. In this work, we present experimental results of pruning models and running them (both training and inference) on various edge devices. We also discuss how model pruning can advance the understanding of why and how deep neural networks work in general.
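
The iterative pruning-and-fine-tuning loop described above can be sketched as follows. This is a minimal illustration of magnitude-based pruning in PyTorch, not the paper's actual implementation; the model, data, and sparsity schedule below are hypothetical stand-ins.

    import torch
    import torch.nn as nn

    def magnitude_prune(model: nn.Module, sparsity: float) -> dict:
        """Zero out the smallest-magnitude fraction of each weight tensor.

        Returns binary masks so that pruned weights can be held at zero
        during subsequent fine-tuning.
        """
        masks = {}
        for name, param in model.named_parameters():
            if param.dim() < 2:            # skip biases and norm parameters
                continue
            k = int(sparsity * param.numel())
            if k == 0:
                continue
            # Threshold below which weights are treated as insignificant.
            threshold = param.detach().abs().flatten().kthvalue(k).values
            mask = (param.detach().abs() > threshold).float()
            param.data.mul_(mask)          # remove insignificant weights
            masks[name] = mask
        return masks

    def apply_masks(model: nn.Module, masks: dict) -> None:
        """Re-apply masks after each optimizer step so pruned weights stay zero."""
        for name, param in model.named_parameters():
            if name in masks:
                param.data.mul_(masks[name])

    # Hypothetical stand-ins for the base-trained model and local edge data.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()
    x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))

    # Iterative schedule: prune progressively more, fine-tuning in between.
    for target_sparsity in (0.2, 0.4, 0.6, 0.8):
        masks = magnitude_prune(model, target_sparsity)
        for _ in range(5):                 # brief fine-tuning between rounds
            loss = criterion(model(x), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            apply_masks(model, masks)      # keep pruned weights at zero

Because the masks are re-applied after every gradient step, fine-tuning preserves the sparsity obtained during pruning, which is what would make subsequent local adaptation at the edge devices cheaper in the scenario above.
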
Authors
  • Yuang Jiang (Yale)
  • Shiqiang Wang (IBM US)
  • Bong Jun Ko (IBM US)
  • Wei-Han Lee (IBM US)
  • Leandros Tassiulas (Yale)
Date Sep-2019
Venue Annual Fall Meeting of the DAIS ITA, 2019