Abstract
Small networked devices with low computing power (such as phones, surveillance cameras, and IoT devices) have become an essential part of our daily lives. They collect large amounts of data and apply machine learning to draw useful inferences from it. Although these devices can process the data they collect by sending it to a remote server, security and privacy constraints sometimes make it preferable to process the data locally on the device. However, popular and commonly used tools and frameworks for machine intelligence still lack the ability to make proper use of the heterogeneous computing resources available on these low-end devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity Android devices while running deep learning models. We leverage the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can easily benefit from the heterogeneous computing resources on mobile devices without the need for any extra tools. We evaluate our system on different Android phone models to study the trade-offs of running different neural network operations on the GPU. Our results show that the GPUs on these phones can offer substantial performance gains for matrix multiplication; consequently, models that involve multiplication of large matrices can run much faster (approximately 3 times faster in our experiments) with GPU support.
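The abstract does not include the system's implementation details; purely as an illustration of the kind of RenderScript-based GPU offload it describes, the sketch below runs a single-precision matrix multiplication through Android's built-in BLAS intrinsic (ScriptIntrinsicBLAS.SGEMM). The RsMatMul class and multiply helper are hypothetical names for this example, not part of the paper's TensorFlow extension, and the RenderScript runtime decides whether the kernel actually executes on the GPU or falls back to the CPU.

```java
import android.content.Context;
import android.renderscript.Allocation;
import android.renderscript.Element;
import android.renderscript.RenderScript;
import android.renderscript.ScriptIntrinsicBLAS;
import android.renderscript.Type;

public final class RsMatMul {
    // Computes C = A * B via RenderScript's BLAS intrinsic, which the runtime
    // may dispatch to the device GPU. A is MxK, B is KxN, C is MxN, all
    // row-major float32 arrays.
    public static float[] multiply(Context ctx, float[] a, float[] b, int m, int k, int n) {
        RenderScript rs = RenderScript.create(ctx);
        ScriptIntrinsicBLAS blas = ScriptIntrinsicBLAS.create(rs);

        // For 2D allocations, dimX is the number of columns and dimY the rows.
        Allocation allocA = Allocation.createTyped(rs, Type.createXY(rs, Element.F32(rs), k, m));
        Allocation allocB = Allocation.createTyped(rs, Type.createXY(rs, Element.F32(rs), n, k));
        Allocation allocC = Allocation.createTyped(rs, Type.createXY(rs, Element.F32(rs), n, m));
        allocA.copyFrom(a);
        allocB.copyFrom(b);

        // SGEMM computes C = alpha * A * B + beta * C; here alpha = 1, beta = 0.
        blas.SGEMM(ScriptIntrinsicBLAS.NO_TRANSPOSE, ScriptIntrinsicBLAS.NO_TRANSPOSE,
                1.0f, allocA, allocB, 0.0f, allocC);

        float[] c = new float[m * n];
        allocC.copyTo(c);
        rs.destroy();
        return c;
    }
}
```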