Smart at what cost? Characterising Mobile Deep Neural Networks in the
wild
- URL: http://arxiv.org/abs/2109.13963v1
- Date: Tue, 28 Sep 2021 18:09:29 GMT
- Authors: Mario Almeida, Stefanos Laskaridis, Abhinav Mehrotra, Lukasz Dudziak,
Ilias Leontiadis, Nicholas D. Lane
- Abstract summary: This paper is the first holistic study of Deep Neural Network (DNN) usage in the wild.
We analyse over 16k of the most popular apps in the Google Play Store.
We measure the models' energy footprint, as a core cost dimension of any mobile deployment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With smartphones' omnipresence in people's pockets, Machine Learning (ML) on
mobile is gaining traction as devices become more powerful. With applications
ranging from visual filters to voice assistants, intelligence on mobile comes
in many forms and facets. However, Deep Neural Network (DNN) inference remains
a compute-intensive workload, with devices struggling to support intelligence
without sacrificing responsiveness. On the one hand, there is significant
research on reducing model runtime requirements and supporting deployment on
embedded devices. On the other hand, the drive to maximise the accuracy of a
task is supported by deeper and wider neural networks, making mobile deployment
of state-of-the-art DNNs a moving target.
In this paper, we perform the first holistic study of DNN usage in the wild
in an attempt to track deployed models and match how these run on widely
deployed devices. To this end, we analyse over 16k of the most popular apps in
the Google Play Store to characterise their DNN usage and performance across
devices of different capabilities, both across tiers and generations.
Simultaneously, we measure the models' energy footprint, as a core cost
dimension of any mobile deployment. To streamline the process, we have
developed gaugeNN, a tool that automates the deployment, measurement and
analysis of DNNs on devices, with support for different frameworks and
platforms. Results from our study paint the landscape of deep learning
deployments on smartphones and indicate their popularity among app developers.
Furthermore, our study shows the gap between bespoke techniques and real-world
deployments, and the need for optimised deployment of deep learning models in a
highly dynamic and heterogeneous ecosystem.
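The abstract describes automating the deployment, measurement and analysis of DNNs on devices, including latency and energy as core cost dimensions. A minimal sketch of such a measurement harness is shown below; gaugeNN's actual API is not given in the abstract, so all names here are illustrative, and the constant-power energy estimate is a stated assumption (real tools sample power rails instead):

```python
import statistics
import time

def benchmark(run_inference, warmup=3, repeats=20, avg_power_watts=None):
    """Time a single-inference callable and optionally estimate its energy.

    Hypothetical harness in the spirit of gaugeNN (illustrative, not the
    tool's real interface). avg_power_watts is an assumed average device
    power draw during inference.
    """
    # Warm-up runs discard cold-start effects (JIT compilation, caches).
    for _ in range(warmup):
        run_inference()

    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_inference()
        latencies.append(time.perf_counter() - start)

    median_s = statistics.median(latencies)  # median is robust to outliers
    result = {"median_latency_ms": median_s * 1e3}
    if avg_power_watts is not None:
        # Energy ≈ power × time: a crude per-inference footprint proxy.
        result["energy_mj_per_inference"] = avg_power_watts * median_s * 1e3
    return result
```

Usage: `benchmark(lambda: model(x), avg_power_watts=2.0)` would report the median latency of `model(x)` and, under the assumed 2 W draw, an energy estimate in millijoules per inference.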
Related papers
- EPAM: A Predictive Energy Model for Mobile AI [6.451060076703027]
We introduce a comprehensive study of mobile AI applications considering different deep neural network (DNN) models and processing sources.
We measure the latency, energy consumption, and memory usage of all the models using four processing sources.
Our study highlights important insights, such as how mobile AI behaves in different applications (vision and non-vision) using CPU, GPU, and NNAPI.
arXiv Detail & Related papers (2023-03-02T09:11:23Z)
- Towards Implementing Energy-aware Data-driven Intelligence for Smart Health Applications on Mobile Platforms [4.648824029505978]
On-device deep learning frameworks are proficient in utilizing computing resources on mobile platforms seamlessly.
However, energy resources in a mobile device are typically limited.
We introduce a new framework built on energy-aware, adaptive model comprehension and realization.
arXiv Detail & Related papers (2023-02-01T15:34:24Z)
- Enabling Deep Learning on Edge Devices [2.741266294612776]
Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, and reinforcement learning.
High-performing DNNs, however, rely heavily on intensive resource consumption.
Recently, emerging intelligent applications, e.g., AR/VR, mobile assistants, and the Internet of Things, require deploying DNNs on resource-constrained edge devices.
In this dissertation, we study four edge intelligence scenarios: Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems.
arXiv Detail & Related papers (2022-10-06T20:52:57Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling scheme that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers [88.52500757894119]
Self-attention-based vision transformers (ViTs) have emerged as a highly competitive architectural alternative to convolutional neural networks (CNNs) in computer vision.
We introduce EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs.
arXiv Detail & Related papers (2022-05-06T18:17:19Z)
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications [46.97774949613859]
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI).
However, their superior performance comes at the considerable cost of computational complexity.
This paper provides an overview of efficient deep learning methods, systems, and applications.
arXiv Detail & Related papers (2022-04-25T16:52:48Z)
- FPGA-optimized Hardware Acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures [7.085772863979686]
Deep neural networks (DNNs) have led to large strides in various Artificial Intelligence (AI) inference tasks, such as object and speech recognition.
However, deploying such AI models across commodity devices faces significant challenges.
We present techniques for achieving real-time performance following a cross-stack approach.
arXiv Detail & Related papers (2021-06-21T11:23:12Z)
- Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization [56.3111706960878]
High-end mobile platforms serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications.
However, the constrained computation and storage resources on these devices pose significant challenges for real-time inference execution.
We propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices.
arXiv Detail & Related papers (2020-04-22T03:18:23Z)
- The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding [97.85957811603251]
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks.
A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm.
arXiv Detail & Related papers (2020-02-19T03:05:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.