A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices
- URL: http://arxiv.org/abs/2202.06512v1
- Date: Mon, 14 Feb 2022 07:00:31 GMT
- Title: A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices
- Authors: Qiyang Zhang, Xiang Li, Xiangying Che, Xiao Ma, Ao Zhou, Mengwei Xu,
Shangguang Wang, Yun Ma, Xuanzhe Liu
- Abstract summary: We build a benchmark that includes 6 representative DL libs and 15 diversified DL models.
We then perform extensive experiments on 10 mobile devices, which help reveal a complete landscape of the current mobile DL libs ecosystem.
We find that the best-performing DL lib is severely fragmented across different models and hardware.
- Score: 12.342282138576348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deploying deep learning (DL) on mobile devices has been a notable trend in
recent years. To support fast inference of on-device DL, DL libraries play a
role as critical as algorithms and hardware do. Unfortunately, no prior work
dives deep into the ecosystem of modern DL libs and provides quantitative
results on their performance. In this paper, we first build a comprehensive
benchmark that includes 6 representative DL libs and 15 diversified DL models.
We then perform extensive experiments on 10 mobile devices, which help reveal a
complete landscape of the current mobile DL libs ecosystem. For example, we
find that the best-performing DL lib is severely fragmented across different
models and hardware, and the gap between those DL libs can be rather huge. In
fact, the impacts of DL libs can overwhelm the optimizations from algorithms or
hardware, e.g., model quantization and GPU/DSP-based heterogeneous computing.
Finally, atop the observations, we summarize practical implications to
different roles in the DL lib ecosystem.
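The paper's central measurement, per-model inference latency under a given DL lib, can be approximated with a simple timing harness. A minimal sketch (not the authors' benchmark code; `run_inference` is a hypothetical stand-in for a real library call such as a TFLite interpreter invocation):

```python
import time
import statistics

def benchmark(run_inference, warmup=5, runs=50):
    """Warm up, then time repeated calls and report latency in ms.

    `run_inference` is a stand-in for one forward pass through a DL lib
    (e.g. a TFLite Interpreter `invoke()`); any zero-argument callable works.
    """
    for _ in range(warmup):  # warm-up: cache fills, delegate/JIT init
        run_inference()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1e3)  # milliseconds
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "p90_ms": samples[int(0.9 * len(samples))],
    }

# Illustrative use with a dummy workload standing in for a real model:
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
```

Reporting the median rather than the mean keeps one-off scheduler hiccups from skewing cross-lib comparisons.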
Related papers
- MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases [81.70591346986582]
We introduce MobileAIBench, a benchmarking framework for evaluating Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices.
MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices.
arXiv Detail & Related papers (2024-06-12T22:58:12Z)
- A Survey of Deep Learning Library Testing Methods [33.62859142913532]
Deep learning (DL) libraries undertake the underlying optimization and computation.
DL libraries are not immune to bugs, which can pose serious threats to users' personal property and safety.
This paper provides an overview of the testing research related to various DL libraries.
arXiv Detail & Related papers (2024-04-27T11:42:13Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system unlocking the potential of vast untapped consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, and the heterogeneity and variability of peers and devices.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- Benchmark Assessment for DeepSpeed Optimization Library [1.7839986996686321]
Deep Learning (DL) models are widely used in machine learning due to their performance and ability to deal with large datasets.
The size of such datasets and the complexity of DL models cause training to consume large amounts of resources and time.
Many recent libraries and applications have been introduced to deal with DL complexity and efficiency issues.
arXiv Detail & Related papers (2022-02-12T04:52:28Z)
- A First Look at Class Incremental Learning in Deep Learning Mobile Traffic Classification [68.11005070665364]
We explore Incremental Learning (IL) techniques to add new classes to models without full retraining, thereby speeding up the model update cycle.
We consider iCaRL, a state-of-the-art IL method, and MIRAGE-2019, a public dataset with traffic from 40 Android apps.
Although our analysis reveals that these techniques are still in their infancy, IL is a promising research direction on the roadmap towards automated DL-based traffic analysis systems.
arXiv Detail & Related papers (2021-07-09T14:28:16Z)
- Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads [86.62083829086393]
This work introduces the Tensor Processing Primitives (TPP), a programming abstraction striving for efficient, portable implementations of Deep Learning workloads with high productivity.
TPPs define a compact, yet versatile set of 2D-tensor operators (or a virtual ISA), which can be utilized as building-blocks to construct complex operators on high-dimensional tensors.
We demonstrate the efficacy of our approach using standalone kernels and end-to-end DL-workloads expressed entirely via TPPs that outperform state-of-the-art implementations on multiple platforms.
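The composition idea behind TPPs, building complex operators out of a small set of 2D-tensor primitives, can be illustrated without the actual library. A minimal sketch in plain Python (the primitive names `tpp_gemm`, `tpp_add`, and `tpp_relu` are invented for illustration and are not TPP's real API):

```python
# Three tiny 2D-tensor "primitives", stand-ins for a virtual ISA.
def tpp_gemm(a, b):
    """2D matrix multiply: (m x k) @ (k x n) -> (m x n)."""
    k = len(b)
    return [[sum(a[i][p] * b[p][j] for p in range(k))
             for j in range(len(b[0]))] for i in range(len(a))]

def tpp_add(a, b):
    """Element-wise 2D add of same-shape matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def tpp_relu(a):
    """Element-wise 2D ReLU."""
    return [[max(0.0, x) for x in row] for row in a]

# A higher-level operator (a dense layer) built purely from the primitives.
def dense_layer(x, w, bias_rows):
    return tpp_relu(tpp_add(tpp_gemm(x, w), bias_rows))

# [[1, -2]] @ [[1], [1]] = [[-1.0]]; plus bias 0.5 -> [[-0.5]]; ReLU -> [[0.0]]
y = dense_layer([[1.0, -2.0]], [[1.0], [1.0]], [[0.5]])
```

Because only the three primitives touch raw data, a backend could swap each one for a hand-tuned kernel without changing the composed operators.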
arXiv Detail & Related papers (2021-04-12T18:35:49Z)
- An Empirical Study on Deployment Faults of Deep Learning Based Mobile Applications [7.58063287182615]
Mobile Deep Learning (DL) apps integrate DL models trained using large-scale data with DL programs.
This paper presents the first comprehensive study on the deployment faults of mobile DL apps.
We construct a fine-granularity taxonomy consisting of 23 categories of fault symptoms and distill common fix strategies for different fault types.
arXiv Detail & Related papers (2021-01-13T08:19:50Z)
- A Survey of Deep Active Learning [54.376820959917005]
Active learning (AL) attempts to maximize the performance gain of a model while labeling the fewest samples.
Deep learning (DL) is data-hungry, requiring a large supply of data to optimize its massive parameters.
Deep active learning (DAL) has emerged to combine the strengths of both.
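One common DAL query strategy, uncertainty sampling, labels the pool examples the model is least confident about. A minimal sketch (assumes the model exposes class-probability vectors; all names are illustrative, not from the survey):

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` most uncertain pool examples.

    `pool_probs[i]` is the model's class-probability vector for the
    i-th unlabeled example; the highest-entropy examples are queried.
    """
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]

# The 50/50 prediction is the most uncertain, so it is queried first:
picks = select_for_labeling([[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]], budget=1)
```

In a full DAL loop the selected examples would be labeled, added to the training set, and the model retrained before the next query round.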
arXiv Detail & Related papers (2020-08-30T04:28:31Z)
- PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high-performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
- The Deep Learning Compiler: A Comprehensive Survey [16.19025439622745]
We perform a comprehensive survey of existing DL compilers by dissecting their commonly adopted designs in detail.
Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects.
arXiv Detail & Related papers (2020-02-06T07:29:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.