A Survey of Large-Scale Deep Learning Serving System Optimization:
Challenges and Opportunities
- URL: http://arxiv.org/abs/2111.14247v1
- Date: Sun, 28 Nov 2021 22:14:10 GMT
- Title: A Survey of Large-Scale Deep Learning Serving System Optimization:
Challenges and Opportunities
- Authors: Fuxun Yu, Di Wang, Longfei Shangguan, Minjia Zhang, Xulong Tang,
Chenchen Liu, Xiang Chen
- Abstract summary: This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems.
Deep Learning (DL) models have achieved superior performance in many application domains, including vision, language, medical, commercial ads, entertainment, etc.
- Score: 24.38071862662089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) models have achieved superior performance in many
application domains, including vision, language, medical, commercial ads,
entertainment, etc. With this rapid development, both DL applications and the
underlying serving hardware have demonstrated strong scaling trends, i.e.,
Model Scaling and Compute Scaling: for example, recent pre-trained models reach
hundreds of billions of parameters with ~TB-level memory consumption, while the
newest GPU accelerators provide hundreds of TFLOPS. With both scaling trends,
new problems and challenges emerge in DL inference serving systems, which are
gradually trending towards Large-scale Deep Learning Serving systems (LDS).
This survey aims to summarize and categorize the emerging challenges and
optimization opportunities for large-scale deep learning serving systems. By
providing a novel taxonomy, summarizing the computing paradigms, and
elaborating on recent technical advances, we hope this survey can shed light
on new optimization perspectives and motivate novel works in large-scale deep
learning system optimization.
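To make the scaling figures above concrete, here is a back-of-the-envelope sketch of the weight-memory arithmetic; the 175B parameter count and dtype sizes are illustrative assumptions, not numbers from the survey.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights (FP16 = 2 bytes/param)."""
    return num_params * bytes_per_param / 1e9

# A hypothetical 175B-parameter pre-trained model:
print(weight_memory_gb(175e9))      # 350.0 -> ~350 GB in FP16, weights alone
print(weight_memory_gb(175e9, 4))   # 700.0 -> ~700 GB in FP32
```

Activations, KV caches, and framework overhead push the serving footprint well beyond the weights themselves, which is where the ~TB-level figure comes from.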
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
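As a loose, illustrative sketch of the latent-diffusion sampling idea such weight-generation methods build on (the `denoiser` and `decoder` networks and the crude denoising step are hypothetical stand-ins, not D2NWG's actual sampler):

```python
import torch

# Start from noise in a latent space, iteratively denoise, then decode the
# latent into a flat weight vector. Both networks are assumed pre-trained.

@torch.no_grad()
def sample_weights(denoiser, decoder, latent_dim: int = 512, steps: int = 50):
    z = torch.randn(1, latent_dim)             # begin from pure Gaussian noise
    for t in reversed(range(1, steps + 1)):    # simple fixed-step schedule
        eps = denoiser(z, torch.tensor([t]))   # predict the noise at step t
        z = z - eps / steps                    # crude Euler-style denoising
    return decoder(z)                          # decode latent -> weight vector
```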
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation [20.851925464903804]
This paper introduces a novel learning paradigm, Dynamic Sparse Learning, tailored for recommendation models.
DSL innovatively trains a lightweight sparse model from scratch, periodically evaluating and dynamically adjusting each weight's significance.
Our experimental results underline DSL's effectiveness, significantly reducing training and inference costs while delivering comparable recommendation performance.
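A minimal sketch of the periodic prune-and-regrow cycle that dynamic sparse training methods of this kind build on; the magnitude criterion and random regrowth are simplifications, since DSL adjusts weights by a learned significance measure:

```python
import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, frac: float = 0.1):
    """One periodic sparsity update: drop the lowest-magnitude active weights,
    then re-activate an equal number of currently masked positions."""
    active = mask.bool()
    k = max(1, int(frac * active.sum().item()))
    # Prune: zero the mask for the k smallest-magnitude active weights.
    threshold = weight[active].abs().kthvalue(k).values
    mask[active & (weight.abs() <= threshold)] = 0
    # Regrow: re-activate k masked positions at random (DSL instead uses a
    # significance criterion; random regrowth keeps this sketch simple).
    idle = (~mask.bool()).nonzero()
    picked = idle[torch.randperm(len(idle))[:k]]
    mask[tuple(picked.t())] = 1
    return mask
```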
arXiv Detail & Related papers (2024-02-05T10:16:20Z)
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems [14.355768064425598]
Generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data.
However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency.
This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective.
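One concrete source of this memory pressure is the attention KV cache, which grows with batch size and sequence length. A back-of-the-envelope estimate, with illustrative model dimensions roughly in the 7B class:

```python
def kv_cache_gb(layers: int = 32, heads: int = 32, head_dim: int = 128,
                seq_len: int = 4096, batch: int = 8, bytes_per_elem: int = 2):
    """KV cache bytes = 2 (K and V) * layers * heads * head_dim * seq_len * batch."""
    return 2 * layers * heads * head_dim * seq_len * batch * bytes_per_elem / 1e9

# Eight concurrent 4k-token requests against a hypothetical 7B-class model:
print(f"{kv_cache_gb():.1f} GB of KV cache alone")  # ~17.2 GB, before weights
```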
arXiv Detail & Related papers (2023-12-23T11:57:53Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized visual prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- A Survey of Serverless Machine Learning Model Inference [0.0]
Recent developments in Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products.
This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems.
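A recurring pattern in serverless inference is caching the model in a module-level variable so warm invocations skip the expensive load; a minimal, provider-agnostic sketch (the handler signature and toy model are hypothetical):

```python
import json

_model = None  # survives across warm invocations of the same function instance

def _load_model():
    # Stand-in for an expensive load (e.g., pulling weights from object
    # storage); in a real function this step dominates cold-start latency.
    return lambda features: sum(features) / len(features)

def handler(event, context=None):
    """Generic serverless entry point: load the model once, then serve."""
    global _model
    if _model is None:                    # cold start: pay the load cost once
        _model = _load_model()
    features = json.loads(event["body"])["features"]
    return {"statusCode": 200,
            "body": json.dumps({"prediction": _model(features)})}
```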
arXiv Detail & Related papers (2023-11-22T18:46:05Z)
- On Efficient Training of Large-Scale Deep Learning Models: A Literature Review [90.87691246153612]
The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.
The use of large-scale models trained on vast amounts of data holds immense promise for practical applications.
With the increasing demands on computational capacity, a comprehensive summary of acceleration techniques for training large-scale deep learning models is still much anticipated.
arXiv Detail & Related papers (2023-04-07T11:13:23Z)
- Systems for Parallel and Distributed Large-Model Deep Learning Training [7.106986689736828]
Some recent Transformer models span hundreds of billions of learnable parameters.
These designs have introduced new scale-driven systems challenges for the DL space.
This survey will explore the large-model training systems landscape, highlighting key challenges and the various techniques that have been used to address them.
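A rough per-GPU memory estimate shows why such models force sharded, multi-device training; the accounting below follows the common mixed-precision Adam figure of ~16 bytes per parameter when states are fully sharded ZeRO-style (all numbers illustrative):

```python
def train_memory_per_gpu_gb(num_params: float = 175e9, shards: int = 64,
                            bytes_per_param: int = 16) -> float:
    """FP16 weights (2 B) + FP16 grads (2 B) + FP32 Adam states (12 B)
    ~= 16 B/param, divided across `shards` devices when fully sharded."""
    return num_params * bytes_per_param / shards / 1e9

print(f"{train_memory_per_gpu_gb():.0f} GB per GPU")  # ~44 GB on 64 GPUs
```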
arXiv Detail & Related papers (2023-01-06T19:17:29Z)
- Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
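As a minimal sketch of one classic semi-supervised recipe covered by such surveys, pseudo-labeling keeps only confident model predictions on unlabeled data as extra training labels (the confidence threshold is an illustrative choice):

```python
import torch

@torch.no_grad()
def pseudo_label(model, unlabeled: torch.Tensor, threshold: float = 0.95):
    """Return (inputs, hard labels) for confidently classified unlabeled
    points, which can then be mixed into the labeled training set."""
    probs = torch.softmax(model(unlabeled), dim=-1)
    confidence, labels = probs.max(dim=-1)
    keep = confidence >= threshold        # discard uncertain predictions
    return unlabeled[keep], labels[keep]
```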
arXiv Detail & Related papers (2022-08-24T04:26:21Z)
- Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases.
REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization.
The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
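A minimal sketch of the general retrieval-enhanced pattern, in which a predictor conditions on nearest neighbors fetched from a corpus (the brute-force index and concatenation scheme are illustrative, not REML's framework):

```python
import numpy as np

def retrieve(query: np.ndarray, corpus: np.ndarray, k: int = 5) -> np.ndarray:
    """Brute-force nearest neighbors by dot-product similarity."""
    scores = corpus @ query               # (n_docs,) similarity scores
    return corpus[np.argsort(-scores)[:k]]

def retrieval_enhanced_predict(query, corpus, predictor, k=5):
    """Augment the input with retrieved context before predicting."""
    context = retrieve(query, corpus, k).mean(axis=0)
    return predictor(np.concatenate([query, context]))
```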
arXiv Detail & Related papers (2022-05-02T21:42:45Z)
- A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to efficiently learn patterns from big data while maintaining comparable performance.
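The standard workhorse behind this efficiency is stochastic optimization: gradients are estimated from small samples, so the per-step cost stops scaling with dataset size. A minimal least-squares sketch (hyperparameters illustrative):

```python
import numpy as np

def sgd_step(w, X, y, lr=0.1, batch=256):
    """One stochastic gradient step for least squares: the gradient is
    estimated from a small sample, so per-update cost is O(batch) rather
    than O(dataset size)."""
    idx = np.random.choice(len(X), size=min(batch, len(X)), replace=False)
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / len(Xb)
    return w - lr * grad
```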
arXiv Detail & Related papers (2020-08-10T06:07:52Z)