A Survey of Large-Scale Deep Learning Serving System Optimization:
Challenges and Opportunities
- URL: http://arxiv.org/abs/2111.14247v1
- Date: Sun, 28 Nov 2021 22:14:10 GMT
- Authors: Fuxun Yu, Di Wang, Longfei Shangguan, Minjia Zhang, Xulong Tang,
Chenchen Liu, Xiang Chen
- Abstract summary: This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems.
Deep Learning (DL) models have achieved superior performance in many application domains, including vision, language, medical, commercial ads, entertainment, etc.
- Score: 24.38071862662089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) models have achieved superior performance in many
application domains, including vision, language, medical, commercial ads, and
entertainment. With this fast development, both DL applications and the
underlying serving hardware have demonstrated strong scaling trends, i.e.,
Model Scaling and Compute Scaling: for example, recent pre-trained models
have hundreds of billions of parameters and ~TB-level memory consumption, and
the newest GPU accelerators provide hundreds of TFLOPS. With both
scaling trends, new problems and challenges emerge in DL inference serving
systems, which are gradually trending towards Large-scale Deep learning Serving
systems (LDS). This survey aims to summarize and categorize the emerging
challenges and optimization opportunities for large-scale deep learning serving
systems. By providing a novel taxonomy, summarizing the computing paradigms,
and elaborating on recent technical advances, we hope that this survey can
shed light on new optimization perspectives and motivate novel work in
large-scale deep learning system optimization.
Related papers
- Generative Large Recommendation Models: Emerging Trends in LLMs for Recommendation [85.52251362906418]
This tutorial explores two primary approaches for integrating large language models (LLMs)
It provides a comprehensive overview of generative large recommendation models, including their recent advancements, challenges, and potential research directions.
Key topics include data quality, scaling laws, user behavior mining, and efficiency in training and inference.
arXiv Detail & Related papers (2025-02-19T14:48:25Z)
- Scaling New Frontiers: Insights into Large Recommendation Models [74.77410470984168]
Meta's generative recommendation model HSTU illustrates the scaling laws of recommendation systems by expanding parameters to thousands of billions.
We conduct comprehensive ablation studies to explore the origins of these scaling laws.
We offer insights into future directions for large recommendation models.
arXiv Detail & Related papers (2024-12-01T07:27:20Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation [20.851925464903804]
This paper introduces a novel learning paradigm, Dynamic Sparse Learning, tailored for recommendation models.
DSL innovatively trains a lightweight sparse model from scratch, periodically evaluating and dynamically adjusting each weight's significance.
Our experimental results underline DSL's effectiveness, significantly reducing training and inference costs while delivering comparable recommendation performance.
arXiv Detail & Related papers (2024-02-05T10:16:20Z)
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems [14.355768064425598]
Generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data.
However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency.
This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective.
arXiv Detail & Related papers (2023-12-23T11:57:53Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- A Survey of Serverless Machine Learning Model Inference [0.0]
Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products.
This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems.
arXiv Detail & Related papers (2023-11-22T18:46:05Z)
- Systems for Parallel and Distributed Large-Model Deep Learning Training [7.106986689736828]
Some recent Transformer models span hundreds of billions of learnable parameters.
These designs have introduced new scale-driven systems challenges for the DL space.
This survey will explore the large-model training systems landscape, highlighting key challenges and the various techniques that have been used to address them.
arXiv Detail & Related papers (2023-01-06T19:17:29Z)
- Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z)
- A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.