Beyond Efficiency: A Systematic Survey of Resource-Efficient Large
Language Models
- URL: http://arxiv.org/abs/2401.00625v2
- Date: Thu, 4 Jan 2024 02:49:50 GMT
- Title: Beyond Efficiency: A Systematic Survey of Resource-Efficient Large
Language Models
- Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang,
Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang
Zhao
- Abstract summary: Large Language Models (LLMs) bring forth challenges in the high consumption of computational, memory, energy, and financial resources.
This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs.
- Score: 34.327846901536425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by
sophisticated models like OpenAI's ChatGPT, represents a significant
advancement in artificial intelligence. These models, however, bring forth
substantial challenges in the high consumption of computational, memory,
energy, and financial resources, especially in environments with limited
resource capabilities. This survey aims to systematically address these
challenges by reviewing a broad spectrum of techniques designed to enhance the
resource efficiency of LLMs. We categorize methods based on their optimization
focus: computational, memory, energy, financial, and network resources and
their applicability across various stages of an LLM's lifecycle, including
architecture design, pretraining, finetuning, and system design. Additionally,
the survey introduces a nuanced categorization of resource efficiency
techniques by their specific resource types, which uncovers the intricate
relationships and mappings between various resources and corresponding
optimization techniques. A standardized set of evaluation metrics and datasets
is also presented to facilitate consistent and fair comparisons across
different models and techniques. By offering a comprehensive overview of the
current sota and identifying open research avenues, this survey serves as a
foundational reference for researchers and practitioners, aiding them in
developing more sustainable and efficient LLMs in a rapidly evolving landscape.
Related papers
- A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges [60.546677053091685]
Large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain.
We explore the application of LLMs on various financial tasks, focusing on their potential to transform traditional practices and drive innovation.
We highlight this survey for categorizing the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications.
arXiv Detail & Related papers (2024-06-15T16:11:35Z) - Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey [48.06362354403557]
This survey reviews the literature, mainly from 2019 to 2024, on efficient resource allocation and workload scheduling strategies for large-scale distributed DL.
We highlight critical challenges for each topic and discuss key insights of existing technologies.
This survey aims to encourage computer science, artificial intelligence, and communications researchers to understand recent advances.
arXiv Detail & Related papers (2024-06-12T11:51:44Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - A Survey of Resource-efficient LLM and Multimodal Foundation Models [22.60868015887625]
Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and multimodal models, are revolutionizing the entire machine learning lifecycle.
However, the substantial advancements in versatility and performance these models offer come at a significant cost in terms of hardware resources.
This survey delves into the critical importance of such research, examining both algorithmic and systemic aspects.
arXiv Detail & Related papers (2024-01-16T03:35:26Z) - Training and Serving System of Foundation Models: A Comprehensive Survey [32.0115390377174]
This paper extensively explores the methods employed in training and serving foundation models from various perspectives.
It provides a detailed categorization of these state-of-the-art methods, including finer aspects such as network, computing, and storage.
arXiv Detail & Related papers (2024-01-05T05:27:15Z) - Towards Efficient Generative Large Language Model Serving: A Survey from
Algorithms to Systems [14.355768064425598]
generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data.
However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency.
This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective.
arXiv Detail & Related papers (2023-12-23T11:57:53Z) - The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z) - A Survey on Model Compression for Large Language Models [23.354025348567077]
Large Language Models (LLMs) have revolutionized natural language processing tasks with remarkable success.
Their formidable size and computational demands present significant challenges for practical deployment.
The field of model compression has emerged as a pivotal research area to alleviate these limitations.
arXiv Detail & Related papers (2023-08-15T08:31:05Z) - A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP)
This survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec)
arXiv Detail & Related papers (2023-05-31T13:51:26Z) - Information Extraction in Low-Resource Scenarios: Survey and Perspective [60.67550275379953]
Information Extraction seeks to derive structured information from unstructured texts.
This paper presents a review of neural approaches to low-resource IE from emphtraditional and emphLLM-based perspectives.
arXiv Detail & Related papers (2022-02-16T13:44:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.