Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
- URL: http://arxiv.org/abs/2407.01885v1
- Date: Tue, 2 Jul 2024 02:14:42 GMT
- Title: Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
- Authors: Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen
- Abstract summary: Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry.
The endeavor to compress language models while maintaining their accuracy has become a focal point of research.
Knowledge distillation has emerged as an effective technique to enhance inference speed without greatly compromising performance.
- Score: 21.555902498178387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have showcased exceptional capabilities in various domains, attracting significant interest from both academia and industry. Despite their impressive performance, the substantial size and computational demands of LLMs pose considerable challenges for practical deployment, particularly in environments with limited resources. The endeavor to compress language models while maintaining their accuracy has become a focal point of research. Among the various methods, knowledge distillation has emerged as an effective technique to enhance inference speed without greatly compromising performance. This paper presents a thorough survey from three aspects: method, evaluation, and application, exploring knowledge distillation techniques tailored specifically for LLMs. Specifically, we divide the methods into white-box KD and black-box KD to better illustrate their differences. Furthermore, we also explored the evaluation tasks and distillation effects between different distillation methods, and proposed directions for future research. Through in-depth understanding of the latest advancements and practical applications, this survey provides valuable resources for researchers, paving the way for sustained progress in this field.
Related papers
- Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications.
Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs.
By minimizing retrieval requests that yield neutral or harmful results, we can effectively reduce both time and computational costs.
arXiv Detail & Related papers (2024-11-09T15:12:28Z)
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
- EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z)
- A Survey on Symbolic Knowledge Distillation of Large Language Models [8.237773729114926]
This survey focuses on the emerging and critical area of symbolic knowledge distillation in Large Language Models.
It describes the core challenges, including maintaining the depth of knowledge in a comprehensible format.
It examines the various approaches and techniques that have been developed in this field.
arXiv Detail & Related papers (2024-07-12T12:18:19Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models [33.50873478562128]
Large Language Models (LLMs) bring forth challenges in the high consumption of computational, memory, energy, and financial resources.
This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs.
arXiv Detail & Related papers (2024-01-01T01:12:42Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- A Practical Survey on Zero-shot Prompt Design for In-context Learning [0.0]
Large language models (LLMs) have brought about significant improvements in Natural Language Processing (NLP) tasks.
This paper presents a comprehensive review of in-context learning techniques, focusing on different types of prompts.
We explore various approaches to prompt design, such as manual design, optimization algorithms, and evaluation methods.
arXiv Detail & Related papers (2023-09-22T23:00:34Z)
- A Survey on Model Compression for Large Language Models [21.768293256849113]
Large Language Models (LLMs) have transformed natural language processing tasks successfully.
Yet, their large size and high computational needs pose challenges for practical use.
Model compression has emerged as a key research area to address these challenges.
arXiv Detail & Related papers (2023-08-15T08:31:05Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.