Towards Incremental Learning in Large Language Models: A Critical Review
- URL: http://arxiv.org/abs/2404.18311v4
- Date: Sun, 5 May 2024 08:46:32 GMT
- Title: Towards Incremental Learning in Large Language Models: A Critical Review
- Authors: Mladjan Jovanovic, Peter Voss
- Abstract summary: This review provides a comprehensive analysis of incremental learning in Large Language Models.
It synthesizes the state-of-the-art incremental learning paradigms, including continual learning, meta-learning, parameter-efficient learning, and mixture-of-experts learning.
An important finding is that many of these approaches do not update the core model, and none of them update incrementally in real-time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Incremental learning is the ability of systems to acquire knowledge over time, enabling their adaptation and generalization to novel tasks. It is a critical ability for intelligent, real-world systems, especially when data changes frequently or is limited. This review provides a comprehensive analysis of incremental learning in Large Language Models. It synthesizes the state-of-the-art incremental learning paradigms, including continual learning, meta-learning, parameter-efficient learning, and mixture-of-experts learning. We demonstrate their utility for incremental learning by describing specific achievements from these related topics and their critical factors. An important finding is that many of these approaches do not update the core model, and none of them update incrementally in real-time. The paper highlights current problems and challenges for future research in the field. By consolidating the latest relevant research developments, this review offers a comprehensive understanding of incremental learning and its implications for designing and developing LLM-based learning systems.
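The review's finding that many approaches adapt without updating the core model is typified by parameter-efficient methods. The following is a minimal, illustrative sketch (not code from the paper) of a LoRA-style parameter-efficient update in PyTorch: the pre-trained weights are frozen and only a small low-rank correction is trained per task. Class and variable names are chosen here for illustration only.

```python
# Minimal sketch of parameter-efficient incremental adaptation (LoRA-style):
# the core model weights stay frozen; only low-rank matrices A and B are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # core model is never updated
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen base output + small task-specific low-rank correction
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

# Usage: adapt to new data by optimizing only the adapter parameters.
layer = LoRALinear(nn.Linear(512, 512))
trainable = [p for p in layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

Because the frozen base is shared across tasks and only the adapters change, knowledge is added batch-by-batch rather than incrementally in real time, which is the limitation the review highlights.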
Related papers
- GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? [36.987716816134984]
We propose GrowOVER-QA and GrowOVER-Dialogue, dynamic open-domain QA and dialogue benchmarks that undergo a continuous cycle of updates.
Our research indicates that retrieval-augmented language models (RaLMs) struggle with knowledge that has not been trained on or recently updated.
We introduce a novel retrieval-interactive language model framework, where the language model evaluates and reflects on its answers for further re-retrieval.
arXiv Detail & Related papers (2024-06-09T01:16:04Z)
- Enhancing Generative Class Incremental Learning Performance with Model Forgetting Approach [50.36650300087987]
This study presents a novel approach to Generative Class Incremental Learning (GCIL) by introducing the forgetting mechanism.
We find that integrating the forgetting mechanism significantly enhances the models' performance in acquiring new knowledge.
arXiv Detail & Related papers (2024-03-27T05:10:38Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Online Continual Knowledge Learning for Language Models [3.654507524092343]
Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking.
Online Continual Knowledge Learning (OCKL) aims to manage the dynamic nature of world knowledge in LMs under real-time constraints.
arXiv Detail & Related papers (2023-11-16T07:31:03Z)
- Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models [74.81091933317882]
We introduce EvolvingQA, a temporally evolving question-answering benchmark designed for training and evaluating LMs on an evolving Wikipedia database.
We find that existing continual learning baselines struggle to update and remove outdated knowledge.
Our work aims to model the dynamic nature of real-world information, suggesting faithful evaluations of the evolution-adaptability of language models.
arXiv Detail & Related papers (2023-11-14T12:12:02Z)
- When Meta-Learning Meets Online and Continual Learning: A Survey [39.53836535326121]
Meta-learning is a data-driven approach to optimizing the learning algorithm itself.
Continual learning and online learning both involve incrementally updating a model with streaming data.
This paper organizes various problem settings using consistent terminology and formal descriptions.
arXiv Detail & Related papers (2023-11-09T09:49:50Z)
- Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z)
- The Life Cycle of Knowledge in Big Language Models: A Survey [39.955688635216056]
Pre-trained language models (PLMs) have drawn significant attention to how knowledge is acquired, maintained, updated, and used by language models.
Despite a large body of related studies, a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes is still lacking.
We revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates as it is built, maintained, and used.
arXiv Detail & Related papers (2023-03-14T03:49:22Z)
- A Comprehensive Survey of Continual Learning: Theory, Method and Application [64.23253420555989]
We present a comprehensive survey of continual learning, seeking to bridge the basic settings, theoretical foundations, representative methods, and practical applications.
We summarize the general objectives of continual learning as ensuring a proper stability-plasticity trade-off and an adequate intra/inter-task generalizability in the context of resource efficiency.
arXiv Detail & Related papers (2023-01-31T11:34:56Z)
- Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point.
Research interest is shifting away from increasing the performance of highly parameterized models on exceedingly specific tasks.
This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)