Towards Continual Knowledge Learning of Language Models
- URL: http://arxiv.org/abs/2110.03215v2
- Date: Fri, 8 Oct 2021 02:55:40 GMT
- Title: Towards Continual Knowledge Learning of Language Models
- Authors: Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han,
Gyeonghun Kim, Stanley Jungkyu Choi, Minjoon Seo
- Abstract summary: Large Language Models (LMs) are known to encode world knowledge in their parameters as they pretrain on vast web corpora.
In real-world scenarios, the world knowledge stored in LMs can quickly become outdated as the world changes.
We formulate a new continual learning (CL) problem called Continual Knowledge Learning (CKL).
- Score: 11.000501711652829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LMs) are known to encode world knowledge in
their parameters as they pretrain on vast web corpora; this knowledge is often
utilized for knowledge-dependent downstream tasks such as question answering,
fact-checking, and open dialogue. In real-world scenarios, the world knowledge
stored in LMs can quickly become outdated as the world changes,
but it is non-trivial to avoid catastrophic forgetting and reliably acquire new
knowledge while preserving invariant knowledge. To push the community towards
better maintenance of ever-changing LMs, we formulate a new continual learning
(CL) problem called Continual Knowledge Learning (CKL). We construct a new
benchmark and metric to quantify the retention of time-invariant world
knowledge, the update of outdated knowledge, and the acquisition of new
knowledge. We adopt applicable recent methods from literature to create several
strong baselines. Through extensive experiments, we find that CKL exhibits
unique challenges that are not addressed in previous CL setups, where parameter
expansion is necessary to reliably retain and learn knowledge simultaneously.
By highlighting the critical causes of knowledge forgetting, we show that CKL
is a challenging and important problem that helps us better understand and
train ever-changing LMs.
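The abstract names three quantities a CKL benchmark needs to measure: retention of time-invariant knowledge, update of outdated knowledge, and acquisition of new knowledge. The sketch below is only a minimal illustration of how such an evaluation could be organized around cloze-style probes scored before and after a round of continual pretraining; the probe sets, exact-match scorer, and score names are assumptions for illustration, not the paper's actual benchmark or metric.

```python
# Hypothetical sketch of a CKL-style evaluation: score a model before/after
# continual pretraining on three probe sets. Probe sets, model interface, and
# the exact-match scorer are illustrative placeholders.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Probe:
    prompt: str   # cloze-style query, e.g. "The president of the US is ___"
    answer: str   # gold answer at the relevant point in time


def exact_match_rate(model: Callable[[str], str], probes: List[Probe]) -> float:
    """Fraction of probes answered correctly under simple exact match."""
    hits = sum(
        model(p.prompt).strip().lower() == p.answer.strip().lower() for p in probes
    )
    return hits / max(len(probes), 1)


def ckl_report(
    model_before: Callable[[str], str],
    model_after: Callable[[str], str],
    invariant: List[Probe],   # time-invariant facts (should be retained)
    outdated: List[Probe],    # facts whose answers changed (should be updated)
    new: List[Probe],         # facts absent from the original pretraining corpus
) -> Dict[str, float]:
    """Return retention / update / acquisition scores around a CKL training phase."""
    return {
        # retention: invariant knowledge the model still answers correctly
        "retention": exact_match_rate(model_after, invariant),
        # forgetting: invariant knowledge known before training but lost afterwards
        "forgetting": max(
            0.0,
            exact_match_rate(model_before, invariant)
            - exact_match_rate(model_after, invariant),
        ),
        # update: outdated facts now answered with their new values
        "update": exact_match_rate(model_after, outdated),
        # acquisition: newly introduced facts the model has picked up
        "acquisition": exact_match_rate(model_after, new),
    }
```

In practice, the two model arguments would be the generation functions of the checkpoints before and after continual pretraining, and the three probe lists would come from the benchmark's own knowledge categories rather than these placeholders.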
Related papers
- Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners [0.0]
We introduce Composite Learning Units (CLUs) designed to transform reasoners into learners capable of continuous learning.
CLUs are built on an architecture that allows a reasoning model to maintain and evolve a dynamic knowledge repository.
We demonstrate CLUs' effectiveness through a cryptographic reasoning task, where they continuously evolve their understanding through feedback to uncover hidden transformation rules.
arXiv Detail & Related papers (2024-10-09T02:27:58Z)
- GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? [36.987716816134984]
We propose GrowOVER-QA and GrowOVER-Dialogue, dynamic open-domain QA and dialogue benchmarks that undergo a continuous cycle of updates.
Our research indicates that retrieval-augmented language models (RaLMs) struggle with knowledge they have not been trained on or that has recently been updated.
We introduce a novel retrieval-interactive language model framework, where the language model evaluates and reflects on its answers for further re-retrieval.
arXiv Detail & Related papers (2024-06-09T01:16:04Z)
- InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [61.554209059971576]
Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains.
Injecting new knowledge poses the risk of forgetting previously acquired knowledge.
We propose a novel Infuser-Guided Knowledge Integration framework.
arXiv Detail & Related papers (2024-02-18T03:36:26Z)
- DeepEdit: Knowledge Editing as Decoding with Constraints [118.78008395850888]
Editing knowledge that is used in multi-step reasoning has become a major challenge in knowledge editing (KE) of large language models (LLMs).
We propose a new KE framework, DEEPEDIT, which enhances LLMs' ability to generate coherent reasoning chains with new knowledge through depth-first search.
In addition to DEEPEDIT, we propose two new KE benchmarks: MQUAKE-2002 and MQUAKE-HARD, which provide more precise and challenging assessments of KE approaches.
arXiv Detail & Related papers (2024-01-19T03:48:27Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- Online Continual Knowledge Learning for Language Models [3.654507524092343]
Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking.
Online Continual Knowledge Learning (OCKL) aims to manage the dynamic nature of world knowledge in LMs under real-time constraints.
arXiv Detail & Related papers (2023-11-16T07:31:03Z)
- Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z)
- Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models [49.23348672822087]
We propose Knowledge Crosswords, a benchmark consisting of incomplete knowledge networks bounded by structured factual constraints.
The novel setting of geometric knowledge reasoning necessitates new LM abilities beyond existing atomic/linear multi-hop QA.
We conduct extensive experiments to evaluate existing LLMs and approaches on Knowledge Crosswords.
arXiv Detail & Related papers (2023-10-02T15:43:53Z)
- The Life Cycle of Knowledge in Big Language Models: A Survey [39.955688635216056]
Pre-trained language models (PLMs) have attracted significant attention regarding how knowledge can be acquired, maintained, updated, and used by language models.
Despite the enormous amount of related work, a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes is still lacking.
We revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates when it is built, maintained, and used.
arXiv Detail & Related papers (2023-03-14T03:49:22Z)
- Incremental Knowledge Based Question Answering [52.041815783025186]
We propose a new incremental KBQA learning framework that can progressively expand learning capacity as humans do.
Specifically, it comprises a margin-distilled loss and a collaborative selection method to overcome the catastrophic forgetting problem.
The comprehensive experiments demonstrate its effectiveness and efficiency when working with the evolving knowledge base.
arXiv Detail & Related papers (2021-01-18T09:03:38Z)