Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
- URL: http://arxiv.org/abs/2312.05934v3
- Date: Tue, 30 Jan 2024 11:58:10 GMT
- Title: Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
- Authors: Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Abstract summary: Large language models (LLMs) encapsulate a vast amount of factual information within their pre-trained weights.
This knowledge is inherently limited, relying heavily on the characteristics of the training data.
We compare two common approaches: unsupervised fine-tuning and retrieval-augmented generation.
- Score: 0.5461938536945721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) encapsulate a vast amount of factual information
within their pre-trained weights, as evidenced by their ability to answer
diverse questions across different domains. However, this knowledge is
inherently limited, relying heavily on the characteristics of the training
data. Consequently, using external datasets to incorporate new information or
refine the capabilities of LLMs on previously seen information poses a
significant challenge. In this study, we compare two common approaches:
unsupervised fine-tuning and retrieval-augmented generation (RAG). We evaluate
both approaches on a variety of knowledge-intensive tasks across different
topics. Our findings reveal that while unsupervised fine-tuning offers some
improvement, RAG consistently outperforms it, both for existing knowledge
encountered during training and entirely new knowledge. Moreover, we find that
LLMs struggle to learn new factual information through unsupervised
fine-tuning, and that exposing them to numerous variations of the same fact
during training could alleviate this problem.
Related papers
- Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications.
Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs.
By minimizing retrieval requests that yield neutral or harmful results, we can effectively reduce both time and computational costs.
arXiv Detail & Related papers (2024-11-09T15:12:28Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness ofSupervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs)
We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information.
Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z) - KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs.
KnowTuning generates more facts with less factual error rate under fine-grained facts evaluation.
arXiv Detail & Related papers (2024-02-17T02:54:32Z) - A Closer Look at the Limitations of Instruction Tuning [52.587607091917214]
We show that Instruction Tuning (IT) fails to enhance knowledge or skills in large language models (LLMs)
We also show that popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model.
Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.
arXiv Detail & Related papers (2024-02-03T04:45:25Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.