Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
- URL: http://arxiv.org/abs/2312.05934v3
- Date: Tue, 30 Jan 2024 11:58:10 GMT
- Title: Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
- Authors: Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Abstract summary: Large language models (LLMs) encapsulate a vast amount of factual information within their pre-trained weights.
This knowledge is inherently limited, relying heavily on the characteristics of the training data.
We compare two common approaches: unsupervised fine-tuning and retrieval-augmented generation.
- Score: 0.5461938536945721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) encapsulate a vast amount of factual information
within their pre-trained weights, as evidenced by their ability to answer
diverse questions across different domains. However, this knowledge is
inherently limited, relying heavily on the characteristics of the training
data. Consequently, using external datasets to incorporate new information or
refine the capabilities of LLMs on previously seen information poses a
significant challenge. In this study, we compare two common approaches:
unsupervised fine-tuning and retrieval-augmented generation (RAG). We evaluate
both approaches on a variety of knowledge-intensive tasks across different
topics. Our findings reveal that while unsupervised fine-tuning offers some
improvement, RAG consistently outperforms it, both for existing knowledge
encountered during training and entirely new knowledge. Moreover, we find that
LLMs struggle to learn new factual information through unsupervised
fine-tuning, and that exposing them to numerous variations of the same fact
during training could alleviate this problem.
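As a rough, illustrative contrast between the two injection routes compared above, the sketch below shows (a) RAG-style prompting, where retrieved passages are prepended to the question at inference time, and (b) building an unsupervised fine-tuning corpus in which each fact is expanded into several paraphrased variations, echoing the finding that repeated variations of a fact aid learning. The toy retriever, templates, and data are assumptions for illustration, not the authors' actual pipeline.

```python
# Illustrative sketch only: contrasts RAG-style prompting with building an
# unsupervised fine-tuning corpus from paraphrased fact variations.
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    subject: str
    statement: str  # e.g. "The Eiffel Tower was completed in 1889."


def retrieve(question: str, corpus: List[Fact], k: int = 2) -> List[str]:
    """Toy lexical retriever: rank facts by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        corpus,
        key=lambda f: len(q_words & set(f.statement.lower().split())),
        reverse=True,
    )
    return [f.statement for f in scored[:k]]


def build_rag_prompt(question: str, corpus: List[Fact]) -> str:
    """Route 1 (RAG): prepend retrieved passages to the question at inference time."""
    context = "\n".join(retrieve(question, corpus))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def build_finetune_corpus(corpus: List[Fact], n_variations: int = 4) -> List[str]:
    """Route 2 (unsupervised fine-tuning): expand each fact into several
    paraphrased variations; a single phrasing is often not enough for the
    model to absorb a new fact."""
    templates = [
        "{s}",
        "It is known that {s_lower}",
        "Q: Is the following statement true? {s} A: Yes.",
        "In other words, {s_lower}",
    ]
    lines = []
    for fact in corpus:
        lowered = fact.statement[0].lower() + fact.statement[1:]
        for template in templates[:n_variations]:
            lines.append(template.format(s=fact.statement, s_lower=lowered))
    return lines


if __name__ == "__main__":
    facts = [Fact("Eiffel Tower", "The Eiffel Tower was completed in 1889.")]
    print(build_rag_prompt("When was the Eiffel Tower completed?", facts))
    print("\n".join(build_finetune_corpus(facts)))
```

In practice, the output of build_finetune_corpus would feed a standard causal language-modeling objective, while build_rag_prompt would be sent to a frozen model at inference time.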
Related papers
- How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? [55.33467849079774]
Low-rank adaptation (LoRA) is a popular and efficient training technique for updating Large Language Models or adapting them to specific domains.
We investigate how new facts can be incorporated into the LLM using LoRA without compromising the previously learned knowledge (a minimal, hypothetical LoRA setup is sketched after this list).
arXiv Detail & Related papers (2025-02-20T12:31:03Z) - Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications, yet the knowledge stored in their parameters has boundaries.
Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs.
By minimizing retrieval requests that yield neutral or harmful results, both time and computational costs can be effectively reduced (a hypothetical retrieval gate along these lines is sketched after this list).
arXiv Detail & Related papers (2024-11-09T15:12:28Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, which retains the pre-trained knowledge of VLMs while adapting to new tasks.
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness of Supervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs).
We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information.
Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z) - KnowTuning: Knowledge-aware Fine-tuning for Large Language Models [83.5849717262019]
We propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs.
Under fine-grained fact evaluation, KnowTuning generates more facts with a lower factual error rate.
arXiv Detail & Related papers (2024-02-17T02:54:32Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z)
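Relating to the LoRA entry above ("How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?"): a minimal sketch of attaching a low-rank adapter to a causal LM with the Hugging Face peft library. The base checkpoint, rank, and target modules are illustrative assumptions, not the configuration used in that paper.

```python
# Minimal LoRA setup sketch (illustrative; not the paper's actual configuration).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder checkpoint; any causal LM works
model = AutoModelForCausalLM.from_pretrained(base)

# Inject low-rank adapters into the attention projection; the base weights stay
# frozen and only the adapter matrices are updated during fine-tuning.
config = LoraConfig(
    r=16,                       # adapter rank
    lora_alpha=32,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection; model-specific
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
```

Since the base weights stay frozen, any new facts must be absorbed by the low-rank adapters, which is roughly the setting that paper investigates.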
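Relating to the retrieval-judgment entry above ("Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment"): a hypothetical sketch of gating retrieval on the model's own answer confidence, so a retrieval call is made only when the model seems unsure. The confidence proxy and threshold are assumptions for illustration, not that paper's method.

```python
# Hypothetical retrieval gate: skip retrieval when the model already answers
# confidently. The confidence measure and threshold are illustrative only.
import math
from typing import Callable, List, Tuple


def answer_with_optional_retrieval(
    question: str,
    generate: Callable[[str], Tuple[str, List[float]]],  # returns (answer, token log-probs)
    retrieve: Callable[[str], str],                       # returns a context passage
    confidence_threshold: float = 0.8,
) -> str:
    # First pass: answer from parametric knowledge alone.
    answer, logprobs = generate(question)
    # Geometric mean of token probabilities as a crude confidence proxy.
    confidence = math.exp(sum(logprobs) / max(len(logprobs), 1))
    if confidence >= confidence_threshold:
        return answer  # no retrieval call: saves time and compute
    # Low confidence: fall back to retrieval-augmented generation.
    context = retrieve(question)
    rag_answer, _ = generate(f"Context: {context}\n\nQuestion: {question}\nAnswer:")
    return rag_answer
```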