Investigating the Factual Knowledge Boundary of Large Language Models
with Retrieval Augmentation
- URL: http://arxiv.org/abs/2307.11019v2
- Date: Sun, 23 Jul 2023 16:52:59 GMT
- Title: Investigating the Factual Knowledge Boundary of Large Language Models
with Retrieval Augmentation
- Authors: Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao
Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang
- Abstract summary: We show that large language models (LLMs) possess unwavering confidence in their capabilities to respond to questions.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We also find that LLMs have a propensity to rely on the provided retrieval results when formulating answers.
- Score: 91.30946119104111
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require
a substantial amount of factual knowledge and often rely on external
information for assistance. Recently, large language models (LLMs) (e.g.,
ChatGPT) have demonstrated impressive prowess in solving a wide range of tasks
with world knowledge, including knowledge-intensive tasks. However, it remains
unclear how well LLMs are able to perceive their factual knowledge boundaries,
particularly how they behave when incorporating retrieval augmentation. In this
study, we present an initial analysis of the factual knowledge boundaries of
LLMs and how retrieval augmentation affects LLMs on open-domain QA. Specifically,
we focus on three primary research questions and analyze them by examining QA
performance, a priori judgement and a posteriori judgement of LLMs. We show
evidence that LLMs possess unwavering confidence in their capabilities to
respond to questions and the accuracy of their responses. Furthermore,
retrieval augmentation proves to be an effective approach in enhancing LLMs'
awareness of knowledge boundaries, thereby improving their judgemental
abilities. We also find that LLMs have a propensity to rely on
the provided retrieval results when formulating answers, while the quality of
these results significantly impacts their reliance. The code to reproduce this
work is available at https://github.com/RUCAIBox/LLM-Knowledge-Boundary.
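The abstract names three things it examines: QA performance, a priori judgement (whether the model thinks it can answer before answering) and a posteriori judgement (whether the model thinks its answer is correct afterwards). The sketch below is one plausible way to summarize such measurements, not the paper's own metric definitions, which the abstract does not give. The `Record` fields and the `summarize` function are hypothetical names introduced only for illustration; the linked repository holds the actual evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All field names are illustrative assumptions, not taken from the paper.
    gave_up: bool         # a priori: the model declared it could not answer
    answer_correct: bool  # the produced answer matched the reference answer
    judged_correct: bool  # a posteriori: the model rated its own answer as correct


def summarize(records):
    """Return simple rates over a list of Record objects."""
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")
    return {
        # Share of questions the model declined to answer.
        "give_up_rate": sum(r.gave_up for r in records) / n,
        # Share of questions answered correctly.
        "qa_accuracy": sum(r.answer_correct for r in records) / n,
        # Share of questions where the self-assessment matched reality.
        "self_eval_agreement": sum(
            r.judged_correct == r.answer_correct for r in records
        ) / n,
        # Share of questions attempted (not given up) but answered wrongly.
        "attempted_but_wrong": sum(
            (not r.gave_up) and (not r.answer_correct) for r in records
        ) / n,
    }


if __name__ == "__main__":
    toy = [
        Record(gave_up=False, answer_correct=True, judged_correct=True),
        Record(gave_up=False, answer_correct=False, judged_correct=True),
        Record(gave_up=True, answer_correct=False, judged_correct=False),
        Record(gave_up=False, answer_correct=False, judged_correct=True),
    ]
    print(summarize(toy))
```

Under this framing, a low give-up rate combined with a high attempted-but-wrong rate would correspond to the overconfidence the abstract describes, and running the same summary with and without retrieved documents would show how retrieval shifts those rates.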
Related papers
- Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z)
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as heuristic answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z)
- When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation [66.01754585188739]
Large Language Models (LLMs) have been found to have difficulty knowing they do not possess certain knowledge.
Retrieval Augmentation (RA) has been extensively studied to mitigate LLMs' hallucinations.
We propose several methods to enhance LLMs' perception of knowledge boundaries and show that they are effective in reducing overconfidence.
arXiv Detail & Related papers (2024-02-18T04:57:19Z)
- RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z)
- Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism [0.0]
Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities.
These models are not flawless and often produce responses that contain errors or misinformation.
We propose a refusal mechanism that instructs LLMs to refuse to answer challenging questions in order to avoid errors.
arXiv Detail & Related papers (2023-11-02T07:20:49Z)
- Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs).
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
- "Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs [15.660128743249611]
Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge.
LLMs inevitably require external knowledge during their interactions with users.
This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge?
arXiv Detail & Related papers (2023-09-15T17:47:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.