RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge
- URL: http://arxiv.org/abs/2311.08147v1
- Date: Tue, 14 Nov 2023 13:24:19 GMT
- Title: RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge
- Authors: Yi Liu, Lianzhe Huang, Shicheng Li, Sishuo Chen, Hao Zhou, Fandong
Meng, Jie Zhou, Xu Sun
- Abstract summary: This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
- Score: 69.79676144482792
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: LLMs and AI chatbots have improved people's efficiency in various fields.
However, the necessary knowledge for answering the question may be beyond the
models' knowledge boundaries. To mitigate this issue, many researchers try to
introduce external knowledge, such as knowledge graphs and Internet contents,
into LLMs for up-to-date information. However, the external information from
the Internet may include counterfactual information that will confuse the model
and lead to an incorrect response. Thus there is a pressing need for LLMs to
possess the ability to distinguish reliable information from external
knowledge. Therefore, to evaluate the ability of LLMs to discern the
reliability of external knowledge, we create a benchmark from existing
knowledge bases. Our benchmark consists of two tasks, Question Answering and
Text Generation, and for each task, we provide models with a context containing
counterfactual information. Evaluation results show that existing LLMs are
susceptible to interference from unreliable external knowledge with
counterfactual information, and simple intervention methods make limited
contributions to the alleviation of this issue.
Related papers
- Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering [33.89176174108559]
We propose a new internal and external knowledge interactive refinement paradigm dubbed IEKR.
By simply adding a prompt like 'Tell me something about' to the LLMs, we try to review related explicit knowledge and insert them with the query into the retriever for external retrieval.
arXiv Detail & Related papers (2024-08-23T10:52:57Z) - Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism [0.0]
Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities.
These models are not flawless and often produce responses that contain errors or misinformation.
We propose a refusal mechanism that instructs LLMs to refuse to answer challenging questions in order to avoid errors.
arXiv Detail & Related papers (2023-11-02T07:20:49Z) - Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method which can let LLMs refer to the questions they have previously encountered.
arXiv Detail & Related papers (2023-10-08T04:22:33Z) - "Merge Conflicts!" Exploring the Impacts of External Distractors to
Parametric Knowledge Graphs [15.660128743249611]
Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge.
LLMs inevitably require external knowledge during their interactions with users.
This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge?
arXiv Detail & Related papers (2023-09-15T17:47:59Z) - Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from
Knowledge Graphs [19.0797968186656]
Large language models (LLMs) are versatile and can solve different tasks due to their emergent ability and generalizability.
In some previous works, additional modules like graph neural networks (GNNs) are trained on retrieved knowledge from external knowledge bases.
arXiv Detail & Related papers (2023-09-06T15:55:01Z) - Investigating the Factual Knowledge Boundary of Large Language Models
with Retrieval Augmentation [91.30946119104111]
We show that large language models (LLMs) possess unwavering confidence in their capabilities to respond to questions.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We also find that LLMs have a propensity to rely on the provided retrieval results when formulating answers.
arXiv Detail & Related papers (2023-07-20T16:46:10Z) - Do Large Language Models Know What They Don't Know? [74.65014158544011]
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend.
This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions.
arXiv Detail & Related papers (2023-05-29T15:30:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.