RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge
- URL: http://arxiv.org/abs/2311.08147v1
- Date: Tue, 14 Nov 2023 13:24:19 GMT
- Title: RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge
- Authors: Yi Liu, Lianzhe Huang, Shicheng Li, Sishuo Chen, Hao Zhou, Fandong
Meng, Jie Zhou, Xu Sun
- Abstract summary: This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
- Score: 69.79676144482792
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: LLMs and AI chatbots have improved people's efficiency in various fields.
However, the necessary knowledge for answering the question may be beyond the
models' knowledge boundaries. To mitigate this issue, many researchers try to
introduce external knowledge, such as knowledge graphs and Internet contents,
into LLMs for up-to-date information. However, the external information from
the Internet may include counterfactual information that will confuse the model
and lead to an incorrect response. Thus there is a pressing need for LLMs to
possess the ability to distinguish reliable information from external
knowledge. Therefore, to evaluate the ability of LLMs to discern the
reliability of external knowledge, we create a benchmark from existing
knowledge bases. Our benchmark consists of two tasks, Question Answering and
Text Generation, and for each task, we provide models with a context containing
counterfactual information. Evaluation results show that existing LLMs are
susceptible to interference from unreliable external knowledge with
counterfactual information, and simple intervention methods make limited
contributions to the alleviation of this issue.
Related papers
- KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs [35.63483147113076]
Introducing external knowledge, such as knowledge graph, can enhance the LLMs' ability to provide factual answers.
KnowPath is a knowledge-enhanced large model framework driven by the collaboration of internal and external knowledge.
It relies on the internal knowledge of the LLM to guide the exploration of interpretable directed subgraphs in external knowledge graphs.
arXiv Detail & Related papers (2025-02-17T17:02:01Z) - What External Knowledge is Preferred by LLMs? Characterizing and Exploring Chain of Evidence in Imperfect Context [19.78140793942713]
This paper focuses on LLMs' preferred external knowledge in imperfect contexts when handling multi-hop QA.
Inspired by criminal procedural law's Chain of Evidence (CoE), we characterize that knowledge preferred by LLMs should maintain both relevance to the question and mutual support among knowledge pieces.
We propose an automated CoE discrimination approach and explore LLMs' preferences from their effectiveness, faithfulness and robustness, as well as CoE's usability in a naive Retrieval-Augmented Generation (RAG) case.
arXiv Detail & Related papers (2024-12-17T07:49:49Z) - Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering [33.89176174108559]
We propose a new internal and external knowledge interactive refinement paradigm dubbed IEKR.
By simply adding a prompt like 'Tell me something about' to the LLMs, we try to review related explicit knowledge and insert them with the query into the retriever for external retrieval.
arXiv Detail & Related papers (2024-08-23T10:52:57Z) - Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge.
We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method which can let LLMs refer to the questions they have previously encountered.
arXiv Detail & Related papers (2023-10-08T04:22:33Z) - "Merge Conflicts!" Exploring the Impacts of External Distractors to
Parametric Knowledge Graphs [15.660128743249611]
Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge.
LLMs inevitably require external knowledge during their interactions with users.
This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge?
arXiv Detail & Related papers (2023-09-15T17:47:59Z) - Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z) - Do Large Language Models Know What They Don't Know? [74.65014158544011]
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend.
This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions.
arXiv Detail & Related papers (2023-05-29T15:30:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.