Entity-Based Knowledge Conflicts in Question Answering
- URL: http://arxiv.org/abs/2109.05052v1
- Date: Fri, 10 Sep 2021 18:29:44 GMT
- Title: Entity-Based Knowledge Conflicts in Question Answering
- Authors: Shayne Longpre, Kartik Perisetla, Anthony Chen, Nikhil Ramesh, Chris DuBois, Sameer Singh
- Abstract summary: We formalize the problem of knowledge conflicts, where the contextual information contradicts the learned information.
We propose a method to mitigate over-reliance on parametric knowledge, which minimizes hallucination, and improves out-of-distribution generalization by 4%-7%.
Our findings demonstrate the importance for practitioners to evaluate model tendency to hallucinate rather than read, and show that our mitigation strategy encourages generalization to evolving information.
- Score: 29.973926661540524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge-dependent tasks typically use two sources of knowledge: parametric,
learned at training time, and contextual, given as a passage at inference time.
To understand how models use these sources together, we formalize the problem
of knowledge conflicts, where the contextual information contradicts the
learned information. Analyzing the behaviour of popular models, we measure
their over-reliance on memorized information (the cause of hallucinations), and
uncover important factors that exacerbate this behaviour. Lastly, we propose a
simple method to mitigate over-reliance on parametric knowledge, which
minimizes hallucination, and improves out-of-distribution generalization by
4%-7%. Our findings demonstrate the importance for practitioners to evaluate
model tendency to hallucinate rather than read, and show that our mitigation
strategy encourages generalization to evolving information (i.e.,
time-dependent queries). To encourage these practices, we have released our
framework for generating knowledge conflicts.
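A knowledge conflict of the kind described in the abstract can be illustrated with a small sketch. This is not the authors' released framework: it assumes a conflict is built by swapping the answer entity in the passage for a different entity, and the names `QAExample`, `make_conflict`, and `classify_prediction` are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    question: str
    context: str
    answer: str  # original answer entity; assumed to appear verbatim in context


def make_conflict(example: QAExample, substitute: str) -> QAExample:
    """Swap the answer entity in the passage so the context contradicts
    what a model may have memorized (hypothetical helper)."""
    if example.answer not in example.context:
        raise ValueError("answer entity not found in context")
    new_context = example.context.replace(example.answer, substitute)
    # After substitution, the contextually correct answer is the substitute.
    return QAExample(example.question, new_context, substitute)


def classify_prediction(prediction: str, original: str, substitute: str) -> str:
    """Label a prediction by the knowledge source it agrees with."""
    def norm(text: str) -> str:
        # Lowercase and collapse whitespace for a lenient exact match.
        return " ".join(text.lower().split())

    pred = norm(prediction)
    if pred == norm(substitute):
        return "contextual"  # the model read the passage
    if pred == norm(original):
        return "parametric"  # the model fell back on memorized knowledge
    return "other"


example = QAExample(
    question="Who wrote Hamlet?",
    context="Hamlet is a tragedy written by William Shakespeare.",
    answer="William Shakespeare",
)
conflict = make_conflict(example, "Jane Austen")
print(conflict.context)
print(classify_prediction("william shakespeare", example.answer, conflict.answer))
```

Under these assumptions, the share of predictions labeled "parametric" on substituted passages gives a rough measure of how often a model answers from memory rather than from the passage.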
Related papers
- Mitigating Knowledge Conflicts in Language Model-Driven Question Answering [15.29366851382021]
In this work, we argue that hallucination could be mitigated via explicit correlation between input source and generated content.
We focus on a typical example of hallucination, entity-based knowledge conflicts in question answering, where correlation of entities and their description at training time hinders model behaviour during inference.
(2024-11-18T07:33:10Z)
- Crystal: Introspective Reasoners Reinforced with Self-Feedback [118.53428015478957]
We propose a novel method to develop an introspective commonsense reasoner, Crystal.
To tackle commonsense problems, it first introspects for knowledge statements related to the given question, and subsequently makes an informed prediction that is grounded in the previously introspected knowledge.
Experiments show that Crystal significantly outperforms both the standard supervised finetuning and chain-of-thought distilled methods, and enhances the transparency of the commonsense reasoning process.
(2023-10-07T21:23:58Z)
- Towards a Rigorous Analysis of Mutual Information in Contrastive Learning [3.6048794343841766]
We introduce three novel methods and a few related theorems, aimed at enhancing the rigor of mutual information analysis.
Specifically, we investigate small batch size, mutual information as a measure, and the InfoMin principle.
(2023-08-30T01:59:42Z)
- Leveraging Skill-to-Skill Supervision for Knowledge Tracing [13.753990664747265]
Knowledge tracing plays a pivotal role in intelligent tutoring systems.
Recent advances in knowledge tracing models have enabled better exploitation of problem solving history.
Knowledge tracing algorithms that incorporate knowledge directly are important to settings with limited data or cold starts.
(2023-06-12T03:23:22Z)
- RECKONING: Reasoning through Dynamic Knowledge Encoding [51.076603338764706]
Recent studies show that language models can answer questions by reasoning over knowledge provided as part of the context.
When that context also contains distractor facts irrelevant to the question, the model fails to distinguish the knowledge that is necessary to answer the question.
We propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model's parameters.
(2023-05-10T17:54:51Z)
- Investigating Forgetting in Pre-Trained Representations Through Continual Learning [51.30807066570425]
We study the effect of representation forgetting on the generality of pre-trained language models.
We find that this generality is destroyed in various pre-trained LMs, and that syntactic and semantic knowledge is forgotten through continual learning.
(2023-05-10T08:27:59Z)
- The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems [87.3207729953778]
We evaluate state-of-the-art coreference resolution models on our dataset.
Several models struggle to reason on-the-fly over knowledge observed both at pretrain time and at inference time.
Still, even the best performing models seem to have difficulties with reliably integrating knowledge presented only at inference time.
(2022-12-15T23:26:54Z)
- Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence [37.18100697469402]
We simulate knowledge conflicts where parametric knowledge suggests one answer and different passages suggest different answers.
We find that retrieval performance heavily impacts which sources models rely on, and that current models mostly rely on non-parametric knowledge.
We present a new calibration study, where models are discouraged from presenting any single answer when presented with multiple conflicting answer candidates.
(2022-10-25T01:46:00Z)
- Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and the associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
Theoretical analysis shows that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
(2022-08-27T09:27:36Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
(2020-11-07T22:52:21Z)