Related papers: ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM

URL: http://arxiv.org/abs/2408.12076v1
Date: Thu, 22 Aug 2024 02:33:13 GMT
Title: ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM
Authors: Zhaochen Su, Jun Zhang, Xiaoye Qu, Tong Zhu, Yanshu Li, Jiashuo Sun, Juntao Li, Min Zhang, Yu Cheng
Abstract summary: Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts has rarely been studied. We present ConflictBank, the first comprehensive benchmark developed to evaluate knowledge conflicts from three aspects. Our investigation delves into four model families and twelve LLM instances, meticulously analyzing conflicts stemming from misinformation, temporal discrepancies, and semantic divergences.
Score: 36.332500824079844
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts, a major source of hallucinations, has rarely been studied. Only a few studies have explored the conflicts between the inherent knowledge of LLMs and the retrieved contextual knowledge. However, a thorough assessment of knowledge conflicts in LLMs is still missing. Motivated by this research gap, we present ConflictBank, the first comprehensive benchmark developed to systematically evaluate knowledge conflicts from three aspects: (i) conflicts encountered in retrieved knowledge, (ii) conflicts within the models' encoded knowledge, and (iii) the interplay between these conflict forms. Our investigation delves into four model families and twelve LLM instances, meticulously analyzing conflicts stemming from misinformation, temporal discrepancies, and semantic divergences. Based on our proposed novel construction framework, we create 7,453,853 claim-evidence pairs and 553,117 QA pairs. We present numerous findings on model scale, conflict causes, and conflict types. We hope our ConflictBank benchmark will help the community better understand model behavior in conflicts and develop more reliable LLMs.

Related papers

Conflicts in Texts: Data, Implications and Challenges [58.03478157713084]
Conflicts could reflect the complexity of situations, changes that need to be explained and dealt with, difficulties in data annotation, and mistakes in generated outputs. This survey categorizes these conflicts into three key areas: (1) natural texts on the web, where factual inconsistencies, subjective biases, and multiple perspectives introduce contradictions; (2) human-annotated data, where annotator disagreements, mistakes, and societal biases impact model training; and (3) model interactions, where hallucinations and knowledge conflicts emerge during deployment. We highlight key challenges and future directions for developing conflict-aware NLP systems that can reason over and reconcile conflicting information more effectively.
arXiv Detail & Related papers (2025-04-28T04:24:01Z)
KOALA: Knowledge Conflict Augmentations for Robustness in Vision Language Models [6.52323086990482]
segsub is a framework that applies targeted perturbations to image sources to study and improve the robustness of vision language models. Contrary to prior findings, we find VLMs are largely robust to image perturbation. We find a link between hallucinations and image context, with GPT-4o prone to hallucination when presented with highly contextualized counterfactual examples.
arXiv Detail & Related papers (2025-02-19T00:26:38Z)
Is Cognition consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding [15.828455477224516]
As a multimodal task, document understanding requires models to possess both perceptual and cognitive abilities. In this paper, we define the conflicts between cognition and perception as Cognition and Perception (C&P) knowledge conflicts. We propose a novel method called Multimodal Knowledge Consistency Fine-tuning to mitigate the C&P knowledge conflicts.
arXiv Detail & Related papers (2024-11-12T11:28:50Z)
Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs [55.74117540987519]
This paper explores the problem of commonsense-level vision-knowledge conflict in Multimodal Large Language Models (MLLMs). We introduce an automated pipeline, augmented with human-in-the-loop quality control, to establish a benchmark aimed at simulating and assessing the conflicts in MLLMs. We evaluate the conflict-resolution capabilities of nine representative MLLMs across various model families and find a noticeable over-reliance on textual queries.
arXiv Detail & Related papers (2024-10-10T17:31:17Z)
ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems. This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
arXiv Detail & Related papers (2024-10-05T07:41:17Z)
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge [57.66282463340297]
Knowledge conflict arises from discrepancies between information in the context of a large language model (LLM) and the knowledge stored in its parameters. We propose a fine-grained, instance-level approach called AdaCAD, which dynamically infers the weight of adjustment based on the degree of conflict.
arXiv Detail & Related papers (2024-09-11T16:35:18Z)
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models [51.72963030032491]
Knowledge documents for large language models (LLMs) may conflict with the memory of LLMs due to outdated or incorrect knowledge. We construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering.
arXiv Detail & Related papers (2024-04-04T16:40:11Z)
Knowledge Conflicts for LLMs: A Survey [24.731074825915833]
This survey focuses on three categories of knowledge conflicts: context-memory, inter-context, and intra-memory conflict. These conflicts can significantly impact the trustworthiness and performance of large language models.
arXiv Detail & Related papers (2024-03-13T08:02:23Z)
Resolving Knowledge Conflicts in Large Language Models [46.903549751371415]
Large language models (LLMs) often encounter knowledge conflicts. We ask what are the desiderata for LLMs when a knowledge conflict arises and whether existing LLMs fulfill them. We introduce an evaluation framework for simulating contextual knowledge conflicts.
arXiv Detail & Related papers (2023-10-02T06:57:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.