KOALA: Knowledge Conflict Augmentations for Robustness in Vision Language Models
- URL: http://arxiv.org/abs/2502.14908v1
- Date: Wed, 19 Feb 2025 00:26:38 GMT
- Title: KOALA: Knowledge Conflict Augmentations for Robustness in Vision Language Models
- Authors: Peter Carragher, Nikitha Rao, Abhinand Jha, R Raghav, Kathleen M. Carley,
- Abstract summary: segsub is a framework that applies targeted perturbations to image sources to study and improve the robustness of vision language models.<n>Contrary to prior findings, we find VLMs are largely robust to image perturbation.<n>We find a link between hallucinations and image context, with GPT-4o prone to hallucination when presented with highly contextualized counterfactual examples.
- Score: 6.52323086990482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The robustness of large language models (LLMs) against knowledge conflicts in unimodal question answering systems has been well studied. However, the effect of conflicts in information sources on vision language models (VLMs) in multimodal settings has not yet been explored. In this work, we propose \segsub, a framework that applies targeted perturbations to image sources to study and improve the robustness of VLMs against three different types of knowledge conflicts, namely parametric, source, and counterfactual conflicts. Contrary to prior findings that showed that LLMs are sensitive to parametric conflicts arising from textual perturbations, we find VLMs are largely robust to image perturbation. On the other hand, VLMs perform poorly on counterfactual examples (<30% accuracy) and fail to reason over source conflicts (<1% accuracy). We also find a link between hallucinations and image context, with GPT-4o prone to hallucination when presented with highly contextualized counterfactual examples. While challenges persist with source conflicts, finetuning models significantly improves reasoning over counterfactual samples. Our findings highlight the need for VLM training methodologies that enhance their reasoning capabilities, particularly in addressing complex knowledge conflicts between multimodal sources.
Related papers
- Conflicts in Texts: Data, Implications and Challenges [58.03478157713084]
Conflicts could reflect the complexity of situations, changes that need to be explained and dealt with, difficulties in data annotation, and mistakes in generated outputs.
This survey categorizes these conflicts into three key areas: (1) natural texts on the web, where factual inconsistencies, subjective biases, and multiple perspectives introduce contradictions; (2) human-annotated data, where annotator disagreements, mistakes, and societal biases impact model training; and (3) model interactions, where hallucinations and knowledge conflicts emerge during deployment.
We highlight key challenges and future directions for developing conflict-aware NLP systems that can reason over and reconcile conflicting information more effectively
arXiv Detail & Related papers (2025-04-28T04:24:01Z) - Retrieval-Augmented Generation with Conflicting Evidence [57.66282463340297]
Large language model (LLM) agents are increasingly employing retrieval-augmented generation (RAG) to improve the factuality of their responses.
In practice, these systems often need to handle ambiguous user queries and potentially conflicting information from multiple sources.
We propose RAMDocs (Retrieval with Ambiguity and Misinformation in Documents), a new dataset that simulates complex and realistic scenarios for conflicting evidence for a user query.
arXiv Detail & Related papers (2025-04-17T16:46:11Z) - Breaking Focus: Contextual Distraction Curse in Large Language Models [68.4534308805202]
We investigate a critical vulnerability in Large Language Models (LLMs)<n>This phenomenon arises when models fail to maintain consistent performance on questions modified with semantically coherent but irrelevant context.<n>We propose an efficient tree-based search methodology to automatically generate CDV examples.
arXiv Detail & Related papers (2025-02-03T18:43:36Z) - Analysing the Residual Stream of Language Models Under Knowledge Conflicts [23.96385393039587]
Large language models (LLMs) can store a significant amount of factual knowledge in their parameters.
However, their parametric knowledge may conflict with the information provided in the context.
This can lead to undesirable model behaviour, such as reliance on outdated or incorrect information.
arXiv Detail & Related papers (2024-10-21T15:12:51Z) - Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs [55.74117540987519]
This paper explores the problem of commonsense-level vision-knowledge conflict in Multimodal Large Language Models (MLLMs)
We introduce an automated pipeline, augmented with human-in-the-loop quality control, to establish a benchmark aimed at simulating and assessing the conflicts in MLLMs.
We evaluate the conflict-resolution capabilities of nine representative MLLMs across various model families and find a noticeable over-reliance on textual queries.
arXiv Detail & Related papers (2024-10-10T17:31:17Z) - ECon: On the Detection and Resolution of Evidence Conflicts [56.89209046429291]
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems.
This study introduces a method for generating diverse, validated evidence conflicts to simulate real-world misinformation scenarios.
arXiv Detail & Related papers (2024-10-05T07:41:17Z) - Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models [33.76903352835436]
Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities for capturing and reasoning over multimodal inputs.
These models are prone to parametric knowledge conflicts, which arise from inconsistencies of represented knowledge between their vision and language components.
We present a systematic approach to detect, interpret, and mitigate them.
arXiv Detail & Related papers (2024-10-04T17:59:28Z) - AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge [57.66282463340297]
Knowledge conflict arises from discrepancies between information in the context of a large language model (LLM) and the knowledge stored in its parameters.
We propose a fine-grained, instance-level approach called AdaCAD, which dynamically infers the weight of adjustment based on the degree of conflict.
arXiv Detail & Related papers (2024-09-11T16:35:18Z) - ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM [36.332500824079844]
Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts has rarely been studied.
We present ConflictBank, the first comprehensive benchmark developed to evaluate knowledge conflicts from three aspects.
Our investigation delves into four model families and twelve LLM instances, meticulously analyzing conflicts stemming from misinformation, temporal discrepancies, and semantic divergences.
arXiv Detail & Related papers (2024-08-22T02:33:13Z) - Resolving Knowledge Conflicts in Large Language Models [46.903549751371415]
Large language models (LLMs) often encounter knowledge conflicts.
We ask what are the desiderata for LLMs when a knowledge conflict arises and whether existing LLMs fulfill them.
We introduce an evaluation framework for simulating contextual knowledge conflicts.
arXiv Detail & Related papers (2023-10-02T06:57:45Z) - Trusting Your Evidence: Hallucinate Less with Context-aware Decoding [91.91468712398385]
Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.
We present context-aware decoding (CAD), which follows a contrastive output distribution that amplifies the difference between the probabilities when a model is used with and without context.
arXiv Detail & Related papers (2023-05-24T05:19:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.