(Fact) Check Your Bias
- URL: http://arxiv.org/abs/2506.21745v1
- Date: Thu, 26 Jun 2025 20:03:58 GMT
- Title: (Fact) Check Your Bias
- Authors: Eivind Morris Bakke, Nora Winger Heggelund
- Abstract summary: We investigate how parametric knowledge biases affect fact-checking outcomes of the HerO system (baseline for FEVER-25). When prompted directly to perform fact verification, Llama 3.1 labels nearly half the claims as "Not Enough Evidence". In the second experiment, we prompt the model to generate supporting, refuting, or neutral fact-checking documents. These prompts significantly influence retrieval outcomes, with approximately 50% of retrieved evidence being unique to each perspective. Despite differences in retrieved evidence, final verdict predictions show stability across prompting strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic fact verification systems increasingly rely on large language models (LLMs). We investigate how parametric knowledge biases in these models affect fact-checking outcomes of the HerO system (baseline for FEVER-25). We examine how the system is affected by: (1) potential bias in Llama 3.1's parametric knowledge and (2) intentionally injected bias. When prompted directly to perform fact verification, Llama 3.1 labels nearly half the claims as "Not Enough Evidence". Using only its parametric knowledge, it is able to reach a verdict on the remaining half of the claims. In the second experiment, we prompt the model to generate supporting, refuting, or neutral fact-checking documents. These prompts significantly influence retrieval outcomes, with approximately 50% of retrieved evidence being unique to each perspective. Notably, the model sometimes refuses to generate supporting documents for claims it believes to be false, creating an inherent negative bias. Despite differences in retrieved evidence, final verdict predictions show stability across prompting strategies. The code is available at: https://github.com/eibakke/FEVER-8-Shared-Task
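The second experiment amounts to perspective-conditioned hypothetical-document retrieval. Below is a minimal sketch of both experiments, assuming a hypothetical `generate` LLM call and `retrieve` search function; the prompt wording and label set are illustrative assumptions, and the actual implementation is in the linked repository.

```python
# Minimal sketch (not the authors' HerO/FEVER-8 code) of the two experiments in the abstract:
# (1) a verdict from parametric knowledge alone, and (2) supporting / refuting / neutral
# hypothetical fact-checking documents driving retrieval, with per-perspective overlap measured.
# `generate` and `retrieve` are assumed callables; prompts and labels are illustrative.

PERSPECTIVES = {
    "supporting": "Write a fact-checking article that supports the claim.",
    "refuting": "Write a fact-checking article that refutes the claim.",
    "neutral": "Write a neutral fact-checking article about the claim.",
}

def parametric_verdict(claim: str, generate) -> str:
    """Experiment 1: ask for a verdict using parametric knowledge only (no retrieval)."""
    prompt = (
        "Label the claim as Supported, Refuted, Conflicting Evidence, "
        "or Not Enough Evidence, using only what you already know.\n"
        f"Claim: {claim}\nLabel:"
    )
    return generate(prompt).strip()

def perspective_retrieval(claim: str, generate, retrieve, k: int = 10):
    """Experiment 2: perspective-conditioned hypothetical documents drive retrieval."""
    retrieved = {}
    for name, instruction in PERSPECTIVES.items():
        hypothetical_doc = generate(f"{instruction}\nClaim: {claim}")
        retrieved[name] = set(retrieve(hypothetical_doc, k=k))  # set of document ids
    # Evidence unique to each perspective (the abstract reports roughly 50%).
    unique = {
        name: docs - set.union(*(d for n, d in retrieved.items() if n != name))
        for name, docs in retrieved.items()
    }
    return retrieved, unique
```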
Related papers
- The Missing Parts: Augmenting Fact Verification with Half-Truth Detection [8.080157788477347]
Many real-world claims are half-truths, factually correct yet misleading due to the omission of critical context.
We introduce the task of half-truth detection and propose PolitiFact-Hidden, a new benchmark with 15k political claims annotated with sentence-level evidence alignment and inferred claim intent.
We present TRACER, a modular re-assessment framework that identifies omission-based misinformation by aligning evidence, inferring implied intent, and estimating the causal impact of hidden content.
arXiv Detail & Related papers (2025-08-01T10:06:38Z)
- Retrieving Versus Understanding Extractive Evidence in Few-Shot Learning [4.230202411425062]
We analyze the relationship between the retrieval and interpretation of within-document evidence for large language models.
We perform two ablation studies to investigate when both label prediction and evidence retrieval errors can be attributed to qualities of the relevant evidence.
arXiv Detail & Related papers (2025-02-19T20:48:09Z)
- Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z)
- Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment [27.455646975256986]
Causal Walk is a novel method for debiasing multi-hop fact verification from a causal perspective, using front-door adjustment (the standard formula is sketched after this entry).
Results show that Causal Walk outperforms some previous debiasing methods on both existing datasets and newly constructed datasets.
arXiv Detail & Related papers (2024-03-05T06:28:02Z)
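For background on the Causal Walk entry above, front-door adjustment is the standard causal-inference identity for estimating the effect of an input x on a verdict y through a mediator m (here, the multi-hop reasoning path). The formula below is the textbook form, not necessarily the paper's exact estimator:

```latex
P(y \mid \mathrm{do}(x)) \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x')
```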
- Quantifying Bias in Text-to-Image Generative Models [49.60774626839712]
Bias in text-to-image (T2I) models can propagate unfair social representations and may be used to aggressively market ideas or push controversial agendas.
Existing T2I model bias evaluation methods only focus on social biases.
We propose an evaluation methodology to quantify general biases in T2I generative models, without any preconceived notions.
arXiv Detail & Related papers (2023-12-20T14:26:54Z)
- BeMap: Balanced Message Passing for Fair Graph Neural Network [50.910842893257275]
We show that message passing could amplify the bias when the 1-hop neighbors from different demographic groups are unbalanced.
We propose BeMap, a fair message passing method that balances the number of 1-hop neighbors of each node across demographic groups (a rough sketch of this balancing step follows this entry).
arXiv Detail & Related papers (2023-06-07T02:16:36Z)
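A rough sketch of the neighbor-balancing idea described in the BeMap entry above, assuming hypothetical `neighbors` and `group` mappings; this illustrates the general balancing step, not the paper's actual algorithm:

```python
import random
from collections import defaultdict

def balanced_neighbors(node, neighbors, group, seed=0):
    """Sample an equal number of 1-hop neighbors from each demographic group,
    so that no group dominates the messages aggregated for `node`."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for nb in neighbors[node]:          # neighbors: dict[node -> list of adjacent nodes]
        by_group[group[nb]].append(nb)  # group: dict[node -> demographic group]
    if not by_group:
        return []
    k = min(len(members) for members in by_group.values())  # balance to the smallest group
    sampled = []
    for members in by_group.values():
        sampled.extend(rng.sample(members, k))
    return sampled
```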
- Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z)
- Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence [59.81749318292707]
We propose a fact verification model named ReRead to retrieve evidence and verify claims.
The proposed system achieves significant improvements over the best-reported models under different settings.
arXiv Detail & Related papers (2023-05-02T03:23:14Z)
- Automatic Fake News Detection: Are Models Learning to Reason? [9.143551270841858]
We investigate the relationship and importance of both claim and evidence.
Surprisingly, we find that, on political fact-checking datasets, the highest effectiveness is most often obtained by using only the evidence.
This highlights an important problem in what constitutes evidence in existing approaches for automatic fake news detection.
arXiv Detail & Related papers (2021-05-17T09:34:03Z)
- AmbiFC: Fact-Checking Ambiguous Claims with Evidence [57.7091560922174]
We present AmbiFC, a fact-checking dataset with 10k claims derived from real-world information needs.
We analyze disagreements arising from ambiguity when comparing claims against evidence in AmbiFC.
We develop models that predict veracity while handling this ambiguity via soft labels (a minimal sketch follows this entry).
arXiv Detail & Related papers (2021-04-01T17:40:08Z)
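A minimal sketch of the soft-label setup mentioned in the AmbiFC entry above, assuming each claim carries a distribution over verdict labels derived from annotator disagreement; this illustrates the general technique, not the paper's model:

```python
import torch
import torch.nn.functional as F

def soft_label_loss(logits: torch.Tensor, label_dist: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against a distribution of annotator labels instead of a one-hot target."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(label_dist * log_probs).sum(dim=-1).mean()

# Example: 3 of 4 annotators said Supported, 1 said Not Enough Info,
# over the label set [Supported, Refuted, Not Enough Info].
label_dist = torch.tensor([[0.75, 0.0, 0.25]])
logits = torch.randn(1, 3)  # classifier output for one claim
loss = soft_label_loss(logits, label_dist)
```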
This list is automatically generated from the titles and abstracts of the papers in this site.