Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking
- URL: http://arxiv.org/abs/2404.18971v1
- Date: Mon, 29 Apr 2024 13:47:04 GMT
- Title: Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking
- Authors: Zacharias Chrysidis, Stefanos-Iordanis Papadopoulos, Symeon Papadopoulos, Panagiotis C. Petrantonakis
- Abstract summary: The "CREDible, Unreliable or LEaked" (CREDULE) dataset consists of 91,632 articles classified as Credible, Unreliable, or Fact Checked (Leaked).
The EVidence VERification Network (EVVER-Net) is trained on CREDULE to detect leaked and unreliable evidence in both short and long texts.
EVVER-Net achieves up to 91.5% and 94.4% accuracy when leveraging domain credibility scores with short and long texts, respectively.
- Score: 11.891881050619457
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Automated fact-checking (AFC) is garnering increasing attention from researchers aiming to help fact-checkers combat the growing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help examine the veracity of claims, they often overlook the importance of verifying the source and quality of the collected "evidence". One overlooked challenge is the reliance on "leaked evidence", i.e., information gathered directly from fact-checking websites and used to train AFC systems, which results in an unrealistic setting for early misinformation detection. Similarly, the inclusion of information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable and Fact Checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net achieves up to 91.5% and 94.4% accuracy when leveraging domain credibility scores with short and long texts, respectively. Finally, we assess the evidence provided by widely used fact-checking datasets, including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.
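To make the evidence-filtering idea concrete, below is a minimal, hypothetical sketch of a three-way evidence classifier (Credible / Unreliable / Leaked) that combines text features with a per-domain credibility score, in the spirit of the CREDULE task described above. The toy training examples, the DOMAIN_CREDIBILITY table, and the TF-IDF plus logistic-regression model are illustrative assumptions, not the actual EVVER-Net architecture or dataset.

```python
# Illustrative sketch only: a tiny 3-way evidence classifier (credible /
# unreliable / leaked) with a per-domain credibility feature. The data,
# domains, and scores below are made up for demonstration purposes.
from urllib.parse import urlparse

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical per-domain credibility scores in [0, 1]; a real system would
# obtain these from an external media-credibility source.
DOMAIN_CREDIBILITY = {
    "example-news.org": 0.9,
    "example-blog.net": 0.3,
    "example-factcheck.com": 0.8,  # fact-checking site -> likely "leaked" evidence
}

def credibility_feature(url: str) -> float:
    """Look up a per-domain credibility score, defaulting to 0.5 (unknown)."""
    return DOMAIN_CREDIBILITY.get(urlparse(url).netloc, 0.5)

# Toy labelled articles: (text, source URL, label).
train = [
    ("The ministry confirmed the figures in an official report.",
     "https://example-news.org/a", "credible"),
    ("Shocking secret cure THEY don't want you to know about!",
     "https://example-blog.net/b", "unreliable"),
    ("Our fact-check rates this claim as false.",
     "https://example-factcheck.com/c", "leaked"),
]
texts, urls, labels = zip(*train)

# Combine TF-IDF text features with the domain credibility score.
vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(texts)
X_cred = csr_matrix(np.array([[credibility_feature(u)] for u in urls]))
X = hstack([X_text, X_cred])

clf = LogisticRegression(max_iter=1000).fit(X, labels)

def filter_evidence(text: str, url: str) -> bool:
    """Keep only evidence predicted as credible before passing it to an AFC model."""
    x = hstack([vectorizer.transform([text]),
                csr_matrix([[credibility_feature(url)]])])
    return clf.predict(x)[0] == "credible"

print(filter_evidence("An official report confirmed the figures.",
                      "https://example-news.org/d"))
```

In an end-to-end AFC pipeline, such a filter would sit between Web evidence retrieval and claim verification, discarding leaked or unreliable pages before they reach the verdict model.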
Related papers
- VeraCT Scan: Retrieval-Augmented Fake News Detection with Justifiable Reasoning [13.711292329830169]
We introduce VeraCT Scan, a novel retrieval-augmented system for fake news detection.
This system operates by extracting the core facts from a given piece of news and subsequently conducting an internet-wide search to identify corroborating or conflicting reports.
The system also provides transparent evidence and reasoning to support its conclusions, improving the interpretability of and trust in its results.
arXiv Detail & Related papers (2024-06-12T21:23:48Z)
- RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict [34.2739191920746]
High-quality evidence plays a vital role in enhancing fact-checking systems.
We propose a method based on a Large Language Model to automatically retrieve and summarize evidence from the Web.
We construct RU22Fact, a novel explainable fact-checking dataset of 16K samples on the 2022 Russia-Ukraine conflict.
arXiv Detail & Related papers (2024-03-25T11:56:29Z)
- Give Me More Details: Improving Fact-Checking with Latent Retrieval [58.706972228039604]
Evidence plays a crucial role in automated fact-checking.
Existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine.
We propose to incorporate full text from source documents as evidence and introduce two enriched datasets.
arXiv Detail & Related papers (2023-05-25T15:01:19Z)
- Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence [59.81749318292707]
We propose ReRead, a fact verification model that retrieves evidence and verifies claims.
The proposed system achieves significant improvements over the best-reported models under different settings.
arXiv Detail & Related papers (2023-05-02T03:23:14Z)
- Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation [67.69725605939315]
Misinformation emerges in times of uncertainty when credible information is limited.
This is challenging for NLP-based fact-checking as it relies on counter-evidence, which may not yet be available.
arXiv Detail & Related papers (2022-10-25T09:40:48Z)
- CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking [55.75590135151682]
CHEF is the first CHinese Evidence-based Fact-checking dataset of 10K real-world claims.
The dataset covers multiple domains, ranging from politics to public health, and provides annotated evidence retrieved from the Internet.
arXiv Detail & Related papers (2022-06-06T09:11:03Z)
- Synthetic Disinformation Attacks on Automated Fact Verification Systems [53.011635547834025]
We explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings.
We show that these systems suffer significant performance drops against these attacks.
We discuss the growing threat of modern NLG systems as generators of disinformation.
arXiv Detail & Related papers (2022-02-18T19:01:01Z)
- FaVIQ: FAct Verification from Information-seeking Questions [77.7067957445298]
We construct a large-scale fact verification dataset called FaVIQ using information-seeking questions posed by real users.
Our claims are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification.
arXiv Detail & Related papers (2021-07-05T17:31:44Z)
- Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence [32.63174559281556]
VitaminC is a benchmark infused with challenging cases that require fact verification models to discern and adjust to slight factual changes.
We collect over 100,000 Wikipedia revisions that modify an underlying fact, and leverage these revisions to create over 400,000 claim-evidence pairs.
We show that training with this design increases robustness, improving accuracy by 10% on adversarial fact verification and 6% on adversarial natural language inference.
arXiv Detail & Related papers (2021-03-15T17:05:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.