A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages
- URL: http://arxiv.org/abs/2504.20668v1
- Date: Tue, 29 Apr 2025 11:49:05 GMT
- Title: A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages
- Authors: Ivan Vykopal, Martin Hyben, Robert Moro, Michal Gregor, Jakub Simko,
- Abstract summary: This research introduces an approach that retrieves previously fact-checked claims, evaluates their relevance to a given input, and provides supplementary information to support fact-checkers. Our method employs large language models (LLMs) to filter irrelevant fact-checks and generate concise summaries and explanations. Our results demonstrate that LLMs are able to filter out many irrelevant fact-checks and, therefore, reduce effort and streamline the fact-checking process.
- Score: 1.3331869040581863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online disinformation poses a global challenge, placing significant demands on fact-checkers who must verify claims efficiently to prevent the spread of false information. A major issue in this process is the redundant verification of already fact-checked claims, which increases workload and delays responses to newly emerging claims. This research introduces an approach that retrieves previously fact-checked claims, evaluates their relevance to a given input, and provides supplementary information to support fact-checkers. Our method employs large language models (LLMs) to filter irrelevant fact-checks and generate concise summaries and explanations, enabling fact-checkers to assess more quickly whether a claim has been verified before. In addition, we evaluate our approach through both automatic and human assessments, in which humans interact with the developed tool to review its effectiveness. Our results demonstrate that LLMs are able to filter out many irrelevant fact-checks and, therefore, reduce effort and streamline the fact-checking process.
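The pipeline described in the abstract can be sketched in a few lines: embed the input post and the database of previously fact-checked claims, retrieve the most similar fact-checks, then ask an LLM to filter out irrelevant ones and summarize the rest. The sketch below is illustrative only; the embedding model, the LLM call, and all names (embed, llm, retrieve_fact_checks, filter_and_summarize) are assumptions, not the authors' implementation.

```python
# Minimal sketch of the retrieve -> filter -> summarize pipeline described in the
# abstract. All names and prompts are illustrative assumptions.
import numpy as np

def embed(texts):
    """Placeholder: return one dense vector per text, e.g. from a multilingual
    sentence-embedding model. Only the interface is fixed here."""
    raise NotImplementedError

def llm(prompt: str) -> str:
    """Placeholder for a call to any instruction-tuned LLM (chat completion)."""
    raise NotImplementedError

def retrieve_fact_checks(claim: str, fact_checks: list[str], k: int = 10) -> list[str]:
    """Step 1: embed the input claim and the fact-checked claims, return the top-k
    most similar fact-checks by cosine similarity."""
    vectors = np.asarray(embed([claim] + fact_checks))
    query, docs = vectors[0], vectors[1:]
    sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [fact_checks[i] for i in top]

def filter_and_summarize(claim: str, candidates: list[str]) -> list[dict]:
    """Steps 2-3: ask the LLM whether each retrieved fact-check addresses the same
    claim; keep only relevant ones and attach a short summary/explanation."""
    results = []
    for fact_check in candidates:
        verdict = llm(
            f"Input claim: {claim}\nFact-check: {fact_check}\n"
            "Does this fact-check address the same claim? Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            summary = llm(
                f"In two sentences, explain why this fact-check is relevant to the "
                f"claim '{claim}':\n{fact_check}"
            )
            results.append({"fact_check": fact_check, "summary": summary})
    return results
```

In this sketch, the summaries and explanations returned by filter_and_summarize stand in for the supplementary information the abstract says is shown to fact-checkers.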
Related papers
- Fact-checking AI-generated news reports: Can LLMs catch their own lies? [4.232709762282742]
We evaluate whether Large Language Models (LLMs) can effectively fact-check their own content.
LLMs are more effective at assessing claims in national or international news stories than in local news stories.
We find that incorporating retrieved results from a search engine in a Retrieval-Augmented Generation setting significantly reduces the number of claims an LLM cannot assess.
arXiv Detail & Related papers (2025-03-24T02:32:02Z)
- Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking [43.300457630671154]
Large language models and generative AI in online media have amplified the need for effective automated fact-checking. It is unclear, however, how the explanations produced by such systems should align with the decision-making and reasoning processes of fact-checkers.
arXiv Detail & Related papers (2025-02-13T08:56:25Z)
- Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies [66.30619782227173]
Large language models (LLMs) can produce erroneous responses that sound fluent and convincing.
We identify several features of LLM responses that shape users' reliance.
We find that explanations increase reliance on both correct and incorrect responses.
We observe less reliance on incorrect responses when sources are provided or when explanations exhibit inconsistencies.
arXiv Detail & Related papers (2025-02-12T16:35:41Z)
- FactLens: Benchmarking Fine-Grained Fact Verification [6.814173254027381]
We advocate for a shift toward fine-grained verification, where complex claims are broken down into smaller sub-claims for individual verification. We introduce FactLens, a benchmark for evaluating fine-grained fact verification, with metrics and automated evaluators of sub-claim quality. Our results show alignment between automated FactLens evaluators and human judgments, and we discuss the impact of sub-claim characteristics on the overall verification performance.
arXiv Detail & Related papers (2024-11-08T21:26:57Z)
- Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z)
- FactFinders at CheckThat! 2024: Refining Check-worthy Statement Detection with LLMs through Data Pruning [43.82613670331329]
This study investigates the application of open-source language models to identify check-worthy statements from political transcriptions.
We propose a two-step data pruning approach to automatically identify high-quality training data instances for effective learning.
Our team ranked first in the check-worthiness estimation task in the English language.
arXiv Detail & Related papers (2024-06-26T12:31:31Z)
- From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms.
We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation.
Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
arXiv Detail & Related papers (2023-10-22T16:07:06Z)
- The Perils & Promises of Fact-checking with Large Language Models [55.869584426820715]
Large Language Models (LLMs) are increasingly trusted to write academic papers, lawsuits, and news articles.
We evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions.
Our results show that equipping LLMs with retrieved contextual information significantly improves their fact-checking performance.
While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy.
arXiv Detail & Related papers (2023-10-20T14:49:47Z)
- Synthetic Disinformation Attacks on Automated Fact Verification Systems [53.011635547834025]
We explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings.
We show that these systems suffer significant performance drops against these attacks.
We discuss the growing threat of modern NLG systems as generators of disinformation.
arXiv Detail & Related papers (2022-02-18T19:01:01Z)
- Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is understanding how to automate the most elaborate part of the process: generating justifications for the verdicts on claims.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system (a minimal joint-training sketch follows this entry).
arXiv Detail & Related papers (2020-04-13T05:23:25Z)
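The last entry's finding, that optimising explanation generation and veracity prediction jointly beats training them separately, amounts to a multi-task objective. The PyTorch-style sketch below is a minimal illustration under assumed architecture choices (a shared encoder, two linear heads, a weighted sum of the two losses); it is not the paper's exact model.

```python
# Minimal multi-task sketch of the "optimise both objectives jointly" idea:
# a shared encoder feeds both a veracity classifier and an explanation head,
# and the two losses are combined into one training objective.
# The encoder, heads, and loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class JointFactCheckModel(nn.Module):
    def __init__(self, encoder: nn.Module, hidden: int, n_labels: int, vocab: int):
        super().__init__()
        self.encoder = encoder                              # shared claim/evidence encoder
        self.veracity_head = nn.Linear(hidden, n_labels)    # e.g. true / false / mixed
        self.explanation_head = nn.Linear(hidden, vocab)    # token-level explanation scores

    def forward(self, inputs):
        h = self.encoder(inputs)                            # (batch, seq, hidden)
        veracity_logits = self.veracity_head(h[:, 0])       # pooled representation
        explanation_logits = self.explanation_head(h)       # per-token predictions
        return veracity_logits, explanation_logits

def joint_loss(veracity_logits, explanation_logits,
               veracity_labels, explanation_labels, alpha: float = 0.5):
    """Combine both objectives in one loss instead of training them separately."""
    ce = nn.CrossEntropyLoss()
    loss_veracity = ce(veracity_logits, veracity_labels)
    loss_explanation = ce(explanation_logits.flatten(0, 1), explanation_labels.flatten())
    return alpha * loss_veracity + (1 - alpha) * loss_explanation
```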