Evaluating open-source Large Language Models for automated fact-checking
- URL: http://arxiv.org/abs/2503.05565v1
- Date: Fri, 07 Mar 2025 16:45:33 GMT
- Title: Evaluating open-source Large Language Models for automated fact-checking
- Authors: Nicolo' Fontana, Francesco Corso, Enrico Zuccolotto, Francesco Pierri,
- Abstract summary: Large Language Models (LLMs) have emerged as potential tools for automated fact-checking.<n>This study focuses on their ability to assess claims with different levels of contextual information.
- Score: 0.13499500088995461
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing prevalence of online misinformation has heightened the demand for automated fact-checking solutions. Large Language Models (LLMs) have emerged as potential tools for assisting in this task, but their effectiveness remains uncertain. This study evaluates the fact-checking capabilities of various open-source LLMs, focusing on their ability to assess claims with different levels of contextual information. We conduct three key experiments: (1) evaluating whether LLMs can identify the semantic relationship between a claim and a fact-checking article, (2) assessing models' accuracy in verifying claims when given a related fact-checking article, and (3) testing LLMs' fact-checking abilities when leveraging data from external knowledge sources such as Google and Wikipedia. Our results indicate that LLMs perform well in identifying claim-article connections and verifying fact-checked stories but struggle with confirming factual news, where they are outperformed by traditional fine-tuned models such as RoBERTa. Additionally, the introduction of external knowledge does not significantly enhance LLMs' performance, calling for more tailored approaches. Our findings highlight both the potential and limitations of LLMs in automated fact-checking, emphasizing the need for further refinements before they can reliably replace human fact-checkers.
Related papers
- Fact-checking AI-generated news reports: Can LLMs catch their own lies? [4.232709762282742]
We evaluate whether Large Language Models (LLMs) can effectively fact-check their own content.
LLMs are more effective at assessing claims in national or international news stories than in local news stories.
We find that incorporating retrieved results from a search engine in a Retrieval-Augmented Generation setting significantly reduces the number of claims an LLM cannot assess.
arXiv Detail & Related papers (2025-03-24T02:32:02Z) - Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis [78.07225438556203]
We introduce LLM-Oasis, the largest resource for training end-to-end factuality evaluators.
It is constructed by extracting claims from Wikipedia, falsifying a subset of these claims, and generating pairs of factual and unfactual texts.
We then rely on human annotators to both validate the quality of our dataset and to create a gold standard test set for factuality evaluation systems.
arXiv Detail & Related papers (2024-11-29T12:21:15Z) - LEAF: Learning and Evaluation Augmented by Fact-Checking to Improve Factualness in Large Language Models [11.453585039783901]
LEAF: Learning and Evaluation Augmented by Fact-Checking, is a novel approach designed to enhance the factual reliability of large language models (LLMs)
The first strategy, Fact-Check-Then-RAG, improves Retrieval-Augmented Generation (RAG) by incorporating fact-checking results to guide the retrieval process without updating model parameters.
The second strategy, Learning from Fact-Checks via Self-Training, involves supervised fine-tuning (SFT) on fact-checked responses or applying Simple Preference Optimization (SimPO) with fact-checking as a ranking mechanism.
arXiv Detail & Related papers (2024-10-31T00:18:05Z) - Are Large Language Models Good Fact Checkers: A Preliminary Study [26.023148371263012]
Large Language Models (LLMs) have drawn significant attention due to their outstanding reasoning capabilities and extensive knowledge repository.
This study aims to comprehensively evaluate various LLMs in tackling specific fact-checking subtasks.
arXiv Detail & Related papers (2023-11-29T05:04:52Z) - RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z) - The Perils & Promises of Fact-checking with Large Language Models [55.869584426820715]
Large Language Models (LLMs) are increasingly trusted to write academic papers, lawsuits, and news articles.
We evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions.
Our results show the enhanced prowess of LLMs when equipped with contextual information.
While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy.
arXiv Detail & Related papers (2023-10-20T14:49:47Z) - Survey on Factuality in Large Language Models: Knowledge, Retrieval and
Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs)
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.