FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs
- URL: http://arxiv.org/abs/2402.05904v1
- Date: Thu, 8 Feb 2024 18:43:05 GMT
- Title: FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs
- Authors: Eun Cheol Choi, Emilio Ferrara
- Abstract summary: FACT-GPT identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims.
Our evaluation shows that our specialized LLMs can match the accuracy of larger models in identifying related claims.
- Score: 11.323961700172175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our society is facing rampant misinformation harming public health and trust.
To address this societal challenge, we introduce FACT-GPT, a system leveraging
Large Language Models (LLMs) to automate the claim matching stage of
fact-checking. FACT-GPT, trained on a synthetic dataset, identifies social
media content that aligns with, contradicts, or is irrelevant to previously
debunked claims. Our evaluation shows that our specialized LLMs can match the
accuracy of larger models in identifying related claims, closely mirroring
human judgment. This research provides an automated solution for efficient
claim matching, demonstrates the potential of LLMs in supporting fact-checkers,
and offers valuable resources for further research in the field.
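As an illustration of the claim-matching step described above, here is a minimal sketch that asks an LLM to label a (post, debunked claim) pair as aligning, contradicting, or irrelevant. The prompt wording, label names, and the generic `generate` callable are illustrative assumptions, not the exact setup used by FACT-GPT.

```python
from typing import Callable

LABELS = ("ALIGNS", "CONTRADICTS", "IRRELEVANT")

PROMPT_TEMPLATE = (
    "You assist fact-checkers by matching social media posts to previously "
    "debunked claims.\n"
    "Debunked claim: {claim}\n"
    "Post: {post}\n"
    "Answer with exactly one word: ALIGNS, CONTRADICTS, or IRRELEVANT."
)

def match_claim(post: str, claim: str, generate: Callable[[str], str]) -> str:
    """Label a (post, claim) pair using any text-generation backend."""
    reply = generate(PROMPT_TEMPLATE.format(claim=claim, post=post)).strip().upper()
    for label in LABELS:
        if label in reply:
            return label
    # Conservative fallback when the model's reply cannot be parsed.
    return "IRRELEVANT"
```

The same function can wrap either a fine-tuned specialized model or a larger off-the-shelf LLM, which mirrors the comparison reported in the abstract.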
Related papers
- Towards Automated Fact-Checking of Real-World Claims: Exploring Task Formulation and Assessment with LLMs [32.45604456988931]
This study establishes baseline comparisons for Automated Fact-Checking (AFC) using Large Language Models (LLMs).
We evaluate Llama-3 models of varying sizes on 17,856 claims collected from PolitiFact (2007-2024) using evidence retrieved via restricted web searches.
Our results show that larger LLMs consistently outperform smaller LLMs in classification accuracy and justification quality without fine-tuning.
arXiv Detail & Related papers (2025-02-13T02:51:17Z)
- Preference Leakage: A Contamination Problem in LLM-as-a-judge [69.96778498636071]
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods.
In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators.
arXiv Detail & Related papers (2025-02-03T17:13:03Z)
- Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification [0.0]
This study evaluates the efficacy of Large Language Models (LLMs) as innovative solutions for mitigating misinformation on platforms like Twitter.
LLMs offer a pre-trained, adaptable approach that bypasses the extensive training and overfitting issues associated with traditional machine learning models.
We present a comparative analysis of LLMs' performance using a specialized dataset and propose a framework for their application in public health communication.
arXiv Detail & Related papers (2024-12-21T05:02:26Z)
- Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing [2.936331223824117]
The use of Large Language Models (LLMs) for automated text annotation of social media posts has garnered significant interest.
We analyze the performance of eight open-source and proprietary LLMs for annotating the stance expressed in social media posts.
A significant finding of our study is that the explicitness of text expressing a stance plays a critical role in how faithfully LLMs' stance judgments match humans'.
arXiv Detail & Related papers (2024-06-11T17:26:07Z)
- Missci: Reconstructing Fallacies in Misrepresented Science [84.32990746227385]
Health-related misinformation on social networks can lead to poor decision-making and real-world dangers.
Missci is a novel argumentation theoretical model for fallacious reasoning.
We present Missci as a dataset to test the critical reasoning abilities of large language models.
arXiv Detail & Related papers (2024-06-05T12:11:10Z)
- CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models [60.59638232596912]
We introduce CLAMBER, a benchmark for evaluating how well large language models (LLMs) identify and clarify ambiguous user queries, organized around a taxonomy of ambiguity.
Building upon the taxonomy, we construct 12K high-quality examples to assess the strengths, weaknesses, and potential risks of various off-the-shelf LLMs.
Our findings indicate the limited practical utility of current LLMs in identifying and clarifying ambiguous user queries.
arXiv Detail & Related papers (2024-05-20T14:34:01Z)
- Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-based Retrofitting [51.7049140329611]
This paper proposes Knowledge Graph-based Retrofitting (KGR) to mitigate factual hallucination during the reasoning process.
Experiments show that KGR can significantly improve the performance of LLMs on factual QA benchmarks.
arXiv Detail & Related papers (2023-11-22T11:08:38Z)
- The Perils & Promises of Fact-checking with Large Language Models [55.869584426820715]
Large Language Models (LLMs) are increasingly trusted to write academic papers, lawsuits, and news articles.
We evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions; a minimal sketch of this pipeline appears after this entry.
Our results show that LLMs perform better when equipped with contextual information.
While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy.
arXiv Detail & Related papers (2023-10-20T14:49:47Z)
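The agent pipeline summarized in the entry above (phrase queries, retrieve context, decide) can be sketched as follows; the function names, the `search` backend, and the verdict labels are hypothetical placeholders under stated assumptions, not the paper's actual implementation.

```python
from typing import Callable, List

def fact_check(claim: str,
               generate: Callable[[str], str],
               search: Callable[[str], List[str]],
               n_queries: int = 3) -> str:
    """Query phrasing -> retrieval -> verdict, with pluggable LLM and search backends."""
    # 1) Ask the LLM to phrase search queries for the claim.
    raw = generate(f"Write {n_queries} short web-search queries to verify: {claim}")
    queries = [q.strip() for q in raw.splitlines() if q.strip()][:n_queries]
    # 2) Retrieve contextual evidence for each query.
    evidence = [snippet for q in queries for snippet in search(q)]
    # 3) Ask the LLM for a verdict grounded in the retrieved evidence.
    verdict = generate(
        "Evidence:\n" + "\n".join(evidence[:10])
        + f"\n\nClaim: {claim}\nVerdict (Supported / Refuted / Not enough evidence):"
    )
    return verdict.strip()
```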
- ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework that uses prompt chaining to perturb the original evidence and generate new test cases; a sketch of this prompt chain appears after this entry.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
Our generated data is human-readable and useful for triggering hallucination in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
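A rough sketch of the evidence-perturbation idea in the ReEval entry above: chain two prompts, first rewriting the gold evidence so it supports a different answer, then re-posing the original question against the perturbed passage. Both prompts and the function name are assumptions for illustration, not ReEval's actual prompt chain.

```python
from typing import Callable, Dict

def perturb_case(question: str, evidence: str, answer: str,
                 generate: Callable[[str], str]) -> Dict[str, str]:
    """Two-step prompt chain that turns one QA example into an adversarial variant."""
    # Step 1: rewrite the evidence so it plausibly supports a different answer.
    new_evidence = generate(
        "Rewrite the passage so it plausibly supports a different answer to the "
        f"question, keeping the style intact.\nQuestion: {question}\nPassage: {evidence}"
    )
    # Step 2: ask what answer the perturbed passage now supports (the new reference).
    new_answer = generate(
        f"Passage: {new_evidence}\nQuestion: {question}\nAnswer briefly:"
    )
    return {
        "question": question,
        "evidence": new_evidence.strip(),
        "answer": new_answer.strip(),
        "original_answer": answer,
    }
```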
- Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation [11.323961700172175]
FACT-GPT is a framework designed to automate the claim matching phase of fact-checking using Large Language Models.
This framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers.
We evaluated FACT-GPT on an extensive dataset of social media content related to public health.
arXiv Detail & Related papers (2023-10-13T16:21:07Z)
- Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility [37.682136465784254]
We conduct over a million queries against mainstream large language models (LLMs), including ChatGPT, LLaMA, and OPT.
We find that ChatGPT is still capable of yielding the correct answer even when the input is polluted at an extreme level.
We propose a novel index, associated with a dataset, that roughly indicates the feasibility of using such data for LLM-involved evaluation.
arXiv Detail & Related papers (2023-05-15T15:44:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.