Automated Claim Matching with Large Language Models: Empowering
Fact-Checkers in the Fight Against Misinformation
- URL: http://arxiv.org/abs/2310.09223v1
- Date: Fri, 13 Oct 2023 16:21:07 GMT
- Title: Automated Claim Matching with Large Language Models: Empowering
Fact-Checkers in the Fight Against Misinformation
- Authors: Eun Cheol Choi and Emilio Ferrara
- Abstract summary: FACT-GPT is a framework designed to automate the claim matching phase of fact-checking using Large Language Models.
This framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers.
We evaluated FACT-GPT on an extensive dataset of social media content related to public health.
- Score: 11.323961700172175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In today's digital era, the rapid spread of misinformation poses threats to
public well-being and societal trust. As online misinformation proliferates,
manual verification by fact checkers becomes increasingly challenging. We
introduce FACT-GPT (Fact-checking Augmentation with Claim matching
Task-oriented Generative Pre-trained Transformer), a framework designed to
automate the claim matching phase of fact-checking using Large Language Models
(LLMs). This framework identifies new social media content that either supports
or contradicts claims previously debunked by fact-checkers. Our approach
employs GPT-4 to generate a labeled dataset consisting of simulated social
media posts. This dataset serves as a training ground for fine-tuning more
specialized LLMs. We evaluated FACT-GPT on an extensive dataset of social media
content related to public health. The results indicate that our fine-tuned LLMs
rival the performance of larger pre-trained LLMs in claim matching tasks,
aligning closely with human annotations. This study achieves three key
milestones: it provides an automated framework for enhanced fact-checking; it
demonstrates the potential of LLMs to complement human expertise; and it offers
public resources, including datasets and models, to further research and
applications in the fact-checking domain.
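The claim matching step described in the abstract can be pictured as a small classification loop. The following is a minimal sketch, not the authors' implementation: the label names, the prompt wording, the helper names (`build_prompt`, `parse_label`, `match_claims`, `agreement`), and the keyword-based `toy_classifier` that stands in for an LLM call are all assumptions made for illustration, and the example claim and posts are invented.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical label names. The abstract only says a post may support or
# contradict a debunked claim; the third "irrelevant" class is an assumption.
LABELS = ("SUPPORTS", "CONTRADICTS", "IRRELEVANT")


def build_prompt(debunked_claim: str, post: str) -> str:
    """Format one claim/post pair as a classification prompt (assumed wording)."""
    return (
        "You are assisting a fact-checker.\n"
        f"Debunked claim: {debunked_claim}\n"
        f"Social media post: {post}\n"
        f"Answer with exactly one of: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> str:
    """Map a free-text model reply to a label, defaulting to IRRELEVANT."""
    text = response.strip().upper()
    for label in LABELS:
        if text.startswith(label):
            return label
    return "IRRELEVANT"


def match_claims(
    debunked_claim: str,
    posts: Iterable[str],
    classify: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Label every post against one debunked claim using `classify`."""
    return [
        (post, parse_label(classify(build_prompt(debunked_claim, post))))
        for post in posts
    ]


def agreement(predicted: Sequence[str], human: Sequence[str]) -> float:
    """Fraction of posts where the predicted label equals the human label."""
    if len(predicted) != len(human) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == h for p, h in zip(predicted, human)) / len(human)


def toy_classifier(prompt: str) -> str:
    """Keyword stand-in for an LLM call, only so the sketch runs offline."""
    post = next(
        line for line in prompt.splitlines()
        if line.startswith("Social media post: ")
    ).lower()
    if "vaccine" not in post:
        return "irrelevant"
    if "no evidence" in post:
        return "Contradicts. The post rejects the claim."
    return "SUPPORTS"


if __name__ == "__main__":
    claim = "The vaccine causes infertility."  # invented example claim
    posts = [
        "Heard the vaccine causes infertility, stay away!",
        "There is no evidence that the vaccine causes infertility.",
        "Great weather for a run today.",
    ]
    for post, label in match_claims(claim, posts, toy_classifier):
        print(f"{label:12s} {post}")
```

In practice `toy_classifier` would be replaced by a call to a pre-trained or fine-tuned LLM, and `agreement` would compare its labels against human annotations, which is the kind of comparison the abstract reports.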
Related papers
- FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs [11.323961700172175]
FACT-GPT identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims.
Our evaluation shows that our specialized LLMs can match the accuracy of larger models in identifying related claims.
arXiv Detail & Related papers (2024-02-08T18:43:05Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
- Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning [32.52811740662061]
This article introduces DP-LoRA, a novel federated learning algorithm tailored for large language models (LLMs).
DP-LoRA preserves individual data privacy by employing a Gaussian mechanism that adds noise to weight updates while still facilitating collaborative model training.
arXiv Detail & Related papers (2023-12-29T06:50:38Z)
- Countering Misinformation via Emotional Response Generation [15.383062216223971]
The proliferation of misinformation on social media platforms (SMPs) poses a significant danger to public health, social cohesion and democracy.
Previous research has shown how social correction can be an effective way to curb misinformation.
We present VerMouth, the first large-scale dataset comprising roughly 12 thousand claim-response pairs.
arXiv Detail & Related papers (2023-11-17T15:37:18Z)
- A Survey on Detection of LLMs-Generated Content [97.87912800179531]
The ability to detect LLMs-generated content has become of paramount importance.
We aim to provide a detailed overview of existing detection strategies and benchmarks.
We also posit the necessity for a multi-faceted approach to defend against various attacks.
arXiv Detail & Related papers (2023-10-24T09:10:26Z)
- The Perils & Promises of Fact-checking with Large Language Models [55.869584426820715]
Large Language Models (LLMs) are increasingly trusted to write academic papers, lawsuits, and news articles.
We evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions.
Our results show the enhanced prowess of LLMs when equipped with contextual information.
While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy.
arXiv Detail & Related papers (2023-10-20T14:49:47Z)
- Balanced and Explainable Social Media Analysis for Public Health with Large Language Models [13.977401672173533]
Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
To tackle these challenges, the data imbalance issue can be overcome by sophisticated data augmentation methods for social media datasets.
In this paper, a novel ALEX framework is proposed for social media analysis on public health.
arXiv Detail & Related papers (2023-09-12T04:15:34Z)
- Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- ChatGPT as your Personal Data Scientist [0.9689893038619583]
This paper introduces a ChatGPT-based conversational data-science framework to act as a "personal data scientist".
Our model pivots around four dialogue states: Data visualization, Task Formulation, Prediction Engineering, and Result Summary and Recommendation.
In summary, we developed an end-to-end system that not only proves the viability of the novel concept of conversational data science but also underscores the potency of LLMs in solving complex tasks.
arXiv Detail & Related papers (2023-05-23T04:00:16Z)
- FacTeR-Check: Semi-automated fact-checking through Semantic Similarity and Natural Language Inference [61.068947982746224]
FacTeR-Check enables retrieval of fact-checked information, verification of unchecked claims, and tracking of dangerous information over social media.
The architecture is validated using a new dataset called NLI19-SP that is publicly released with COVID-19 related hoaxes and tweets from Spanish social media.
Our results show state-of-the-art performance on the individual benchmarks, as well as producing useful analysis of the evolution over time of 61 different hoaxes.
arXiv Detail & Related papers (2021-10-27T15:44:54Z)