Transforming Fake News: Robust Generalisable News Classification Using
Transformers
- URL: http://arxiv.org/abs/2109.09796v1
- Date: Mon, 20 Sep 2021 19:03:16 GMT
- Title: Transforming Fake News: Robust Generalisable News Classification Using
Transformers
- Authors: Ciara Blackledge and Amir Atapour-Abarghouei
- Abstract summary: Using the publicly available ISOT and Combined Corpus datasets, this study explores transformers' abilities to identify fake news.
We propose a novel two-step classification pipeline to remove opinion-based articles from both model training and the final deployed inference system.
Experiments over the ISOT and Combined Corpus datasets show that transformers achieve an increase in F1 scores of up to 4.9% for out-of-distribution generalisation.
- Score: 8.147652597876862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As online news has become increasingly popular and fake news increasingly
prevalent, the ability to audit the veracity of online news content has become
more important than ever. Such a task represents a binary classification
challenge, for which transformers have achieved state-of-the-art results. Using
the publicly available ISOT and Combined Corpus datasets, this study explores
transformers' abilities to identify fake news, with particular attention given
to investigating generalisation to unseen datasets with varying styles, topics
and class distributions. Moreover, we explore the idea that opinion-based news
articles cannot be classified as real or fake due to their subjective nature
and often sensationalised language, and propose a novel two-step classification
pipeline to remove such articles from both model training and the final
deployed inference system. Experiments over the ISOT and Combined Corpus
datasets show that transformers achieve an increase in F1 scores of up to 4.9%
for out-of-distribution generalisation compared to baseline approaches, with a
further increase of 10.1% following the implementation of our two-step
classification pipeline. To the best of our knowledge, this study is the first
to investigate generalisation of transformers in this context.
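The two-step design described in the abstract lends itself to a simple chained-classifier implementation. Below is a minimal sketch of such a pipeline using the Hugging Face transformers library; the checkpoint names and the OPINION label are hypothetical placeholders, not models released by the authors.

```python
# Minimal sketch of a two-step inference pipeline: filter out opinion
# pieces first, then classify the remainder as real or fake.
# NOTE: both checkpoint names below are hypothetical placeholders.
from transformers import pipeline

opinion_filter = pipeline("text-classification",
                          model="your-org/opinion-vs-news")       # placeholder
fake_news_clf = pipeline("text-classification",
                         model="your-org/fake-news-transformer")  # placeholder

def classify_article(text: str) -> str:
    """Return 'opinion', or the real/fake label, for one article."""
    # Step 1: opinion pieces are excluded rather than labelled real/fake.
    if opinion_filter(text, truncation=True)[0]["label"] == "OPINION":
        return "opinion"
    # Step 2: veracity classification on the remaining factual articles.
    return fake_news_clf(text, truncation=True)[0]["label"].lower()
```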
Related papers
- VERITAS-NLI: Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference [0.0]
The rise of fake news poses an alarming threat to the integrity of public discourse, societal trust, and reputed news sources.
We propose a novel solution leveraging web-scraping techniques and Natural Language Inference (NLI) models.
Our system is evaluated on a diverse self-curated evaluation dataset spanning multiple news channels and broad domains.
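The NLI step can be approximated with an off-the-shelf MNLI model. A rough sketch follows; the evidence string is hard-coded here as an assumption, where the described system would obtain it by automated scraping.

```python
# Rough sketch of NLI-based headline verification with the public
# roberta-large-mnli checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def verify(headline: str, evidence: str) -> str:
    # MNLI convention: premise = evidence, hypothesis = claim under test.
    inputs = tok(evidence, headline, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax())]  # e.g. ENTAILMENT

print(verify("A new budget was announced today.",
             "The finance ministry announced its annual budget on Tuesday."))
```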
arXiv Detail & Related papers (2024-10-12T09:25:12Z)
- How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models [95.44559524735308]
Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating the spread of false and harmful content.
Through an initial study of knowledge transfer, we test the limits of improving foundation model performance without continual updating.
Our results on two recent multi-modal fact-checking benchmarks, Mocheg and Fakeddit, indicate that knowledge transfer strategies can improve Fakeddit performance over the state-of-the-art by up to 1.7% and Mocheg performance by up to 2.9%.
arXiv Detail & Related papers (2024-06-29T08:39:07Z)
- FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection [54.37159298632628]
FineFake is a multi-domain knowledge-enhanced benchmark for fake news detection.
FineFake encompasses 16,909 data samples spanning six semantic topics and eight platforms.
The entire FineFake project is publicly accessible as an open-source repository.
arXiv Detail & Related papers (2024-03-30T14:39:09Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
This is the first time a simple transformer-based model has performed competitively with both families of approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- MisRoBÆRTa: Transformers versus Misinformation [0.6091702876917281]
We propose a novel transformer-based deep neural ensemble architecture for misinformation detection.
MisRoBÆRTa takes advantage of two transformers (BART & RoBERTa) to improve the classification performance.
For training and testing, we used a large real-world news articles dataset labeled with 10 classes.
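The general recipe of fusing two transformer encoders can be sketched as follows. This illustrates the idea rather than the authors' exact MisRoBÆRTa architecture; the mean pooling and head size are assumptions, with 10 outputs matching the 10-class dataset mentioned above.

```python
# Sketch: encode each article with both BART and RoBERTa, concatenate the
# pooled embeddings, and train a simple classifier head on top.
import torch
from transformers import AutoTokenizer, AutoModel

encoders = {name: (AutoTokenizer.from_pretrained(name),
                   AutoModel.from_pretrained(name))
            for name in ("facebook/bart-base", "roberta-base")}

def embed(text: str) -> torch.Tensor:
    """Concatenate mean-pooled embeddings from both encoders."""
    parts = []
    for tok, model in encoders.values():
        inputs = tok(text, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            out = model(**inputs)
        # BART returns a seq2seq output; prefer its encoder states.
        hidden = getattr(out, "encoder_last_hidden_state", None)
        if hidden is None:
            hidden = out.last_hidden_state
        parts.append(hidden.mean(dim=1).squeeze(0))
    return torch.cat(parts)

head = torch.nn.Linear(embed("probe").shape[0], 10)  # 10 news classes
logits = head(embed("Some news article text ..."))
```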
arXiv Detail & Related papers (2023-04-16T12:14:38Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis that cross-lingual evidence can be used as a feature for fake news detection is confirmed.
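One way to realise cross-lingual evidence as a feature is with a multilingual sentence encoder. The sketch below uses a public sentence-transformers checkpoint and toy evidence strings; both are assumptions rather than the paper's exact setup.

```python
# Score how well news coverage in other languages agrees with a claim;
# the similarity scores become features for a downstream classifier.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

claim = "A new vaccine was approved by the EU regulator this week."
evidence = [
    "Die EU-Behörde hat diese Woche einen neuen Impfstoff zugelassen.",  # German
    "L'agence européenne a approuvé un nouveau vaccin cette semaine.",   # French
]

emb = encoder.encode([claim] + evidence, convert_to_tensor=True)
scores = util.cos_sim(emb[0], emb[1:])  # one similarity per evidence item
print(scores)
```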
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Detecting COVID-19 Conspiracy Theories with Transformers and TF-IDF [2.3202611780303553]
We present our methods and results for three fake news detection tasks at MediaEval benchmark 2021.
We find that a pre-trained transformer yields the best validation results, but a randomly initialised transformer with a carefully designed architecture can also be trained to reach accuracies close to those of the pre-trained transformer.
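The TF-IDF side of such a comparison takes only a few lines with scikit-learn. The toy texts and labels below are illustrative assumptions, since the MediaEval 2021 task data is not bundled here.

```python
# Simple TF-IDF baseline: word n-gram features plus a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["5G towers spread the virus", "Vaccines passed clinical trials"]
labels = [1, 0]  # 1 = conspiracy, 0 = not (toy data)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["The virus is spread by 5G"]))
```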
arXiv Detail & Related papers (2022-05-01T01:48:48Z)
- A Fourier-based Framework for Domain Generalization [82.54650565298418]
Domain generalization tackles distribution shift by learning transferable knowledge from multiple source domains in order to generalize to unseen target domains.
This paper introduces a novel Fourier-based perspective for domain generalization.
Experiments on three benchmarks have demonstrated that the proposed method is able to achieve state-of-the-art performance for domain generalization.
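The core Fourier intuition is that amplitude spectra carry style while phase carries semantics, which suggests augmenting by mixing amplitudes across source domains. The sketch below follows that idea; the paper's exact mixing strategy and any low-frequency masking may differ.

```python
# Mix the amplitude spectra of two source-domain images while keeping
# each image's phase, producing a style-shifted training sample.
import numpy as np

def amplitude_mix(img_a: np.ndarray, img_b: np.ndarray, lam: float = 0.5):
    """Return img_a re-rendered with a blend of a's and b's amplitude."""
    fft_a = np.fft.fft2(img_a, axes=(0, 1))
    fft_b = np.fft.fft2(img_b, axes=(0, 1))
    amp = (1 - lam) * np.abs(fft_a) + lam * np.abs(fft_b)  # blended amplitude
    phase = np.angle(fft_a)                                # keep a's phase
    mixed = np.fft.ifft2(amp * np.exp(1j * phase), axes=(0, 1))
    return np.real(mixed)

a, b = np.random.rand(64, 64, 3), np.random.rand(64, 64, 3)
augmented = amplitude_mix(a, b, lam=0.3)
```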
arXiv Detail & Related papers (2021-05-24T06:50:30Z)
- Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection [7.29381091750894]
We propose a novel transformer-based language model fine-tuning approach for COVID-19 fake news detection.
First, the token vocabulary of each individual model is expanded to capture the actual semantics of professional phrases.
Last, the features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multilayer perceptron to integrate fine-grained and high-level specific representations.
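Both ingredients map onto standard transformers APIs: added tokens with resized embeddings, and a small fusion MLP over pooled features. The token list, pooling, and layer sizes below are illustrative assumptions.

```python
# (1) Expand the tokenizer vocabulary so domain phrases keep their meaning;
# (2) fuse pooled features from a general and a domain model with an MLP.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("roberta-base")
roberta = AutoModel.from_pretrained("roberta-base")

tok.add_tokens(["covid-19", "sars-cov-2", "hydroxychloroquine"])  # toy list
roberta.resize_token_embeddings(len(tok))

ct_name = "digitalepidemiologylab/covid-twitter-bert-v2"
ct_tok = AutoTokenizer.from_pretrained(ct_name)
ct_bert = AutoModel.from_pretrained(ct_name)

# Fusion head: RoBERTa-base (768) + CT-BERT large (1024) -> 2 classes.
fusion = torch.nn.Sequential(
    torch.nn.Linear(768 + 1024, 256), torch.nn.ReLU(), torch.nn.Linear(256, 2))

def fused_logits(text: str) -> torch.Tensor:
    with torch.no_grad():
        h1 = roberta(**tok(text, return_tensors="pt")).last_hidden_state.mean(1)
        h2 = ct_bert(**ct_tok(text, return_tensors="pt")).last_hidden_state.mean(1)
    return fusion(torch.cat([h1, h2], dim=-1))
```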
arXiv Detail & Related papers (2021-01-14T09:05:42Z)
- Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking [0.3441021278275805]
We develop a two-stage automated pipeline for COVID-19 fake news detection using state-of-the-art machine learning models for natural language processing.
The first model leverages a novel fact-checking algorithm that retrieves the facts most relevant to a user's particular COVID-19 claim.
The second model verifies the level of truth in the claim by computing the textual entailment between the claim and the true facts retrieved from a manually curated COVID-19 dataset.
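A compact sketch of the two stages follows, with TF-IDF retrieval standing in for the paper's fact-checking algorithm and a public MNLI model standing in for its entailment component; the fact list is a placeholder for the manually curated COVID-19 dataset.

```python
# Stage 1: retrieve the most relevant curated fact; Stage 2: check textual
# entailment between the retrieved fact (premise) and the claim (hypothesis).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline

facts = ["Masks reduce droplet transmission of respiratory viruses.",
         "COVID-19 vaccines underwent phase 3 clinical trials."]  # placeholder

nli = pipeline("text-classification", model="roberta-large-mnli")

def check_claim(claim: str) -> str:
    vec = TfidfVectorizer().fit(facts + [claim])
    sims = cosine_similarity(vec.transform([claim]), vec.transform(facts))[0]
    best_fact = facts[sims.argmax()]
    return nli({"text": best_fact, "text_pair": claim}, top_k=1)[0]["label"]

print(check_claim("COVID-19 vaccines were tested in clinical trials."))
```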
arXiv Detail & Related papers (2020-11-26T11:50:45Z)
- Pretrained Transformers for Text Ranking: BERT and Beyond [53.83210899683987]
This survey provides an overview of text ranking with neural network architectures known as transformers.
The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing.
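A representative pattern from the work surveyed here is monoBERT-style reranking, where a cross-encoder scores each query-document pair and first-stage candidates are re-sorted by that score. The sketch below uses a public MS MARCO cross-encoder as an illustrative stand-in.

```python
# Cross-encoder reranking: one relevance score per (query, document) pair.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

query = "is online news reliable"
docs = ["A study of fake news spread on social media.",
        "Recipe for sourdough bread."]

inputs = tok([query] * len(docs), docs, return_tensors="pt",
             padding=True, truncation=True)
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)  # higher = more relevant
ranked = [d for _, d in sorted(zip(scores.tolist(), docs), reverse=True)]
print(ranked)
```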
arXiv Detail & Related papers (2020-10-13T15:20:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.