VERITAS-NLI : Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference
- URL: http://arxiv.org/abs/2410.09455v1
- Date: Sat, 12 Oct 2024 09:25:12 GMT
- Title: VERITAS-NLI : Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference
- Authors: Arjun Shah, Hetansh Shah, Vedica Bafna, Charmi Khandor, Sindhu Nair
- Abstract summary: The rise of fake news poses an alarming threat to the integrity of public discourse, societal trust, and reputed news sources.
We propose our novel solution, leveraging web-scraping techniques and Natural Language Inference (NLI) models.
Our system is evaluated on a diverse self-curated evaluation dataset spanning multiple news channels and broad domains.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In an age where information spreads rapidly through online platforms, the rise of fake news poses an alarming threat to the integrity of public discourse, societal trust, and reputed news sources. Classical machine learning and Transformer-based models have been extensively studied for the task of fake news detection; however, they are hampered by their reliance on training data and are unable to generalize to unseen headlines. To address these challenges, we propose our novel solution, leveraging web-scraping techniques and Natural Language Inference (NLI) models to retrieve external knowledge necessary for verifying the accuracy of a headline. Our system is evaluated on a diverse self-curated evaluation dataset spanning multiple news channels and broad domains. Our best performing pipeline achieves an accuracy of 84.3%, surpassing the best classical Machine Learning model by 33.3% and Bidirectional Encoder Representations from Transformers (BERT) by 31.0%. This highlights the efficacy of combining dynamic web-scraping with Natural Language Inference to find support for a claimed headline in the corresponding externally retrieved knowledge for the task of fake news detection.
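The verification idea in the abstract (look for support for a headline in externally retrieved text) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the threshold, and the evidence strings are all hypothetical, and `toy_entailment_score` is a crude token-overlap stand-in for a real NLI model, used only so the example runs without any model or network access.

```python
from typing import Callable, Iterable


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Placeholder for an NLI model (assumption, not the paper's scorer).

    Returns the fraction of hypothesis tokens that also occur in the premise.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(1 for token in hypothesis_tokens if token in premise_tokens)
    return hits / len(hypothesis_tokens)


def verify_headline(
    headline: str,
    evidence: Iterable[str],
    score_fn: Callable[[str, str], float] = toy_entailment_score,
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> bool:
    """Treat the headline as supported if any evidence snippet scores high enough."""
    best = max((score_fn(snippet, headline) for snippet in evidence), default=0.0)
    return best >= threshold


# Evidence would come from a web-scraping step; here it is hard-coded.
scraped = ["the city council approved the new budget on monday"]
print(verify_headline("city council approved the new budget", scraped))
print(verify_headline("city council rejected the new budget", scraped))
```

In a real system, `score_fn` would be replaced by a trained NLI model's entailment probability, since token overlap cannot distinguish paraphrase from contradiction in general.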
Related papers
- A Regularized LSTM Method for Detecting Fake News Articles [0.0]
This paper develops an advanced machine learning solution for detecting fake news articles.
We leverage a comprehensive dataset of news articles, including 23,502 fake news articles and 21,417 accurate news articles.
Our work highlights the potential for deploying such models in real-world applications.
arXiv Detail & Related papers (2024-11-16T05:54:36Z)
- Ethio-Fake: Cutting-Edge Approaches to Combat Fake News in Under-Resourced Languages Using Explainable AI [44.21078435758592]
Misinformation can spread quickly due to the ease of creating and disseminating content.
Traditional approaches to fake news detection often rely solely on content-based features.
We propose a comprehensive approach that integrates social context-based features with news content features.
arXiv Detail & Related papers (2024-10-03T15:49:35Z)
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-paraphrased real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- FNDaaS: Content-agnostic Detection of Fake News sites [3.936965297430477]
We propose FNDaaS, the first automatic, content-agnostic fake news detection method.
It considers new and unstudied features such as network and structural characteristics per news website.
It can achieve an AUC score of up to 0.967 on past sites, and up to 77-92% accuracy on newly-flagged ones.
arXiv Detail & Related papers (2022-12-13T11:17:32Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis of the usage of cross-lingual evidence as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking [0.3441021278275805]
We develop a two stage automated pipeline for COVID-19 fake news detection using state of the art machine learning models for natural language processing.
The first model leverages a novel fact checking algorithm that retrieves the most relevant facts concerning user claims about particular COVID-19 claims.
The second model verifies the level of truth in the claim by computing the textual entailment between the claim and the true facts retrieved from a manually curated COVID-19 dataset.
arXiv Detail & Related papers (2020-11-26T11:50:45Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable facing the threats of textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental model, trust, and performance measures in the process of explaining.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)
- A Deep Learning Approach for Automatic Detection of Fake News [47.00462375817434]
We propose two deep learning models for solving the fake news detection problem in online news content from multiple domains.
We evaluate our techniques on the two recently released datasets, namely FakeNews AMT and Celebrity for fake news detection.
arXiv Detail & Related papers (2020-05-11T09:07:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.