Generalizing to the Future: Mitigating Entity Bias in Fake News
Detection
- URL: http://arxiv.org/abs/2204.09484v1
- Date: Wed, 20 Apr 2022 14:32:34 GMT
- Title: Generalizing to the Future: Mitigating Entity Bias in Fake News
Detection
- Authors: Yongchun Zhu, Qiang Sheng, Juan Cao, Shuokai Li, Danding Wang, Fuzhen
Zhuang
- Abstract summary: We propose an entity debiasing framework (ENDEF) which generalizes fake news detection models to future data.
Based on the causal graph among entities, news contents, and news veracity, we separately model the contribution of each cause.
In the inference stage, we remove the direct effect of the entities to mitigate entity bias.
- Score: 30.493485490419403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The wide dissemination of fake news is increasingly threatening both
individuals and society. Fake news detection aims to train a model on past
news and detect fake news in the future. Though great efforts have been made,
existing fake news detection methods overlook the unintended entity bias in
real-world data, which seriously harms models' ability to generalize
to future data. For example, 97% of news pieces in 2010-2017 containing the
entity 'Donald Trump' are real in our data, but the percentage falls to
merely 33% in 2018. A model trained on the former set would therefore
hardly generalize to the latter, as it tends to predict news pieces about
'Donald Trump' as real to lower its training loss. In this paper, we propose an
entity debiasing framework (ENDEF) which generalizes fake news
detection models to future data by mitigating entity bias from a
cause-effect perspective. Based on the causal graph among entities, news
contents, and news veracity, we separately model the contribution of each cause
(entities and contents) during training. In the inference stage, we remove the
direct effect of the entities to mitigate entity bias. Extensive offline
experiments on English and Chinese datasets demonstrate that the proposed
framework can substantially improve the performance of base fake news detectors, and
online tests verify its superiority in practice. To the best of our knowledge,
this is the first work to explicitly improve the generalization ability of fake
news detection models to future data. The code has been released at
https://github.com/ICTMCG/ENDEF-SIGIR2022.
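
The train/inference split described in the abstract can be illustrated with a minimal sketch. This is not the released ENDEF code: the weighted fusion of the two branches, the weights `alpha` and `beta`, and all function names below are assumptions made for illustration. The only parts taken from the abstract are that entities and contents are modeled separately during training and that the direct entity effect is removed at inference.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(label: int, prob: float) -> float:
    """Binary cross-entropy for a single example (label 1 = fake)."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1.0 - prob + eps))


def training_score(content_logit: float, entity_logit: float, alpha: float = 0.8) -> float:
    """Training-time prediction: content branch plus entity branch.

    The convex combination with weight `alpha` is an assumed fusion rule.
    """
    return alpha * sigmoid(content_logit) + (1.0 - alpha) * sigmoid(entity_logit)


def training_loss(label: int, content_logit: float, entity_logit: float,
                  alpha: float = 0.8, beta: float = 0.2) -> float:
    """Loss on the fused prediction plus an auxiliary loss on the entity branch.

    The auxiliary term (weight `beta`, assumed) encourages the entity branch
    to absorb entity-specific shortcuts instead of the content branch.
    """
    fused = training_score(content_logit, entity_logit, alpha)
    return bce(label, fused) + beta * bce(label, sigmoid(entity_logit))


def inference_score(content_logit: float) -> float:
    """Inference-time prediction: the entity branch is dropped entirely."""
    return sigmoid(content_logit)


# Hypothetical example: the content looks somewhat fake (positive logit), but
# the mentioned entity was almost always "real" in the training period.
content_logit, entity_logit = 1.0, -4.0
print("training-time score:", round(training_score(content_logit, entity_logit), 3))
print("inference score:   ", round(inference_score(content_logit), 3))
```

In this toy case the entity branch pulls the training-time score toward "real", while the inference score depends only on the content branch, so a shift in an entity's real/fake ratio over time no longer enters the prediction directly.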
Related papers
- FakeWatch: A Framework for Detecting Fake News to Ensure Credible Elections [5.15641542196944]
We introduce FakeWatch, a comprehensive framework carefully designed to detect fake news.
Our framework integrates a model hub comprising both traditional machine learning (ML) techniques and state-of-the-art Language Models (LMs).
Our objective is to provide the research community with adaptable and precise classification models adept at identifying fake news for the elections agenda.
arXiv Detail & Related papers (2024-03-14T20:39:26Z)
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- Nothing Stands Alone: Relational Fake News Detection with Hypergraph Neural Networks [49.29141811578359]
We propose to leverage a hypergraph to represent group-wise interaction among news, while focusing on important news relations with its dual-level attention mechanism.
Our approach yields remarkable performance and maintains the high performance even with a small subset of labeled news data.
arXiv Detail & Related papers (2022-12-24T00:19:32Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- User Preference-aware Fake News Detection [61.86175081368782]
Existing fake news detection algorithms focus on mining news content for deceptive signals.
We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling.
arXiv Detail & Related papers (2021-04-25T21:19:24Z)
- Evaluating Deep Learning Approaches for Covid19 Fake News Detection [0.0]
We look at automated techniques for fake news detection from a data mining perspective.
We evaluate different supervised text classification algorithms on the Constraint@AAAI 2021 Covid-19 Fake news detection dataset.
We report the best accuracy of 98.41% on the Covid-19 Fake news detection dataset.
arXiv Detail & Related papers (2021-01-11T16:39:03Z)
- Causal Understanding of Fake News Dissemination on Social Media [50.4854427067898]
We argue that it is critical to understand what user attributes potentially cause users to share fake news.
In fake news dissemination, confounders can be characterized by fake news sharing behavior that inherently relates to user attributes and online activities.
We propose a principled approach to alleviating selection bias in fake news dissemination.
arXiv Detail & Related papers (2020-10-20T19:37:04Z)
- Connecting the Dots Between Fact Verification and Fake News Detection [21.564628184287173]
We propose a simple yet effective approach to connect the dots between fact verification and fake news detection.
Our approach makes use of the recent success of fact verification models and enables zero-shot fake news detection.
arXiv Detail & Related papers (2020-10-11T09:28:52Z)
- Viable Threat on News Reading: Generating Biased News Using Natural Language Models [49.90665530780664]
We show that publicly available language models can reliably generate biased news content based on an original news article given as input.
We also show that a large number of high-quality biased news articles can be generated using controllable text generation.
arXiv Detail & Related papers (2020-10-05T16:55:39Z)
- Weak Supervision for Fake News Detection via Reinforcement Learning [34.448503443582396]
We propose a weakly-supervised fake news detection framework, i.e., WeFEND.
The proposed framework consists of three main components: the annotator, the reinforced selector and the fake news detector.
We tested the proposed framework on a large collection of news articles published via WeChat official accounts and associated user reports.
arXiv Detail & Related papers (2019-12-28T21:20:25Z)