Mitigation of Diachronic Bias in Fake News Detection Dataset
- URL: http://arxiv.org/abs/2108.12601v1
- Date: Sat, 28 Aug 2021 08:25:29 GMT
- Title: Mitigation of Diachronic Bias in Fake News Detection Dataset
- Authors: Taichi Murayama and Shoko Wakamiya and Eiji Aramaki
- Abstract summary: Most of the fake news datasets depend on a specific time period.
The detection models trained on such a dataset have difficulty detecting novel fake news generated by political changes and social changes.
We propose masking methods using Wikidata to mitigate the influence of person names and validate whether they make fake news detection models robust.
- Score: 3.2800968305157205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fake news causes significant damage to society. To deal with fake news,
several studies on building detection models and arranging datasets have been
conducted. Most of the fake news datasets depend on a specific time period.
Consequently, the detection models trained on such a dataset have difficulty
detecting novel fake news generated by political and social changes;
they may produce biased output for inputs that include specific
person names and organizational names. We refer to this problem as
Diachronic Bias because it is caused by the creation date of news in
each dataset. In this study, we confirm the bias, especially for proper nouns
including person names, from the deviation of phrase appearances in each
dataset. Based on these findings, we propose masking methods using Wikidata to
mitigate the influence of person names and validate whether they make fake news
detection models robust through experiments with in-domain and out-of-domain
data.
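The masking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon below is a hypothetical, hand-written stand-in for entries that would be retrieved from Wikidata, and the function names and the `[PERSON]` placeholder token are assumptions made for illustration only.

```python
import re

# Hypothetical stand-in for names obtained from Wikidata (assumption:
# the real method would look these up rather than hard-code them).
PERSON_LEXICON = {
    "Donald Trump",
    "Hillary Clinton",
    "Barack Obama",
}


def build_pattern(names):
    """Compile one regex matching any listed name as a whole phrase."""
    # Longest names first so a longer name wins over a shorter prefix.
    ordered = sorted(names, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(n) for n in ordered) + r")\b")


def mask_person_names(text, names=PERSON_LEXICON, token="[PERSON]"):
    """Replace every listed person name in text with a placeholder token."""
    if not names:
        return text
    return build_pattern(names).sub(token, text)


masked = mask_person_names("Donald Trump met Barack Obama.")
print(masked)
```

A detector trained on text masked this way cannot key on the specific names that dominate one time period, which is the bias the abstract describes. Whether this improves robustness on out-of-domain data is what the paper's experiments evaluate.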
Related papers
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- Unsupervised Domain-agnostic Fake News Detection using Multi-modal Weak Signals [19.22829945777267]
This work proposes an effective framework for unsupervised fake news detection, which first embeds the knowledge available in four modalities in news records.
Also, we propose a novel technique to construct news datasets minimizing the latent biases in existing news datasets.
We trained the proposed unsupervised framework using LUND-COVID to exploit the potential of large datasets.
arXiv Detail & Related papers (2023-05-18T23:49:31Z)
- Nothing Stands Alone: Relational Fake News Detection with Hypergraph Neural Networks [49.29141811578359]
We propose to leverage a hypergraph to represent group-wise interaction among news, while focusing on important news relations with its dual-level attention mechanism.
Our approach yields remarkable performance and maintains the high performance even with a small subset of labeled news data.
arXiv Detail & Related papers (2022-12-24T00:19:32Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis of the usage of cross-lingual evidence as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Generalizing to the Future: Mitigating Entity Bias in Fake News Detection [30.493485490419403]
We propose an entity debiasing framework (ENDEF) that generalizes fake news detection models to future data.
Based on the causal graph among entities, news contents, and news veracity, we separately model the contribution of each cause.
In the inference stage, we remove the direct effect of the entities to mitigate entity bias.
arXiv Detail & Related papers (2022-04-20T14:32:34Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- User Preference-aware Fake News Detection [61.86175081368782]
Existing fake news detection algorithms focus on mining news content for deceptive signals.
We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling.
arXiv Detail & Related papers (2021-04-25T21:19:24Z)
- Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap.
We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z)
- Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data [18.66426327152407]
We propose a novel framework that jointly preserves domain-specific and cross-domain knowledge in news records to detect fake news from different domains.
Our experiments show that the integration of the proposed fake news model and the selective annotation approach achieves state-of-the-art performance for cross-domain news datasets.
arXiv Detail & Related papers (2021-02-11T23:31:14Z)
- Connecting the Dots Between Fact Verification and Fake News Detection [21.564628184287173]
We propose a simple yet effective approach to connect the dots between fact verification and fake news detection.
Our approach makes use of the recent success of fact verification models and enables zero-shot fake news detection.
arXiv Detail & Related papers (2020-10-11T09:28:52Z)