Hidden Biases in Unreliable News Detection Datasets
- URL: http://arxiv.org/abs/2104.10130v1
- Date: Tue, 20 Apr 2021 17:16:41 GMT
- Title: Hidden Biases in Unreliable News Detection Datasets
- Authors: Xiang Zhou, Heba Elfardy, Christos Christodoulopoulos, Thomas Butler, Mohit Bansal
- Abstract summary: We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observe a significant drop (>10%) in accuracy for all tested models on a clean split with no train/test source overlap.
We suggest that future dataset creation include a simple model as a difficulty/bias probe, and that future model development use a clean, non-overlapping site and date split.
- Score: 60.71991809782698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic unreliable news detection is a research problem with great
potential impact. Recently, several papers have shown promising results on
large-scale news datasets with models that only use the article itself without
resorting to any fact-checking mechanism or retrieving any supporting evidence.
In this work, we take a closer look at these datasets. While they all provide
valuable resources for future research, we observe a number of problems that
may lead to results that do not generalize in more realistic settings.
Specifically, we show that selection bias during data collection leads to
undesired artifacts in the datasets. In addition, while most systems train and
predict at the level of individual articles, overlapping article sources in the
training and evaluation data can provide a strong confounding factor that
models can exploit. In the presence of this confounding factor, the models can
achieve good performance by directly memorizing the site-label mapping instead
of modeling the real task of unreliable news detection. We observe a
significant drop (>10%) in accuracy for all tested models on a clean split
with no train/test source overlap. Using these observations and experimental results,
we provide practical suggestions on how to create more reliable datasets for
the unreliable news detection task. We suggest that future dataset creation
include a simple model as a difficulty/bias probe, and that future model
development use a clean, non-overlapping site and date split.
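The two suggestions are easy to make concrete. Below is a minimal sketch (not the authors' released code) of both: a clean split that holds out whole sites and enforces a date cutoff, and a deliberately simple bag-of-words probe whose accuracy flags exploitable artifacts. The DataFrame columns `site`, `date`, `text`, and `label` are hypothetical placeholders.

```python
# Minimal sketch of a clean site + date split and a simple bias probe.
# Column names (site, date, text, label) are hypothetical; dates are
# assumed to be ISO strings or datetimes so that "<" compares correctly.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def clean_split(df: pd.DataFrame, cutoff: str, test_frac: float = 0.2):
    """Hold out whole sites (no train/test source overlap) and require
    that test articles postdate all training articles."""
    sites = df["site"].drop_duplicates().sample(frac=1.0, random_state=0)
    n_test = max(1, int(len(sites) * test_frac))
    test_sites = set(sites.iloc[:n_test])
    train = df[~df["site"].isin(test_sites) & (df["date"] < cutoff)]
    test = df[df["site"].isin(test_sites) & (df["date"] >= cutoff)]
    return train, test

def probe_accuracy(train: pd.DataFrame, test: pd.DataFrame) -> float:
    """A deliberately simple bag-of-words classifier: if even this model
    scores well, the split likely contains exploitable artifacts."""
    vec = CountVectorizer(max_features=20000)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(train["text"]), train["label"])
    return accuracy_score(test["label"], clf.predict(vec.transform(test["text"])))
```

Comparing the probe's accuracy on a random split versus this clean split gives a quick read on how much of the apparent performance comes from memorizing the site-label mapping rather than from the task itself.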
Related papers
- Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning [0.2812395851874055]
This paper proposes a comprehensive approach using multiple methods to remove bias in AI models.
We train multiple models with the counter-bias of the pre-trained model through data splitting, local training, and regularized fine-tuning.
We conclude our solution with knowledge distillation that results in a single unbiased neural network.
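As a rough illustration of the distillation step summarized above, the following sketch averages the ensemble's predictions into soft targets for a single student network. `teachers`, `student`, and `loader` are assumed PyTorch placeholders; this is not the paper's implementation.

```python
# Hedged sketch of ensemble -> single-model knowledge distillation.
# `teachers`, `student`, `loader`, and `optimizer` are hypothetical.
import torch
import torch.nn.functional as F

def distill_step(teachers, student, loader, optimizer, T: float = 2.0):
    student.train()
    for x, _ in loader:
        with torch.no_grad():
            # Soft targets: average of the (counter-biased) teacher models.
            probs = torch.stack(
                [F.softmax(t(x) / T, dim=-1) for t in teachers]).mean(0)
        loss = F.kl_div(F.log_softmax(student(x) / T, dim=-1), probs,
                        reduction="batchmean") * T * T
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```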
arXiv Detail & Related papers (2024-02-01T09:24:36Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
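For intuition only, here is a toy, single-feature version of such reweighting: weights are chosen so that one binary lexical feature becomes statistically independent of the label. The paper's optimization handles thousands of correlations jointly, which this sketch does not attempt.

```python
# Hedged sketch: reweight examples so one binary lexical feature carries
# no information about a binary label (inverse-propensity style).
import numpy as np

def decorrelate_weights(feature: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """feature, labels: binary arrays of shape (n,). Returns per-example
    weights under which feature and label are independent."""
    w = np.ones(len(labels), dtype=float)
    for f in (0, 1):
        mask = feature == f
        for y in (0, 1):
            cell = mask & (labels == y)
            # Scale each (feature, label) cell to its independence expectation.
            expected = mask.mean() * (labels == y).mean()
            observed = cell.mean()
            if observed > 0:
                w[cell] = expected / observed
    return w
```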
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases [5.997909991352044]
Recent work sought to develop robust, unbiased models by filtering biased examples from training sets.
We argue that such filtering can obscure the true capabilities of models to overcome biases.
We introduce an evaluation framework defined by a bias-amplified training set and an anti-biased test set.
arXiv Detail & Related papers (2023-05-30T10:10:42Z)
- Localized Shortcut Removal [4.511561231517167]
High performance on held-out test data does not necessarily indicate that a model generalizes or learns anything meaningful.
This is often due to the existence of machine learning shortcuts - features in the data that are predictive but unrelated to the problem at hand.
We use an adversarially trained lens to detect and eliminate highly predictive but semantically unconnected clues in images.
arXiv Detail & Related papers (2022-11-24T13:05:33Z)
- Addressing Bias in Face Detectors using Decentralised Data collection with incentives [0.0]
We show how this data-centric approach can be facilitated in a decentralized manner to enable efficient data collection for algorithms.
We propose a face detection and anonymization approach using a hybrid MultiTask Cascaded CNN with FaceNet Embeddings.
arXiv Detail & Related papers (2022-10-28T09:54:40Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, used for QA datasets like QAMR or SQuAD2.0, is effective at differentiating between strong and weak models.
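To make the IRT idea concrete, here is a hedged sketch of fitting a one-parameter (Rasch) model to a binary response matrix of models by test examples with plain gradient ascent. It illustrates the general approach, not the paper's exact estimator.

```python
# Hedged sketch: 1-parameter IRT (Rasch) fit on a binary response matrix
# R of shape (n_models, n_examples); R[i, j] = 1 if model i got item j right.
import numpy as np

def fit_rasch(R: np.ndarray, lr: float = 0.1, steps: int = 2000):
    n_models, n_items = R.shape
    ability = np.zeros(n_models)    # higher = stronger model
    difficulty = np.zeros(n_items)  # higher = harder example
    for _ in range(steps):
        logits = ability[:, None] - difficulty[None, :]
        p = 1.0 / (1.0 + np.exp(-logits))
        grad = R - p                     # gradient of the Bernoulli log-likelihood
        ability += lr * grad.mean(axis=1)
        difficulty -= lr * grad.mean(axis=0)
        difficulty -= difficulty.mean()  # pin down the scale's location
    return ability, difficulty
```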
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
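The core statistics behind Data Maps are compact enough to sketch. Assuming a hypothetical array holding the gold label's probability for each training example at each epoch, confidence and variability are just the mean and standard deviation over epochs:

```python
# Hedged sketch of the Data Maps statistics. `probs_per_epoch` is a
# hypothetical array of shape (n_epochs, n_examples) holding the gold
# label's probability for each example at the end of each epoch.
import numpy as np

def data_map_stats(probs_per_epoch: np.ndarray):
    confidence = probs_per_epoch.mean(axis=0)
    variability = probs_per_epoch.std(axis=0)
    # Low confidence + low variability tends to flag mislabeled examples;
    # high variability marks "ambiguous" examples that the paper argues
    # matter most for out-of-distribution robustness.
    return confidence, variability
```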
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
- Debiasing Skin Lesion Datasets and Models? Not So Fast [17.668005682385175]
Models learned from data risk learning biases from that same data.
When models learn spurious correlations not found in real-world situations, their deployment for critical tasks, such as medical decisions, can be catastrophic.
We find that, despite interesting results that point to promising future research, current debiasing methods are not ready to solve the bias issue for skin-lesion models.
arXiv Detail & Related papers (2020-04-23T21:07:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.