Exploring Generalizability of Fine-Tuned Models for Fake News Detection
- URL: http://arxiv.org/abs/2205.07154v1
- Date: Sun, 15 May 2022 00:00:49 GMT
- Title: Exploring Generalizability of Fine-Tuned Models for Fake News Detection
- Authors: Abhijit Suprem, Calton Pu
- Abstract summary: The Covid-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, denoted an 'infodemic' by the CDC and WHO.
Misinformation tied to the Covid-19 infodemic changes continuously; this can lead to performance degradation of fine-tuned models due to concept drift.
In this paper, we explore generalizability of pre-trained and fine-tuned fake news detectors across 9 fake news datasets.
- Score: 3.210653757360955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Covid-19 pandemic has caused a dramatic and parallel rise in dangerous
misinformation, denoted an 'infodemic' by the CDC and WHO. Misinformation tied
to the Covid-19 infodemic changes continuously; this can lead to performance
degradation of fine-tuned models due to concept drift. Degradation can be
mitigated if models generalize well enough to capture some cyclical aspects of
drifted data. In this paper, we explore generalizability of pre-trained and
fine-tuned fake news detectors across 9 fake news datasets. We show that
existing models often overfit on their training dataset and have poor
performance on unseen data. However, on some subsets of unseen data that
overlap with training data, models have higher accuracy. Based on this
observation, we also present KMeans-Proxy, a fast and effective method based on
K-Means clustering for quickly identifying these overlapping subsets of unseen
data. KMeans-Proxy improves generalizability on unseen fake news datasets by
0.1-0.2 f1-points across datasets. We present both our generalizability
experiments and KMeans-Proxy to further research into tackling the fake
news problem.
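The abstract names K-Means clustering as the core of KMeans-Proxy but gives no implementation details here, so the following is a minimal sketch of the general idea only, assuming pre-computed text embeddings; the cluster count, quantile threshold, and function names are illustrative assumptions, not the authors' exact method.
```python
# Minimal sketch of a KMeans-Proxy-style overlap check. Assumptions (not
# from the paper): embeddings are pre-computed, and the cluster count and
# quantile threshold are illustrative defaults.
import numpy as np
from sklearn.cluster import KMeans

def fit_proxy(train_emb: np.ndarray, n_clusters: int = 50, quantile: float = 0.95):
    """Cluster training embeddings and record a typical nearest-centroid distance."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(train_emb)
    train_dist = km.transform(train_emb).min(axis=1)  # distance to nearest centroid
    return km, float(np.quantile(train_dist, quantile))

def overlap_mask(km: KMeans, threshold: float, unseen_emb: np.ndarray) -> np.ndarray:
    """Flag unseen examples that land close to some training cluster centroid."""
    return km.transform(unseen_emb).min(axis=1) <= threshold
```
Under this reading, examples flagged by the mask are the "overlapping subsets" of unseen data on which a fine-tuned detector can be expected to retain higher accuracy.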
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks have been shown to be vulnerable to data poisoning attacks.
Detecting poisoned samples in a mixed dataset is both valuable and challenging.
We propose an Iterative Filtering approach for identifying unlearnable examples (UEs).
arXiv Detail & Related papers (2024-08-15T13:26:13Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to perform well only on similar data, while underperforming on real-world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z) - Improving Generalization for Multimodal Fake News Detection [8.595270610973586]
State-of-the-art approaches are usually trained on datasets of smaller size or with a limited set of specific topics.
We propose three models that adopt and fine-tune state-of-the-art multimodal transformers for multimodal fake news detection.
arXiv Detail & Related papers (2023-05-29T20:32:22Z) - MRCLens: an MRC Dataset Bias Detection Toolkit [82.44296974850639]
We introduce MRCLens, a toolkit that detects whether biases exist before users train the full model.
To make the toolkit easier to adopt, we also provide a categorization of common biases in MRC.
arXiv Detail & Related papers (2022-07-18T21:05:39Z) - CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal
Relationships [8.679073301435265]
We construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data.
We use causal-relationship labels to perturb the data by deleting non-causal agents from the scene.
Under non-causal perturbations, we observe a 25-38% relative change in minADE as compared to the original.
arXiv Detail & Related papers (2022-07-07T21:28:23Z) - Testing the Generalization of Neural Language Models for COVID-19
Misinformation Detection [6.1204874238049705]
A drastic rise in potentially life-threatening misinformation has been a by-product of the COVID-19 pandemic.
We evaluate fifteen Transformer-based models on five COVID-19 misinformation datasets.
We show tokenizers and models tailored to COVID-19 data do not provide a significant advantage over general-purpose ones.
arXiv Detail & Related papers (2021-11-15T15:01:55Z) - Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap.
We suggest that future dataset creation include a simple model as a difficulty/bias probe, and that future model development use a clean, non-overlapping site and date split (a minimal sketch of such a split follows this entry).
arXiv Detail & Related papers (2021-04-20T17:16:41Z) - Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake
- Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection [7.29381091750894]
We propose a novel transformer-based language model fine-tuning approach for fake news detection.
First, the token vocabulary of each model is expanded to cover the actual semantics of professional phrases.
Last, the features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multilayer perceptron to integrate fine-grained and high-level specific representations (a fusion sketch follows this entry).
arXiv Detail & Related papers (2021-01-14T09:05:42Z) - Deep k-NN for Noisy Labels [55.97221021252733]
- Deep k-NN for Noisy Labels [55.97221021252733]
We show that a simple k-nearest neighbor-based filtering approach on the logit layer of a preliminary model can remove mislabeled data and produce more accurate models than many recently proposed methods (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-04-26T05:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.