Understanding and Mitigating Spurious Correlations in Text
Classification with Neighborhood Analysis
- URL: http://arxiv.org/abs/2305.13654v3
- Date: Sat, 3 Feb 2024 16:44:36 GMT
- Title: Understanding and Mitigating Spurious Correlations in Text
Classification with Neighborhood Analysis
- Authors: Oscar Chew, Hsuan-Tien Lin, Kai-Wei Chang, Kuan-Hao Huang
- Abstract summary: Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language), to mitigate spurious correlations in text classification.
- Score: 69.07674653828565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research has revealed that machine learning models have a tendency to
leverage spurious correlations that exist in the training set but may not hold
true in general circumstances. For instance, a sentiment classifier may
erroneously learn that the token "performances" is commonly associated with
positive movie reviews. Relying on these spurious correlations degrades the
classifier's performance when it is deployed on out-of-distribution data. In this
paper, we examine the implications of spurious correlations through a novel
perspective called neighborhood analysis. The analysis uncovers how spurious
correlations lead unrelated words to erroneously cluster together in the
embedding space. Driven by the analysis, we design a metric to detect spurious
tokens and also propose a family of regularization methods, NFL (doN't Forget
your Language), to mitigate spurious correlations in text classification.
Experiments show that NFL can effectively prevent erroneous clusters and
significantly improve the robustness of classifiers without auxiliary data. The
code is publicly available at
https://github.com/oscarchew/doNt-Forget-your-Language.
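As a rough illustration of the neighborhood-analysis idea (not the paper's metric or the code in the repository above), the sketch below lists a token's nearest neighbors in a classifier's embedding space; a spurious token such as "performances" would be expected to drift toward label-associated words after fine-tuning. The checkpoint name and probe token are illustrative assumptions.

```python
# Minimal neighborhood-analysis sketch (illustrative only; not the paper's
# metric or the official implementation linked above).
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: swap in a fine-tuned sentiment checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Static input-embedding matrix (vocab_size x hidden_dim), cosine-normalized.
emb = torch.nn.functional.normalize(model.get_input_embeddings().weight.detach(), dim=-1)

def nearest_neighbors(token: str, k: int = 10):
    """Return the k vocabulary tokens closest (by cosine similarity) to `token`."""
    ids = tokenizer(token, add_special_tokens=False)["input_ids"]
    idx = ids[0]  # fall back to the first sub-word piece if the token gets split
    sims = emb @ emb[idx]
    top = torch.topk(sims, k + 1).indices.tolist()  # k+1 so the token itself can be dropped
    return [tokenizer.convert_ids_to_tokens(i) for i in top if i != idx][:k]

# If fine-tuning has pulled an unrelated token like "performances" toward
# sentiment-bearing words, the shift shows up in its neighborhood.
print(nearest_neighbors("performances"))
```

Running the same probe on the pretrained and the fine-tuned checkpoint and comparing the two neighbor lists gives a crude picture of whether a token has been repurposed as a label shortcut, which is the kind of erroneous clustering the abstract describes.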
Related papers
- Spuriousness-Aware Meta-Learning for Learning Robust Classifiers [26.544938760265136]
Spurious correlations are brittle associations between certain attributes of inputs and target variables.
Deep image classifiers often leverage them for predictions, leading to poor generalization on the data where the correlations do not hold.
Mitigating the impact of spurious correlations is crucial for robust model generalization, but it often requires annotations of the spurious correlations in the data.
arXiv Detail & Related papers (2024-06-15T21:41:25Z)
- Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation [26.544938760265136]
Deep neural classifiers rely on spurious correlations between spurious attributes of inputs and targets to make predictions.
We propose a self-guided spurious correlation mitigation framework.
We show that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori.
arXiv Detail & Related papers (2024-05-06T17:12:21Z)
- Unsupervised Concept Discovery Mitigates Spurious Correlations [45.48778210340187]
Models prone to spurious correlations in training data often produce brittle predictions and introduce unintended biases.
In this paper, we establish a novel connection between unsupervised object-centric learning and mitigation of spurious correlations.
We introduce CoBalT: a concept balancing technique that effectively mitigates spurious correlations without requiring human labeling of subgroups.
arXiv Detail & Related papers (2024-02-20T20:48:00Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Identifying Spurious Correlations using Counterfactual Alignment [5.782952470371709]
Models driven by spurious correlations often yield poor generalization performance.
We propose the counterfactual (CF) alignment method to detect and quantify spurious correlations.
arXiv Detail & Related papers (2023-12-01T20:16:02Z)
- Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs is that retrieved information helps model performance when it is relevant.
Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
- Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction [86.15787587540132]
We introduce the sensitivity score, a metric that scrutinizes models' behaviors at the vocabulary level.
Our experiments compare the decision-making logic of clinicians and classifiers based on rank correlations of sensitivity scores (a minimal rank-correlation sketch appears after this list).
arXiv Detail & Related papers (2022-11-13T23:59:11Z)
- Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests [87.60900567941428]
A 'spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z)
- Identifying Spurious Correlations for Robust Text Classification [9.457737910527829]
We propose a method to distinguish spurious and genuine correlations in text classification.
We use features derived from treatment effect estimators to distinguish spurious correlations from "genuine" ones.
Experiments on four datasets suggest that using this approach to inform feature selection also leads to more robust classification.
arXiv Detail & Related papers (2020-10-06T03:49:22Z)
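As noted in the readmission-prediction entry above, comparing clinician and classifier word sensitivities reduces to a rank correlation between two score vectors. The sketch below is a generic illustration with made-up scores; the vocabulary and values are hypothetical and do not reflect that paper's sensitivity-score definition.

```python
# Illustrative rank-correlation check between two word-sensitivity rankings.
# The words and scores below are hypothetical placeholders for the example.
from scipy.stats import spearmanr

vocab = ["dyspnea", "insulin", "discharge", "the", "sepsis"]
clinician_scores = [0.9, 0.7, 0.4, 0.0, 0.8]   # assumed expert sensitivity ratings
classifier_scores = [0.8, 0.5, 0.6, 0.1, 0.7]  # assumed model sensitivity scores

rho, p_value = spearmanr(clinician_scores, classifier_scores)
print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")
```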
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided (including all content) and is not responsible for any consequences arising from its use.