Identifying Spurious Correlations for Robust Text Classification
- URL: http://arxiv.org/abs/2010.02458v1
- Date: Tue, 6 Oct 2020 03:49:22 GMT
- Title: Identifying Spurious Correlations for Robust Text Classification
- Authors: Zhao Wang and Aron Culotta
- Abstract summary: We propose a method to distinguish spurious and genuine correlations in text classification.
We use features derived from treatment effect estimators to distinguish spurious correlations from "genuine" ones.
Experiments on four datasets suggest that using this approach to inform feature selection also leads to more robust classification.
- Score: 9.457737910527829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The predictions of text classifiers are often driven by spurious correlations
-- e.g., the term 'Spielberg' correlates with positively reviewed movies, even
though the term itself does not semantically convey a positive sentiment. In
this paper, we propose a method to distinguish spurious and genuine
correlations in text classification. We treat this as a supervised
classification problem, using features derived from treatment effect estimators
to distinguish spurious correlations from "genuine" ones. Due to the generic
nature of these features and their small dimensionality, we find that the
approach works well even with limited training examples, and that it is
possible to transport the word classifier to new domains. Experiments on four
datasets (sentiment classification and toxicity detection) suggest that using
this approach to inform feature selection also leads to more robust
classification, as measured by improved worst-case accuracy on the samples
affected by spurious correlations.
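As a rough illustration of the two quantities the abstract refers to, the sketch below computes (1) a naive per-term "effect" as the difference in positive-label rate between documents with and without a term, and (2) worst-case accuracy as the minimum accuracy over groups of samples. This is not the paper's estimator: the abstract does not specify which treatment effect estimators are used, and a raw difference in means measures association only, with no adjustment for context. The function names, the toy documents, and the group labels are all hypothetical.

```python
from collections import defaultdict


def naive_term_effect(docs, labels, term):
    """Difference in positive-label rate between documents that contain
    `term` and documents that do not. Assumption: a crude stand-in for a
    treatment effect estimate, with no matching or confounder adjustment."""
    with_term, without_term = [], []
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        (with_term if term in tokens else without_term).append(label)
    if not with_term or not without_term:
        return 0.0  # effect is undefined when one side is empty; report 0
    return sum(with_term) / len(with_term) - sum(without_term) / len(without_term)


def worst_group_accuracy(y_true, y_pred, groups):
    """Minimum accuracy over groups (e.g. samples where a spurious term
    agrees or disagrees with the label)."""
    correct, total = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return min(correct[g] / total[g] for g in total)


# Hypothetical toy corpus: 1 = positive review, 0 = negative review.
docs = [
    "spielberg made a great film",
    "spielberg delivers a wonderful story",
    "a great and wonderful cast",
    "a boring and terrible plot",
    "spielberg made a terrible film",
    "boring film",
]
labels = [1, 1, 1, 0, 0, 0]

spielberg_effect = naive_term_effect(docs, labels, "spielberg")
great_effect = naive_term_effect(docs, labels, "great")

# Hypothetical predictions split into two groups of samples.
wga = worst_group_accuracy([1, 1, 0, 0], [1, 1, 1, 0], ["a", "a", "b", "b"])
```

On this toy corpus both terms show a positive association with the label, which is why association alone cannot separate a spurious term from a genuine one; the paper instead feeds treatment-effect-derived features to a supervised word classifier.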
Related papers
- Estimating the Influence of Sequentially Correlated Literary Properties in Textual Classification: A Data-Centric Hypothesis-Testing Approach [4.161155428666988]
Stylometry aims to distinguish authors by analyzing literary traits assumed to reflect semi-conscious choices distinct from elements like genre or theme.
While some literary properties, such as thematic content, are likely to manifest as correlations between adjacent text units, others, like authorial style, may be independent thereof.
We introduce a hypothesis-testing approach to evaluate the influence of sequentially correlated literary properties on text classification.
arXiv Detail & Related papers (2024-11-07T18:28:40Z)
- Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation [26.544938760265136]
Deep neural classifiers rely on spurious correlations between spurious attributes of inputs and targets to make predictions.
We propose a self-guided spurious correlation mitigation framework.
We show that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori.
arXiv Detail & Related papers (2024-05-06T17:12:21Z)
- Identifying Spurious Correlations using Counterfactual Alignment [5.782952470371709]
Models driven by spurious correlations often yield poor generalization performance.
We propose the counterfactual (CF) alignment method to detect and quantify spurious correlations.
arXiv Detail & Related papers (2023-12-01T20:16:02Z)
- Causal Effect Regularization: Automated Detection and Removal of Spurious Attributes [13.852987916253685]
In many classification datasets, the task labels are spuriously correlated with some input attributes.
We propose a method to automatically identify spurious attributes by estimating their causal effect on the label.
Our method mitigates the reliance on spurious attributes even under noisy estimation of causal effects.
arXiv Detail & Related papers (2023-06-19T17:17:42Z)
- Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language), to mitigate spurious correlations in text classification.
arXiv Detail & Related papers (2023-05-23T03:55:50Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
- Tweet Sentiment Quantification: An Experimental Re-Evaluation [88.60021378715636]
Sentiment quantification is the task of training, by means of supervised learning, estimators of the relative frequency (also called "prevalence") of sentiment-related classes.
We re-evaluate those quantification methods following a now consolidated and much more robust experimental protocol.
Results are dramatically different from those obtained by Gao and Sebastiani, and they provide a different, much more solid understanding of the relative strengths and weaknesses of different sentiment quantification methods.
arXiv Detail & Related papers (2020-11-04T21:41:34Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- Learning from Aggregate Observations [82.44304647051243]
We study the problem of learning from aggregate observations where supervision signals are given to sets of instances.
We present a general probabilistic framework that accommodates a variety of aggregate observations.
Simple maximum likelihood solutions can be applied to various differentiable models.
arXiv Detail & Related papers (2020-04-14T06:18:50Z)