Towards Robust Classification Model by Counterfactual and Invariant Data
Generation
- URL: http://arxiv.org/abs/2106.01127v2
- Date: Thu, 3 Jun 2021 06:14:35 GMT
- Title: Towards Robust Classification Model by Counterfactual and Invariant Data
Generation
- Authors: Chun-Hao Chang, George Alexandru Adam, Anna Goldenberg
- Abstract summary: Spuriousness occurs when some features correlate with labels but are not causal.
We propose two data generation processes to reduce spuriousness.
Our data generations outperform state-of-the-art methods in accuracy when spurious correlations break.
- Score: 7.488317734152585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of machine learning applications in science, industry,
and society in general, many approaches are known to be non-robust, often
relying on spurious correlations to make predictions. Spuriousness occurs when
some features correlate with labels but are not causal; relying on such
features prevents models from generalizing to unseen environments where such
correlations break. In this work, we focus on image classification and propose
two data generation processes to reduce spuriousness. Given human annotations
of the subset of the features responsible (causal) for the labels (e.g.
bounding boxes), we modify this causal set to generate a surrogate image that
no longer has the same label (i.e. a counterfactual image). We also alter
non-causal features to generate images still recognized as the original labels,
which helps to learn a model invariant to these features. In several
challenging datasets, our data generations outperform state-of-the-art methods
in accuracy when spurious correlations break, and increase the saliency focus
on causal features providing better explanations.
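The two data generation processes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes images are NumPy arrays and the human-annotated causal features come as a single bounding box, and both function names are illustrative. Masking out the causal region yields a counterfactual example (label removed), while randomizing everything outside it yields an invariant example (label preserved):

```python
import numpy as np

def counterfactual_image(image, causal_box, fill_value=0):
    """Mask out the causal region so the image no longer supports
    its original label (a counterfactual example)."""
    x0, y0, x1, y1 = causal_box
    cf = image.copy()
    cf[y0:y1, x0:x1] = fill_value  # grey-out / remove the causal object
    return cf

def invariant_image(image, causal_box, rng=None):
    """Randomize the non-causal background while keeping the causal
    region intact, so the label is preserved (an invariant example)."""
    rng = np.random.default_rng(rng)
    x0, y0, x1, y1 = causal_box
    inv = rng.integers(0, 256, size=image.shape, dtype=image.dtype)
    inv[y0:y1, x0:x1] = image[y0:y1, x0:x1]  # keep the causal object
    return inv

# Toy example: a bright "object" inside the annotated causal box.
img = np.zeros((32, 32, 3), dtype=np.uint8)
img[8:24, 8:24] = 255
box = (8, 8, 24, 24)

cf = counterfactual_image(img, box)   # train as "not the original label"
inv = invariant_image(img, box, rng=0)  # train as the original label
```

In training, each labelled pair (x, y) would be augmented with (counterfactual, not-y) and (invariant, y) pairs, pushing the classifier to rely on the annotated causal features rather than background correlations.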
Related papers
- Towards Robust Text Classification: Mitigating Spurious Correlations with Causal Learning [2.7813683000222653]
We propose the Causally Calibrated Robust Classifier (CCR) to reduce models' reliance on spurious correlations.
CCR integrates a causal feature selection method based on counterfactual reasoning, along with an inverse propensity weighting (IPW) loss function.
We show that CCR achieves state-of-the-art performance among methods without group labels, and in some cases it can compete with models that utilize group labels.
arXiv Detail & Related papers (2024-11-01T21:29:07Z)
- Counterfactual Image Editing [54.21104691749547]
Counterfactual image editing is an important task in generative AI, which asks how an image would look if certain features were different.
We formalize the counterfactual image editing task using formal language, modeling the causal relationships between latent generative factors and images.
We develop an efficient algorithm to generate counterfactual images by leveraging neural causal models.
arXiv Detail & Related papers (2024-02-07T20:55:39Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Posterior Collapse and Latent Variable Non-identifiability [54.842098835445]
We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility.
Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
arXiv Detail & Related papers (2023-01-02T06:16:56Z)
- PatchMix Augmentation to Identify Causal Features in Few-shot Learning [55.64873998196191]
Few-shot learning aims to transfer knowledge learned from base categories with sufficient labelled data to novel categories with scarce labelled data.
We propose a novel data augmentation strategy dubbed PatchMix that can break this spurious dependency.
We show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.
arXiv Detail & Related papers (2022-11-29T08:41:29Z)
- Counterfactual Generation Under Confounding [24.503075567519048]
A machine learning model, under the influence of observed or unobserved confounders in the training data, can learn spurious correlations.
We propose a counterfactual generation method that learns to modify the value of any attribute in an image and generate new images given a set of observed attributes.
Our method is computationally efficient, simple to implement, and works well for any number of generative factors and confounding variables.
arXiv Detail & Related papers (2022-10-22T06:39:22Z)
- Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation [32.66196135141696]
Features with varying relationships to the label are nuisances.
Models that exploit nuisance-label relationships face performance degradation when these relationships change.
We develop an approach that uses knowledge of which features are semantic by corrupting them in the data.
arXiv Detail & Related papers (2022-10-04T01:40:31Z)
- Preserving Fine-Grain Feature Information in Classification via Entropic Regularization [10.358087436626391]
We show that standard cross-entropy can lead to overfitting to coarse-related features.
We introduce an entropy-based regularization to promote more diversity in the feature space of trained models.
arXiv Detail & Related papers (2022-08-07T09:25:57Z)
- Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z)
- Active Learning by Feature Mixing [52.16150629234465]
We propose a novel method for batch active learning called ALFA-Mix.
We identify unlabelled instances with sufficiently-distinct features by seeking inconsistencies in predictions.
We show that inconsistencies in these predictions help discover features that the model is unable to recognise in the unlabelled instances.
arXiv Detail & Related papers (2022-03-14T12:20:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.