Identifying Spurious Correlations and Correcting them with an
Explanation-based Learning
- URL: http://arxiv.org/abs/2211.08285v1
- Date: Tue, 15 Nov 2022 16:34:53 GMT
- Title: Identifying Spurious Correlations and Correcting them with an
Explanation-based Learning
- Authors: Misgina Tsighe Hagos, Kathleen M. Curran, Brian Mac Namee
- Abstract summary: We present a simple method to identify spurious correlations that have been learned by a model trained for image classification problems.
We apply image-level perturbations and monitor changes in certainties of predictions made using the trained model.
- Score: 4.039245878626345
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Identifying spurious correlations learned by a trained model is at the core
of refining a trained model and building a trustworthy model. We present a
simple method to identify spurious correlations that have been learned by a
model trained for image classification problems. We apply image-level
perturbations and monitor changes in certainties of predictions made using the
trained model. We demonstrate this approach using an image classification
dataset that contains images with synthetically generated spurious regions and
show that the trained model was overdependent on spurious regions. Moreover, we
remove the learned spurious correlations with an explanation-based learning
approach.
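The perturb-and-monitor idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the square occlusion patch, the fill value, and the names `occlude`, `certainty_drop_map`, and `toy_predict_proba` are all assumptions introduced here, and the toy classifier is a hypothetical stand-in that deliberately depends on a planted corner region.

```python
import numpy as np


def occlude(image, top, left, size, fill=0.0):
    """Return a copy of `image` with a size x size square set to `fill`."""
    out = image.copy()
    out[top:top + size, left:left + size] = fill
    return out


def certainty_drop_map(predict_proba, image, label, patch=4, fill=0.0):
    """Occlude one patch at a time and record how much the predicted
    probability of `label` drops relative to the unperturbed image."""
    base = predict_proba(image)[label]
    h, w = image.shape[:2]
    drops = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            perturbed = occlude(image, i * patch, j * patch, patch, fill)
            drops[i, j] = base - predict_proba(perturbed)[label]
    return drops


def toy_predict_proba(image):
    """Hypothetical stand-in for a trained classifier (assumption):
    it looks only at the top-left 4x4 corner, i.e. a planted spurious cue."""
    score = image[:4, :4].mean()
    p1 = 1.0 / (1.0 + np.exp(-10.0 * (score - 0.5)))
    return np.array([1.0 - p1, p1])


# A 16x16 grayscale image with a bright square planted in the corner.
image = np.full((16, 16), 0.2)
image[:4, :4] = 1.0

drops = certainty_drop_map(toy_predict_proba, image, label=1)
row, col = np.unravel_index(np.argmax(drops), drops.shape)
print("largest certainty drop at patch", (int(row), int(col)))
```

A large drop concentrated on a region that should be irrelevant to the class suggests the model may be leaning on a spurious cue there. With a real model, `predict_proba` would wrap the model's forward pass and softmax.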
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
(2024-06-19T08:07:14Z)
- Common-Sense Bias Discovery and Mitigation for Classification Tasks [16.8259488742528]
We propose a framework to extract feature clusters in a dataset based on image descriptions.
The analyzed features and correlations are human-interpretable, so we name the method Common-Sense Bias Discovery (CSBD).
Experiments show that our method discovers novel biases on multiple classification tasks for two benchmark image datasets.
(2024-01-24T03:56:07Z)
- Specify Robust Causal Representation from Mixed Observations [35.387451486213344]
Learning representations purely from observations concerns the problem of learning a low-dimensional, compact representation which is beneficial to prediction models.
We develop a method that learns such representations from observational data by regularizing the learning procedure with mutual information measures.
We theoretically and empirically show that the models trained with the learned causal representations are more robust under adversarial attacks and distribution shifts.
(2023-10-21T02:18:35Z)
- Diffusion Models Beat GANs on Image Classification [37.70821298392606]
Diffusion models have risen to prominence as a state-of-the-art method for image generation, denoising, inpainting, super-resolution, manipulation, etc.
We present our findings that these embeddings are useful beyond the noise prediction task, as they contain discriminative information and can also be leveraged for classification.
We find that with careful feature selection and pooling, diffusion models outperform comparable generative-discriminative methods for classification tasks.
(2023-07-17T17:59:40Z)
- Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study on spurious correlations for open-domain response generation models based on a corpus CGDIALOG curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models.
(2023-03-02T06:33:48Z)
- A Relational Model for One-Shot Classification [80.77724423309184]
We show that a deep learning model with built-in inductive bias can bring benefits to sample-efficient learning, without relying on extensive data augmentation.
The proposed one-shot classification model performs relational matching of a pair of inputs in the form of local and pairwise attention.
(2021-11-08T07:53:12Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
(2020-10-25T18:51:15Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
(2020-09-21T19:20:09Z)
- Towards Visually Explaining Similarity Models [29.704524987493766]
We present a method to generate gradient-based visual attention for image similarity predictors.
By relying solely on the learned feature embedding, we show that our approach can be applied to any kind of CNN-based similarity architecture.
We show that our resulting attention maps serve more than just interpretability; they can be infused into the model learning process itself with new trainable constraints.
(2020-08-13T17:47:41Z)
- Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
(2020-07-30T05:48:48Z)