Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers
- URL: http://arxiv.org/abs/2211.15231v1
- Date: Mon, 28 Nov 2022 11:27:50 GMT
- Title: Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers
- Authors: Wanqian Yang, Polina Kirichenko, Micah Goldblum, Andrew Gordon Wilson
- Abstract summary: We show that generative models alone are not sufficient to prevent shortcut learning.
In particular, we propose Chroma-VAE, a two-pronged approach where a VAE is initially trained to isolate the shortcut in a small latent subspace.
In addition to demonstrating the efficacy of Chroma-VAE on benchmark and real-world shortcut learning tasks, our work highlights the potential for manipulating the latent space of generative classifiers to isolate or interpret specific correlations.
- Score: 44.97660597940641
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are susceptible to shortcut learning, using simple
features to achieve low training loss without discovering essential semantic
structure. Contrary to prior belief, we show that generative models alone are
not sufficient to prevent shortcut learning, despite an incentive to recover a
more comprehensive representation of the data than discriminative approaches.
However, we observe that shortcuts are preferentially encoded with minimal
information, a fact that generative models can exploit to mitigate shortcut
learning. In particular, we propose Chroma-VAE, a two-pronged approach where a
VAE classifier is initially trained to isolate the shortcut in a small latent
subspace, allowing a secondary classifier to be trained on the complementary,
shortcut-free latent subspace. In addition to demonstrating the efficacy of
Chroma-VAE on benchmark and real-world shortcut learning tasks, our work
highlights the potential for manipulating the latent space of generative
classifiers to isolate or interpret specific correlations.
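As a rough illustration of the two-stage recipe described in the abstract, the sketch below splits a VAE's latent code into a small subspace z_A that feeds the first-stage classification head (so the cheaply encoded shortcut tends to be captured there) and a complementary subspace z_B on which a secondary classifier is then trained. This is a minimal sketch assuming a standard Gaussian VAE on flattened inputs; all module names, dimensions, and loss weights are illustrative assumptions, not taken from the paper.
```python
# Minimal sketch of the two-stage idea from the abstract, assuming a standard
# Gaussian VAE on flattened images. Names, dimensions, and loss weights are
# illustrative; the paper's exact architecture and objective may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChromaVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, z_shortcut=2, n_classes=10):
        super().__init__()
        self.z_shortcut = z_shortcut  # small subspace meant to absorb the shortcut
        self.encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * z_dim))   # mu and log-variance
        self.decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                     nn.Linear(256, x_dim))
        # Stage 1 head sees only the small subspace z_A, encouraging the
        # low-information shortcut to be encoded there.
        self.head_A = nn.Linear(z_shortcut, n_classes)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterisation
        z_A, z_B = z[:, :self.z_shortcut], z[:, self.z_shortcut:]
        return self.decoder(z), mu, logvar, z_A, z_B

def stage1_loss(model, x, y, beta=1.0):
    """Stage 1: VAE classifier; labels are predicted from z_A only."""
    x_hat, mu, logvar, z_A, _ = model(x)
    recon = F.mse_loss(x_hat, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl + F.cross_entropy(model.head_A(z_A), y)

def stage2_loss(model, head_B, x, y):
    """Stage 2: freeze the VAE; train a secondary classifier on the
    complementary, (hopefully) shortcut-free subspace z_B."""
    with torch.no_grad():
        _, _, _, _, z_B = model(x)
    return F.cross_entropy(head_B(z_B), y)

# Usage sketch: model = ChromaVAE(); head_B = nn.Linear(32 - 2, 10)
# Stage 1 optimises stage1_loss over (x, y); stage 2 optimises only head_B.
```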
Related papers
- Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models [20.70050968223901]
This study addresses the overlooked impact of subtler, more complex shortcuts that compromise model reliability, going beyond oversimplified shortcuts.
We introduce a comprehensive benchmark that categorizes shortcuts into occurrence, style, and concept.
Our research systematically investigates models' resilience and susceptibilities to sophisticated shortcuts.
arXiv Detail & Related papers (2024-09-26T01:17:42Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple Logits Retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning [82.29761875805369]
One of the ultimate goals of representation learning is to achieve compactness within a class and good separability between classes.
We propose a novel perspective: using pre-defined class anchors as feature centroids to unidirectionally guide feature learning.
The proposed Semantic Anchor Regularization (SAR) can be used in a plug-and-play manner in the existing models.
arXiv Detail & Related papers (2023-12-19T05:52:38Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Improving Deep Representation Learning via Auxiliary Learnable Target Coding [69.79343510578877]
This paper introduces a novel learnable target coding as an auxiliary regularization of deep representation learning.
Specifically, a margin-based triplet loss and a correlation consistency loss on the proposed target codes are designed to encourage more discriminative representations.
arXiv Detail & Related papers (2023-05-30T01:38:54Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- Subspace Regularizers for Few-Shot Class Incremental Learning [26.372024890126408]
We present a new family of subspace regularization schemes that encourage weight vectors for new classes to lie close to the subspace spanned by the weights of existing classes.
Our results show that simple geometric regularization of class representations offers an effective tool for continual learning.
arXiv Detail & Related papers (2021-10-13T22:19:53Z)
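A minimal sketch of the subspace-regularization idea from the entry above: penalize the distance between a new class's weight vector and its orthogonal projection onto the span of the existing classes' weights. The function name, shapes, and weighting are illustrative assumptions, not the paper's exact scheme.
```python
# Sketch of a subspace regularizer: keep a new class's weight vector close to the
# subspace spanned by the old classes' weights. Shapes and names are illustrative.
import torch

def subspace_regularizer(w_new: torch.Tensor, W_old: torch.Tensor) -> torch.Tensor:
    """w_new: (d,) weight vector of a new class; W_old: (k, d) old class weights."""
    # Orthonormal basis of the subspace spanned by the old class weights.
    Q, _ = torch.linalg.qr(W_old.t())      # (d, k), columns orthonormal
    w_proj = Q @ (Q.t() @ w_new)           # projection of w_new onto that subspace
    return (w_new - w_proj).pow(2).sum()   # squared distance to the subspace

# Usage: add lambda * subspace_regularizer(w_new, W_old) to the few-shot training loss.
```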