Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach
- URL: http://arxiv.org/abs/2403.00250v1
- Date: Fri, 1 Mar 2024 03:27:08 GMT
- Title: Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach
- Authors: Han Lu, Siyu Sun, Yichen Xie, Liqing Zhang, Xiaokang Yang, Junchi Yan
- Abstract summary: We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
- Score: 102.0769560460338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the long-tailed recognition field, the Decoupled Training paradigm has
demonstrated remarkable capabilities among various methods. This paradigm
decouples the training process into separate representation learning and
classifier re-training. Previous works have attempted to improve both stages
simultaneously, making it difficult to isolate the effect of classifier
re-training. Furthermore, recent empirical studies have demonstrated that
simple regularization can yield strong feature representations, emphasizing the
need to reassess existing classifier re-training methods. In this study, we
revisit classifier re-training methods based on a unified feature
representation and re-evaluate their performances. We propose a new metric
called Logits Magnitude as a superior measure of model performance, replacing
the commonly used Weight Norm. However, since it is hard to directly optimize
the new metric during training, we introduce a suitable approximate invariant
called Regularized Standard Deviation. Based on the two newly proposed metrics,
we prove that reducing the absolute value of Logits Magnitude when it is nearly
balanced can effectively decrease errors and disturbances during training,
leading to better model performance. Motivated by these findings, we develop a
simple logits retargeting approach (LORT) without the requirement of prior
knowledge of the number of samples per class. LORT divides the original one-hot
label into small true label probabilities and large negative label
probabilities distributed across each class. Our method achieves
state-of-the-art performance on various imbalanced datasets, including
CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
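The label retargeting described in the abstract can be read as an aggressive form of soft labeling: the one-hot target is replaced by a small probability on the true class, with the remaining mass spread across the negative classes. The sketch below illustrates that reading in PyTorch, assuming a uniform split of the negative mass and a hypothetical true-class probability `p_true`; it is an illustrative interpretation of the abstract, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def retarget_labels(targets: torch.Tensor, num_classes: int, p_true: float = 0.1) -> torch.Tensor:
    """Replace one-hot labels with soft targets: probability `p_true` on the
    true class, the remaining mass spread uniformly over the other classes.
    (Illustrative sketch; `p_true` is a hypothetical hyperparameter.)"""
    p_neg = (1.0 - p_true) / (num_classes - 1)      # mass given to each negative class
    soft = torch.full((targets.size(0), num_classes), p_neg, device=targets.device)
    soft.scatter_(1, targets.unsqueeze(1), p_true)  # place p_true on the true class
    return soft

def retargeted_cross_entropy(logits: torch.Tensor, targets: torch.Tensor, p_true: float = 0.1) -> torch.Tensor:
    """Cross-entropy of the logits against the retargeted soft labels."""
    soft = retarget_labels(targets, logits.size(1), p_true)
    return -(soft * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# Example: 100-class problem (e.g. CIFAR100-LT), batch of 4.
logits = torch.randn(4, 100)
labels = torch.tensor([3, 17, 99, 0])
loss = retargeted_cross_entropy(logits, labels, p_true=0.1)
```

Note that this construction uses no per-class sample counts, consistent with the abstract's claim that LORT needs no prior knowledge of the class distribution.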
Related papers
- An Efficient Replay for Class-Incremental Learning with Pre-trained Models [0.0]
In class-incremental learning, the steady state among the weights guided by each class center is disrupted, which is significantly correlated with forgetting.
We propose a new method to overcome forgetting.
arXiv Detail & Related papers (2024-08-15T11:26:28Z)
- Bias Mitigating Few-Shot Class-Incremental Learning [17.185744533050116]
Few-shot class-incremental learning aims at recognizing novel classes continually with limited novel class samples.
Recent methods somewhat alleviate the accuracy imbalance between base and incremental classes by fine-tuning the feature extractor in the incremental sessions.
We propose a novel method to mitigate the model bias of FSCIL during both training and inference.
arXiv Detail & Related papers (2024-02-01T10:37:41Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- You Only Need End-to-End Training for Long-Tailed Recognition [8.789819609485225]
Cross-entropy loss tends to produce highly correlated features on imbalanced data.
We propose two novel modules, Block-based Relatively Balanced Batch Sampler (B3RS) and Batch Embedded Training (BET).
Experimental results on the long-tailed classification benchmarks, CIFAR-LT and ImageNet-LT, demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-12-11T11:44:09Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis [21.8311401851523]
We study the few-shot learning problem, where a model learns to recognize new objects with extremely few labeled data per category.
We propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance.
arXiv Detail & Related papers (2021-09-07T02:19:01Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)