How Re-sampling Helps for Long-Tail Learning?
- URL: http://arxiv.org/abs/2310.18236v1
- Date: Fri, 27 Oct 2023 16:20:34 GMT
- Title: How Re-sampling Helps for Long-Tail Learning?
- Authors: Jiang-Xin Shi, Tong Wei, Yuke Xiang, Yu-Feng Li
- Abstract summary: Long-tail learning has received significant attention due to the challenge posed by extremely imbalanced datasets.
Recent studies claim that re-sampling brings negligible performance improvements in modern long-tail learning tasks.
We propose a new context shift augmentation module that generates diverse training images for the tail class.
- Score: 45.187004699024435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-tail learning has received significant attention in recent years due to
the challenge posed by extremely imbalanced datasets. In these datasets,
only a few classes (known as the head classes) have an adequate number of
training samples, while the rest of the classes (known as the tail classes) are
infrequent in the training data. Re-sampling is a classical and widely used
approach for addressing class imbalance issues. Unfortunately, recent studies
claim that re-sampling brings negligible performance improvements in modern
long-tail learning tasks. This paper aims to investigate this phenomenon
systematically. Our research shows that re-sampling can considerably improve
generalization when the training images do not contain semantically irrelevant
contexts. In other scenarios, however, it can cause the model to learn unexpected
spurious correlations between irrelevant contexts and target labels. We design
experiments on two homogeneous datasets, one containing irrelevant context and
the other not, to confirm our findings. To prevent the learning of spurious
correlations, we propose a new context shift augmentation module that generates
diverse training images for the tail class by maintaining a context bank
extracted from the head-class images. Experiments demonstrate that our proposed
module can boost generalization and outperform other approaches, including
class-balanced re-sampling, decoupled classifier re-training, and data
augmentation methods. The source code is available at
https://www.lamda.nju.edu.cn/code_CSA.ashx.
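As an illustration of the classical re-sampling baseline discussed in the abstract, the sketch below builds a class-balanced data loader in PyTorch using WeightedRandomSampler. It is a minimal sketch under assumed names (train_dataset, labels) and is not the authors' context shift augmentation module, whose context-bank augmentation goes beyond sampling.

    # Minimal sketch of classical class-balanced re-sampling in PyTorch.
    # Assumptions: `train_dataset` yields (image, label) pairs and `labels` is the
    # list of integer class labels in dataset order; this illustrates the
    # re-sampling baseline analyzed in the paper, not its CSA module.
    from collections import Counter

    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    def make_class_balanced_loader(train_dataset, labels, batch_size=128):
        """Build a DataLoader that draws every class with roughly equal probability."""
        counts = Counter(labels)  # number of training samples per class
        # Weight each example by the inverse frequency of its class, so tail-class
        # images are drawn more often and head-class images less often.
        weights = torch.tensor([1.0 / counts[y] for y in labels], dtype=torch.double)
        sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
        return DataLoader(train_dataset, batch_size=batch_size, sampler=sampler)

Under such a sampler each class contributes roughly the same number of examples per epoch; according to the abstract, this helps when training images are free of irrelevant context but can otherwise reinforce spurious correlations between context and label.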
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes replaying data from previously learned tasks when learning new ones.
However, storing such data is often impractical due to memory constraints or data privacy concerns.
As a replacement, data-free data replay methods have been proposed that invert samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Adjusting Logit in Gaussian Form for Long-Tailed Visual Recognition [37.62659619941791]
We study the problem of long-tailed visual recognition from the perspective of feature level.
Two novel logit adjustment methods are proposed to improve model performance at a modest computational overhead.
Experiments conducted on benchmark datasets demonstrate the superior performance of the proposed method over the state-of-the-art ones.
arXiv Detail & Related papers (2023-05-18T02:06:06Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard samples mining.
Our method significantly outperforms the state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- Feature Generation for Long-tail Classification [36.186909933006675]
We show how to generate meaningful features by estimating the tail category's distribution.
We also present a qualitative analysis of generated features using t-SNE visualizations and analyze the nearest neighbors used to calibrate the tail class distributions.
arXiv Detail & Related papers (2021-11-10T21:34:29Z)
- Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition [95.93760490301395]
The problem of long-tailed recognition, where the number of examples per class is highly unbalanced, is considered.
It is hypothesized that this is due to the repeated sampling of examples and can be addressed by feature space augmentation.
A new feature augmentation strategy, EMANATE, based on back-tracking of features across epochs during training, is proposed.
A new sampling procedure, Breadcrumb, is then introduced to implement adversarial class-balanced sampling without extra computation.
arXiv Detail & Related papers (2021-05-01T00:21:26Z)
- The Devil is the Classifier: Investigating Long Tail Relation Classification with Decoupling Analysis [36.298869931803836]
Long-tailed relation classification is a challenging problem as the head classes may dominate the training phase.
We propose a robust classifier with attentive relation routing, which assigns soft weights by automatically aggregating the relations.
arXiv Detail & Related papers (2020-09-15T12:47:00Z)
- Minority Class Oversampling for Tabular Data with Deep Generative Models [4.976007156860967]
We study the ability of deep generative models to provide realistic samples that improve performance on imbalanced classification tasks via oversampling.
Our experiments show that the sampling method does not affect quality, but runtime varies widely.
We also observe that the improvements in the performance metrics, while shown to be significant, are often minor in absolute terms.
arXiv Detail & Related papers (2020-05-07T21:35:57Z)
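For contrast with the generative oversampling studied in the last entry above, the following is a minimal sketch of the simplest tabular baseline: duplicating minority-class rows until all classes match the majority count. The function name, array shapes, and NumPy usage are illustrative assumptions, not code from that paper.

    # Naive minority-class oversampling for tabular data (duplication with
    # replacement). This is only a baseline for comparison, not the deep
    # generative oversampling studied in the related paper above.
    import numpy as np

    def oversample_minority(X, y, random_state=0):
        """Duplicate minority-class rows until every class matches the majority count."""
        rng = np.random.default_rng(random_state)
        classes, counts = np.unique(y, return_counts=True)
        target = counts.max()
        X_parts, y_parts = [X], [y]
        for cls, count in zip(classes, counts):
            if count < target:
                idx = np.flatnonzero(y == cls)  # rows belonging to this minority class
                extra = rng.choice(idx, size=target - count, replace=True)
                X_parts.append(X[extra])
                y_parts.append(y[extra])
        return np.concatenate(X_parts), np.concatenate(y_parts)

Such duplication equalizes class frequencies but adds no new information; the related paper studies whether deep generative models can instead provide more realistic synthetic minority samples.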