The Equalization Losses: Gradient-Driven Training for Long-tailed Object
Recognition
- URL: http://arxiv.org/abs/2210.05566v1
- Date: Tue, 11 Oct 2022 16:00:36 GMT
- Title: The Equalization Losses: Gradient-Driven Training for Long-tailed Object
Recognition
- Authors: Jingru Tan, Bo Li, Xin Lu, Yongqiang Yao, Fengwei Yu, Tong He, Wanli
Ouyang
- Abstract summary: We propose a gradient-driven training mechanism to tackle the long-tail problem.
We introduce a new family of gradient-driven loss functions, namely equalization losses.
Our method consistently outperforms the baseline models.
- Score: 84.51875325962061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-tailed distributions are widespread in real-world applications.
Because tail categories account for an extremely small fraction of instances,
they often show inferior accuracy. In this paper, we find that this performance
bottleneck is mainly caused by imbalanced gradients, which can be divided into
two parts: (1) the positive part, derived from samples of the same category,
and (2) the negative part, contributed by other categories. Through
comprehensive experiments, we also observe that the ratio of accumulated
positive gradients to negative gradients is a good indicator of how balanced a
category's training is. Inspired by this, we propose a gradient-driven training
mechanism to tackle the long-tail problem: re-balancing the positive/negative
gradients dynamically according to the current accumulated gradients, with the
unified goal of achieving balanced gradient ratios. Taking advantage of this
simple and flexible gradient mechanism,
mechanism, we introduce a new family of gradient-driven loss functions, namely
equalization losses. We conduct extensive experiments on a wide spectrum of
visual tasks, including two-stage/single-stage long-tailed object detection
(LVIS), long-tailed image classification (ImageNet-LT, Places-LT, iNaturalist),
and long-tailed semantic segmentation (ADE20K). Our method consistently
outperforms the baseline models, demonstrating the effectiveness and
generalization ability of the proposed equalization losses. Code will be
released at https://github.com/ModelTC/United-Perception.
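To make the mechanism concrete, below is a minimal PyTorch sketch of the gradient-driven re-balancing idea described in the abstract: per-category accumulated positive/negative gradients drive per-class weights that push the gradient ratio toward balance. The sigmoid mapping function, the hyper-parameters `gamma` and `mu`, and the sigmoid-loss form are illustrative assumptions, not the paper's exact formulation.

```python
import torch


class GradientDrivenLoss:
    """Minimal sketch of a gradient-driven equalization loss.

    Tracks accumulated positive/negative gradients per category and
    re-weights both parts so their ratio moves toward balance. The
    sigmoid mapping and hyper-parameters are illustrative assumptions.
    """

    def __init__(self, num_classes, gamma=12.0, mu=0.8):
        self.pos_grad = torch.zeros(num_classes)  # accumulated positive gradients
        self.neg_grad = torch.zeros(num_classes)  # accumulated negative gradients
        self.gamma, self.mu = gamma, mu

    def __call__(self, logits, targets):
        # logits, targets: (B, C) sigmoid logits and binary labels.
        ratio = self.pos_grad / (self.neg_grad + 1e-10)
        f = torch.sigmoid(self.gamma * (ratio - self.mu))  # ~1 for balanced classes
        neg_w = f                    # suppress negative gradients on tail classes
        pos_w = 1.0 + (1.0 - f)      # amplify positive gradients on tail classes
        weight = targets * pos_w + (1.0 - targets) * neg_w

        bce = torch.nn.functional.binary_cross_entropy_with_logits(
            logits, targets, reduction="none")
        loss = (weight * bce).sum() / logits.size(0)

        # Accumulate this batch's (re-weighted) gradient magnitudes.
        with torch.no_grad():
            grad = (torch.sigmoid(logits) - targets).abs() * weight
            self.pos_grad += (grad * targets).sum(dim=0)
            self.neg_grad += (grad * (1.0 - targets)).sum(dim=0)
        return loss
```

The key design choice is that the weights are computed from gradient statistics gathered during training itself, so no prior knowledge of the class distribution is required.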
Related papers
- Gradient-Aware Logit Adjustment Loss for Long-tailed Classifier [30.931850375858573]
In the real-world setting, data often follows a long-tailed distribution, where head classes contain significantly more training samples than tail classes.
We propose the Gradient-Aware Logit Adjustment (GALA) loss, which adjusts the logits based on accumulated gradients to balance the optimization process.
Our approach achieves top-1 accuracy of 48.5%, 41.4%, and 73.3% on popular long-tailed recognition benchmark datasets.
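The abstract suggests a logit-adjustment scheme driven by gradient statistics rather than class priors. Below is a minimal sketch of one plausible form, assuming a per-class offset proportional to the log of the accumulated positive/negative gradient ratio; the scale `tau` and the update rule are assumptions, not the paper's exact method.

```python
import torch
import torch.nn.functional as F


class GradientAwareLogitAdjustment:
    """Sketch: adjust logits by a per-class offset derived from
    accumulated gradients (hypothetical form, for illustration only)."""

    def __init__(self, num_classes, tau=1.0):
        self.acc_pos = torch.ones(num_classes)  # accumulated positive gradients
        self.acc_neg = torch.ones(num_classes)  # accumulated negative gradients
        self.tau = tau

    def loss(self, logits, targets):
        # Tail classes have a small ratio, so their logits are lowered,
        # which enforces a larger margin during training.
        ratio = self.acc_pos / (self.acc_neg + 1e-10)
        adjusted = logits + self.tau * torch.log(ratio + 1e-10)
        out = F.cross_entropy(adjusted, targets)

        with torch.no_grad():  # update accumulators from softmax gradients
            p = F.softmax(adjusted, dim=1)
            onehot = F.one_hot(targets, p.size(1)).float()
            grad = (p - onehot).abs()
            self.acc_pos += (onehot * grad).sum(0)
            self.acc_neg += ((1 - onehot) * grad).sum(0)
        return out
```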
arXiv Detail & Related papers (2024-03-14T02:21:01Z)
- Decoupled Contrastive Learning for Long-Tailed Recognition [58.255966442426484]
Supervised Contrastive Loss (SCL) is popular in visual representation learning.
In long-tailed recognition, where the number of samples per class is imbalanced, treating the two types of positive samples equally leads to biased optimization of the intra-category distance.
We propose a patch-based self distillation to transfer knowledge from head to tail classes to relieve the under-representation of tail classes.
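Below is a minimal sketch of decoupling the two positive types in supervised contrastive learning: the anchor's own augmented view versus other same-class samples, weighted separately instead of uniformly. The weights `w_aug`/`w_class` and the batch layout are assumptions; the paper's exact decoupling (and its patch-based self-distillation) may differ.

```python
import torch


def decoupled_supcon_loss(feats, labels, temp=0.1, w_aug=1.0, w_class=0.5):
    """Sketch of a decoupled supervised contrastive loss.

    feats: (2N, D) L2-normalized features, where feats[i] and feats[i+N]
           are two augmented views of the same image.
    labels: (2N,) class labels.
    """
    n2 = feats.size(0)
    n = n2 // 2
    sim = feats @ feats.t() / temp
    logits_mask = ~torch.eye(n2, dtype=torch.bool, device=feats.device)

    # Positive type 1: the paired augmented view of each anchor.
    idx = torch.arange(n2, device=feats.device)
    aug_pos = torch.zeros(n2, n2, dtype=torch.bool, device=feats.device)
    aug_pos[idx, (idx + n) % n2] = True
    # Positive type 2: other samples of the same class.
    same_class = (labels[:, None] == labels[None, :]) & logits_mask
    class_pos = same_class & ~aug_pos

    # Weight the two positive types differently instead of uniformly.
    pos_weight = w_aug * aug_pos.float() + w_class * class_pos.float()

    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~logits_mask, float('-inf')), dim=1, keepdim=True)
    loss = -(pos_weight * log_prob).sum(1) / pos_weight.sum(1).clamp(min=1e-10)
    return loss.mean()
```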
arXiv Detail & Related papers (2024-03-10T09:46:28Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple Logits Retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Long-Tailed Learning as Multi-Objective Optimization [29.012779934262973]
We argue that the seesaw dilemma stems from the gradient imbalance between different classes.
We propose a Gradient-Balancing Grouping (GBG) strategy to gather the classes with similar gradient directions.
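A minimal sketch of the grouping idea follows, assuming per-class gradient vectors (e.g., classifier-weight gradients) are gathered greedily by cosine similarity; the threshold and greedy assignment are illustrative, not the paper's algorithm.

```python
import torch


def group_by_gradient_direction(class_grads, sim_thresh=0.5):
    """Sketch: gather classes whose gradients point in similar directions.

    class_grads: (C, D) tensor, one accumulated gradient vector per class.
    Returns a list of groups (lists of class indices).
    """
    g = torch.nn.functional.normalize(class_grads, dim=1)
    sim = g @ g.t()  # pairwise cosine similarity between class gradients
    groups, assigned = [], set()
    for c in range(g.size(0)):
        if c in assigned:
            continue
        members = [c]
        assigned.add(c)
        for d in range(c + 1, g.size(0)):
            if d not in assigned and sim[c, d] >= sim_thresh:
                members.append(d)
                assigned.add(d)
        groups.append(members)
    return groups
```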
arXiv Detail & Related papers (2023-10-31T14:30:31Z)
- Theoretical Characterization of How Neural Network Pruning Affects its Generalization [131.1347309639727]
This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization.
It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero.
More surprisingly, the generalization bound gets better as the pruning fraction gets larger.
arXiv Detail & Related papers (2023-01-01T03:10:45Z)
- Learning Compact Features via In-Training Representation Alignment [19.273120635948363]
In each training iteration, the true gradient of the loss function is estimated using a mini-batch sampled from the training set.
We propose In-Training Representation Alignment (ITRA) that explicitly aligns feature distributions of two different mini-batches with a matching loss.
We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning.
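A minimal sketch of a matching loss that aligns the feature distributions of two mini-batches, here using a Gaussian-kernel maximum mean discrepancy (MMD); the kernel choice and bandwidth are assumptions, as the paper's exact matching loss may differ.

```python
import torch


def mmd_matching_loss(feat_a, feat_b, sigma=1.0):
    """Sketch: Gaussian-kernel MMD between the feature distributions of
    two mini-batches; minimizing it pulls the two distributions together.

    feat_a, feat_b: (N, D) and (M, D) feature tensors.
    """
    def gaussian_kernel(x, y):
        d = torch.cdist(x, y) ** 2          # squared pairwise distances
        return torch.exp(-d / (2 * sigma ** 2))

    k_aa = gaussian_kernel(feat_a, feat_a).mean()
    k_bb = gaussian_kernel(feat_b, feat_b).mean()
    k_ab = gaussian_kernel(feat_a, feat_b).mean()
    return k_aa + k_bb - 2 * k_ab           # biased MMD^2 estimate
```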
arXiv Detail & Related papers (2022-11-23T22:23:22Z)
- A Theoretical Analysis of the Learning Dynamics under Class Imbalance [0.10231119246773925]
We show that the learning curves for minority and majority classes follow sub-optimal trajectories when training with a gradient-based optimizer.
This slowdown is related to the imbalance ratio and can be traced back to a competition between the optimization of different classes.
We find that gradient descent is not guaranteed to decrease the loss for each class, but that this problem can be addressed by performing a per-class normalization of the gradient.
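A minimal sketch of the per-class gradient normalization idea: compute each class's gradient separately, normalize it, and aggregate, so no single class dominates the update direction. The norm type and equal per-class weighting are assumptions.

```python
import torch


def per_class_normalized_gradient(model, loss_fn, inputs, targets, num_classes):
    """Sketch: normalize each class's gradient before aggregation.

    Returns a flat gradient vector over all trainable parameters; the
    caller maps it back onto the parameters for the update step.
    """
    total = None
    for c in range(num_classes):
        mask = targets == c
        if not mask.any():
            continue
        model.zero_grad()
        loss_fn(model(inputs[mask]), targets[mask]).backward()
        flat = torch.cat([p.grad.flatten() for p in model.parameters()
                          if p.grad is not None])
        flat = flat / (flat.norm() + 1e-12)  # per-class normalization
        total = flat if total is None else total + flat
    model.zero_grad()
    return total
```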
arXiv Detail & Related papers (2022-07-01T12:54:38Z)
- Seesaw Loss for Long-Tailed Instance Segmentation [131.86306953253816]
We propose Seesaw Loss to dynamically re-balance gradients of positive and negative samples for each category.
The mitigation factor reduces the penalty on tail categories in proportion to the ratio of cumulative training instances between categories.
The compensation factor increases the penalty of misclassified instances to avoid false positives of tail categories.
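A minimal sketch of the seesaw mechanism as described above: a mitigation factor shrinks the negative penalty on rarer classes by the ratio of cumulative instance counts, while a compensation factor restores the penalty when a negative class is scored above the true class. The hyper-parameters `p` and `q` are illustrative, and detection-specific details are omitted.

```python
import torch
import torch.nn.functional as F


def seesaw_loss(logits, targets, cum_counts, p=0.8, q=2.0):
    """Sketch of Seesaw Loss.

    logits: (B, C); targets: (B,) class indices; cum_counts: (C,) float
    tensor of cumulative training instances per class, kept by the caller.
    """
    b, c = logits.shape
    onehot = F.one_hot(targets, c).float()

    # Mitigation: min(1, (N_j / N_i)^p) with i the true class, j a negative.
    ratio = cum_counts[None, :] / cum_counts[targets][:, None].clamp(min=1)
    mitigation = ratio.clamp(max=1.0) ** p

    # Compensation: max(1, (prob_j / prob_i)^q), treated as a constant.
    probs = F.softmax(logits.detach(), dim=1)
    comp = (probs / probs.gather(1, targets[:, None]).clamp(min=1e-12))
    comp = comp.clamp(min=1.0) ** q

    seesaw = mitigation * comp
    # Re-scale only the negative logits: z_j + log(S_ij) for j != i.
    scaled = logits + torch.log(seesaw.clamp(min=1e-12)) * (1 - onehot)
    return F.cross_entropy(scaled, targets)
```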
arXiv Detail & Related papers (2020-08-23T12:44:45Z)
- Equalization Loss for Long-Tailed Object Recognition [109.91045951333835]
State-of-the-art object detection methods still perform poorly on large vocabulary and long-tailed datasets.
We propose a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories.
Our method achieves AP gains of 4.1% and 4.8% for the rare and common categories on the challenging LVIS benchmark.
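A minimal sketch of the original equalization-loss idea: stochastically ignore the negative-gradient term for rare categories so that frequent categories do not suppress them. The frequency threshold `lam` and gate probability `beta_prob` are illustrative assumptions, and detection-specific handling (e.g., background proposals) is omitted.

```python
import torch
import torch.nn.functional as F


def equalization_loss(logits, targets, class_freq, lam=1.76e-3, beta_prob=0.9):
    """Sketch of the equalization loss in its sigmoid form.

    logits, targets: (B, C) sigmoid logits and binary labels.
    class_freq: (C,) fraction of training instances per class.
    """
    b, c = logits.shape
    is_rare = (class_freq < lam).float()[None, :].expand(b, c)
    # Bernoulli gate: gate=1 (with probability beta_prob) means the
    # negative term of a rare class is dropped for this sample.
    gate = torch.bernoulli(torch.full_like(logits, beta_prob))
    w = 1.0 - gate * is_rare * (1.0 - targets)

    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (w * bce).sum() / b
```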
arXiv Detail & Related papers (2020-03-11T09:14:53Z)
- Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.