Balanced Meta-Softmax for Long-Tailed Visual Recognition
- URL: http://arxiv.org/abs/2007.10740v3
- Date: Sun, 22 Nov 2020 05:27:41 GMT
- Title: Balanced Meta-Softmax for Long-Tailed Visual Recognition
- Authors: Jiawei Ren, Cunjun Yu, Shunan Sheng, Xiao Ma, Haiyu Zhao, Shuai Yi,
Hongsheng Li
- Abstract summary: We show that the Softmax function, though used in most classification tasks, gives a biased gradient estimation under the long-tailed setup.
This paper presents Balanced Softmax, an elegant unbiased extension of Softmax, to accommodate the label distribution shift between training and testing.
In our experiments, we demonstrate that Balanced Meta-Softmax outperforms state-of-the-art long-tailed classification solutions on both visual recognition and instance segmentation tasks.
- Score: 46.215759445665434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep classifiers have achieved great success in visual recognition. However,
real-world data is long-tailed by nature, leading to the mismatch between
training and testing distributions. In this paper, we show that the Softmax
function, though used in most classification tasks, gives a biased gradient
estimation under the long-tailed setup. This paper presents Balanced Softmax,
an elegant unbiased extension of Softmax, to accommodate the label distribution
shift between training and testing. Theoretically, we derive the generalization
bound for multiclass Softmax regression and show our loss minimizes the bound.
In addition, we introduce Balanced Meta-Softmax, applying a complementary Meta
Sampler to estimate the optimal class sample rate and further improve
long-tailed learning. In our experiments, we demonstrate that Balanced
Meta-Softmax outperforms state-of-the-art long-tailed classification solutions
on both visual recognition and instance segmentation tasks.
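The abstract does not spell out the exact form of Balanced Softmax, so the sketch below assumes the commonly cited formulation in which each logit is shifted by the log of its class's training-sample count before standard cross-entropy; the function name and the toy class counts are illustrative, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, labels, class_counts):
    """Cross-entropy with a Balanced-Softmax-style adjustment (assumed form).

    Each logit z_j is shifted by log(n_j), the log of class j's training
    sample count, before the usual softmax cross-entropy. This compensates
    for the label-distribution shift between a long-tailed training set and
    a balanced test set.
    """
    # class_counts: 1-D tensor of per-class training sample counts (all > 0)
    log_prior = torch.log(class_counts.float())        # shape: [num_classes]
    adjusted_logits = logits + log_prior.unsqueeze(0)   # broadcast over the batch
    return F.cross_entropy(adjusted_logits, labels)

# Hypothetical usage on a toy long-tailed problem
if __name__ == "__main__":
    torch.manual_seed(0)
    counts = torch.tensor([1000, 100, 10])   # head, medium, tail class sizes
    logits = torch.randn(8, 3)               # batch of 8 samples, 3 classes
    labels = torch.randint(0, 3, (8,))
    print(balanced_softmax_loss(logits, labels, counts).item())
```

Under this formulation, head classes receive a larger additive offset, so the model must produce comparatively larger raw logits for tail classes to achieve the same loss, which counteracts the long-tailed training prior at balanced test time.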
Related papers
- Revisiting Logistic-softmax Likelihood in Bayesian Meta-Learning for Few-Shot Classification [4.813254903898101]
The logistic-softmax is often employed as an alternative to the softmax likelihood in multi-class Gaussian process classification.
We revisit and redesign the logistic-softmax likelihood, which enables control of the a priori confidence level through a temperature parameter.
Our approach yields well-calibrated uncertainty estimates and achieves comparable or superior results on standard benchmark datasets.
arXiv Detail & Related papers (2023-10-16T13:20:13Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - A two-head loss function for deep Average-K classification [8.189630642296416]
We propose a new loss function based on a multi-label classification in addition to the classical softmax.
We show that this approach allows the model to better capture ambiguities between classes and, as a result, to return more consistent sets of possible classes.
arXiv Detail & Related papers (2023-03-31T15:04:53Z) - Spectral Aware Softmax for Visible-Infrared Person Re-Identification [123.69049942659285]
Visible-infrared person re-identification (VI-ReID) aims to match specific pedestrian images from different modalities.
Existing methods still follow the softmax loss training paradigm, which is widely used in single-modality classification tasks.
We propose the spectral-aware softmax (SA-Softmax) loss, which can fully explore the embedding space with the modality information.
arXiv Detail & Related papers (2023-02-03T02:57:18Z) - To Softmax, or not to Softmax: that is the question when applying Active
Learning for Transformer Models [24.43410365335306]
A well-known technique for reducing the amount of human effort in acquiring a labeled dataset is Active Learning (AL).
This paper compares eight alternatives on seven datasets.
Most of the methods are too good at identifying the truly most uncertain samples (outliers), and labeling those exclusively results in worse performance.
arXiv Detail & Related papers (2022-10-06T15:51:39Z) - Real Additive Margin Softmax for Speaker Verification [14.226089039985151]
We show that the AM-Softmax loss does not implement real max-margin training.
We present a Real AM-Softmax loss which involves a true margin function in the softmax training.
arXiv Detail & Related papers (2021-10-18T09:11:14Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To combine the strengths of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - Balanced Activation for Long-tailed Visual Recognition [13.981652331491558]
We introduce Balanced Activation to accommodate the label distribution shift between training and testing in object detection.
We show that Balanced Activation generally provides 3% gain in terms of mAP on LVIS-1.0 and outperforms the current state-of-the-art methods without introducing any extra parameters.
arXiv Detail & Related papers (2020-08-24T11:36:10Z) - Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels.
arXiv Detail & Related papers (2020-06-25T17:13:23Z) - Gradient Estimation with Stochastic Softmax Tricks [84.68686389163153]
We introduce stochastic softmax tricks, which generalize the Gumbel-Softmax trick to combinatorial spaces.
We find that stochastic softmax tricks can be used to train latent variable models that perform better and discover more latent structure; a minimal sketch of the underlying Gumbel-Softmax relaxation follows this list.
arXiv Detail & Related papers (2020-06-15T00:43:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.