Meta Feature Modulator for Long-tailed Recognition
- URL: http://arxiv.org/abs/2008.03428v1
- Date: Sat, 8 Aug 2020 03:19:03 GMT
- Title: Meta Feature Modulator for Long-tailed Recognition
- Authors: Renzhen Wang, Kaiqin Hu, Yanwen Zhu, Jun Shu, Qian Zhao, Deyu Meng
- Abstract summary: We propose a meta-learning framework to model the difference between the long-tailed training data and the balanced meta data from the perspective of representation learning.
We further design a modulator network to guide the generation of the modulation parameters, and such a meta-learner can be readily adapted to train the classification network on other long-tailed datasets.
- Score: 37.90990378643794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks often degrade significantly when training data suffer
from class imbalance problems. Existing approaches, e.g., re-sampling and
re-weighting, commonly address this issue by rearranging the label distribution
of the training data so that the networks fit the implicit balanced label
distribution well. However, most of them hinder the representational ability of
the learned features because they make insufficient use of intra-/inter-sample
information in the training data. To address this issue, we propose meta
feature modulator (MFM),
a meta-learning framework to model the difference between the long-tailed
training data and the balanced meta data from the perspective of representation
learning. Concretely, we employ learnable hyper-parameters (dubbed modulation
parameters) to adaptively scale and shift the intermediate features of
classification networks, and the modulation parameters are optimized together
with the classification network parameters guided by a small amount of balanced
meta data. We further design a modulator network to guide the generation of the
modulation parameters, and such a meta-learner can be readily adapted to train
the classification network on other long-tailed datasets. Extensive experiments
on benchmark vision datasets substantiate the superiority of our approach over
other state-of-the-art methods on long-tailed recognition tasks.
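To make the mechanism concrete, here is a minimal PyTorch-style sketch of feature modulation with meta-learned modulation parameters. It is an illustration under stated assumptions, not the authors' implementation: names such as FeatureModulator, ModulatedNet, and meta_step are placeholders, the toy network is arbitrary, and torch.func.functional_call assumes PyTorch 2.x. Channel-wise scale-and-shift parameters modulate an intermediate feature map, the classifier is virtually updated on a long-tailed batch, and the modulation parameters are then adjusted by the gradient of a loss computed on a small balanced meta batch.
```python
# Illustrative sketch only: FiLM-like scale-and-shift of intermediate features,
# with the modulation parameters meta-learned on balanced meta data.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureModulator(nn.Module):
    """Channel-wise modulation: gamma * feat + beta (hypothetical helper)."""

    def __init__(self, num_channels: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))   # scale
        self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1))   # shift

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.gamma * feat + self.beta


class ModulatedNet(nn.Module):
    """Toy classifier with a modulator inserted after an intermediate block."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.modulator = FeatureModulator(16)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(16, num_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.modulator(self.block(x)))


def meta_step(net: ModulatedNet, train_batch, meta_batch,
              inner_lr: float = 0.1, meta_lr: float = 1e-3):
    """One bi-level update: virtually update the classifier on long-tailed
    data, then update the modulation parameters on a balanced meta batch."""
    x_train, y_train = train_batch
    x_meta, y_meta = meta_batch

    # Split parameters: classifier weights vs. modulation parameters.
    clf = [(n, p) for n, p in net.named_parameters()
           if not n.startswith("modulator")]

    # Inner step: classification loss on the long-tailed training batch.
    train_loss = F.cross_entropy(net(x_train), y_train)
    grads = torch.autograd.grad(train_loss, [p for _, p in clf],
                                create_graph=True)
    virtual = {n: p - inner_lr * g for (n, p), g in zip(clf, grads)}

    # Outer step: evaluate the virtual classifier on balanced meta data and
    # take the meta gradient w.r.t. the modulation parameters (gamma, beta);
    # parameters not in `virtual` fall back to the module's own.
    meta_logits = torch.func.functional_call(net, virtual, (x_meta,))
    meta_loss = F.cross_entropy(meta_logits, y_meta)
    g_gamma, g_beta = torch.autograd.grad(
        meta_loss, [net.modulator.gamma, net.modulator.beta])
    with torch.no_grad():
        net.modulator.gamma -= meta_lr * g_gamma
        net.modulator.beta -= meta_lr * g_beta
    return train_loss.item(), meta_loss.item()
```
A hypothetical call would be meta_step(ModulatedNet(), (x_lt, y_lt), (x_bal, y_bal)) with image batches of shape (B, 3, H, W). As the abstract notes, the classification network itself is also optimized on the training data together with the modulation parameters, and the full method generates the modulation parameters with a modulator network rather than learning them directly, which is what allows the meta-learner to be reused on other long-tailed datasets; the sketch omits both for brevity.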
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Unsupervised Representation Learning to Aid Semi-Supervised Meta Learning [16.534014215010757]
We propose a one-shot unsupervised meta-learning method to learn the latent representation of training samples.
A temperature-scaled cross-entropy loss is used in the inner loop of meta-learning to prevent overfitting.
The proposed method is model agnostic and can aid any meta-learning model to improve accuracy.
arXiv Detail & Related papers (2023-10-19T18:25:22Z)
- Retrieval-Augmented Meta Learning for Low-Resource Text Classification [22.653220906899612]
We propose a meta-learning based method called Retrieval-Augmented Meta Learning (RAML).
It not only uses parameterization for inference but also retrieves non-parametric knowledge from an external corpus to make inferences.
RAML significantly outperforms current SOTA low-resource text classification models.
arXiv Detail & Related papers (2023-09-10T10:05:03Z)
- Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
arXiv Detail & Related papers (2022-07-13T09:43:17Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- An Optimization-Based Meta-Learning Model for MRI Reconstruction with Diverse Dataset [4.9259403018534496]
We develop a generalizable MRI reconstruction model in the meta-learning framework.
The proposed network learns the regularization function within a learner-adaptive model.
After meta-training, the model adapts quickly to unseen tasks, saving roughly half of the training time.
arXiv Detail & Related papers (2021-10-02T03:21:52Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set size and the model size significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.