VaB-AL: Incorporating Class Imbalance and Difficulty with Variational
Bayes for Active Learning
- URL: http://arxiv.org/abs/2003.11249v2
- Date: Thu, 3 Dec 2020 12:18:11 GMT
- Title: VaB-AL: Incorporating Class Imbalance and Difficulty with Variational
Bayes for Active Learning
- Authors: Jongwon Choi, Kwang Moo Yi, Jihoon Kim, Jinho Choo, Byoungjip Kim,
Jin-Yeop Chang, Youngjune Gwon, Hyung Jin Chang
- Abstract summary: We propose a method that can naturally incorporate class imbalance into the Active Learning framework.
We show that our method can be applied to classification tasks on multiple different datasets.
- Score: 38.33920705605981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active Learning for discriminative models has largely been studied with the
focus on individual samples, with less emphasis on how classes are distributed
or which classes are hard to deal with. In this work, we show that this is
harmful. We propose a method based on Bayes' rule that can naturally
incorporate class imbalance into the Active Learning framework. We derive that
three terms should be considered together when estimating the probability of a
classifier making a mistake for a given sample: i) the probability of
mislabelling a class, ii) the likelihood of the data given a predicted class,
and iii) the prior probability on the abundance of a predicted class.
Implementing these terms requires a generative model and the estimation of an
intractable likelihood.
Therefore, we train a Variational Auto Encoder (VAE) for this purpose. To
further tie the VAE with the classifier and facilitate VAE training, we use the
classifier's deep feature representations as input to the VAE. By considering
all three probabilities, and especially the data imbalance, we can
substantially improve the potential of existing methods under a limited data
budget. We show that our method can be applied to classification tasks on
multiple different datasets -- including one that is a real-world dataset with
heavy data imbalance -- significantly outperforming the state of the art.
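
The decomposition above lends itself to a compact scoring rule. The following is a minimal sketch, not the authors' implementation: it assumes the per-class mistake probabilities P(mistake | c), the class priors P(c), and the class-conditional log-likelihoods log p(x | c) (which the paper approximates with a VAE ELBO over the classifier's deep features) have already been estimated elsewhere, and all names are illustrative.

```python
import numpy as np

def vab_al_scores(mislabel_rate, log_lik, prior):
    """Approximate P(mistake | x) for each unlabelled sample.

    Via Bayes' rule,
        P(mistake | x) ~= sum_c P(mistake | c) * p(x | c) * P(c) / p(x),
    with the evidence p(x) = sum_c p(x | c) * P(c).
    """
    joint = np.exp(log_lik) * prior              # p(x | c) * P(c), shape (N, C)
    evidence = joint.sum(axis=1, keepdims=True)  # p(x), shape (N, 1)
    posterior = joint / evidence                 # P(c | x), shape (N, C)
    return posterior @ mislabel_rate             # sum_c P(mistake | c) P(c | x)

# Toy usage: 5 samples, 3 classes with a heavily imbalanced prior (assumed values).
rng = np.random.default_rng(0)
prior = np.array([0.7, 0.2, 0.1])
mislabel_rate = np.array([0.05, 0.20, 0.45])     # rarer classes err more often
log_lik = rng.normal(loc=-2.0, scale=0.5, size=(5, 3))
scores = vab_al_scores(mislabel_rate, log_lik, prior)
print(np.argsort(scores)[::-1][:2])              # query the 2 riskiest samples
```

In the paper, the likelihood term comes from a VAE trained on the classifier's own feature representations, which ties the density model to the representation the classifier actually uses.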
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate, numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z)
- Mutual Exclusive Modulator for Long-Tailed Recognition [12.706961256329572]
Long-tailed recognition is the task of learning high-performance classifiers given extremely imbalanced training samples between categories.
We introduce a mutual exclusive modulator which can estimate the probability of an image belonging to each group.
Our method achieves competitive performance compared to the state-of-the-art benchmarks.
arXiv Detail & Related papers (2023-02-19T07:31:49Z)
- Class-Imbalanced Complementary-Label Learning via Weighted Loss [8.934943507699131]
Complementary-label learning (CLL) is widely used in weakly supervised classification.
It faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples.
We propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification.
arXiv Detail & Related papers (2022-09-28T16:02:42Z)
- Learning to Adapt Classifier for Imbalanced Semi-supervised Learning [38.434729550279116]
Pseudo-labeling has proven to be a promising semi-supervised learning (SSL) paradigm.
Existing pseudo-labeling methods commonly assume that the class distributions of training data are balanced.
In this work, we investigate pseudo-labeling under imbalanced semi-supervised setups.
arXiv Detail & Related papers (2022-07-28T02:15:47Z)
- Learning from Multiple Unlabeled Datasets with Partial Risk Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels.
We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets.
We then find that the classifier obtained as such tends to cause overfitting as its empirical risks go negative during training.
Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art (a generic nearest-prototype sketch follows this list).
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification [11.125446871030734]
Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes.
We propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances.
arXiv Detail & Related papers (2020-10-12T19:47:09Z)
- Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective [98.70226503904402]
Object frequency in the real world often follows a power law, leading to a mismatch between the long-tailed class distributions seen during training and the expectation that models perform well on all classes.
We propose to augment the classic class-balanced learning by explicitly estimating the differences between the class-conditioned distributions with a meta-learning approach.
arXiv Detail & Related papers (2020-03-24T11:28:42Z)
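
To make the nearest-prototype idea from the Prototypical entry above concrete, here is a generic sketch under stated assumptions; it is not that paper's implementation, and all names are illustrative. A class prototype is simply the mean embedding of the class, so no parameters beyond the embedding network are fitted, and each class contributes exactly one prototype regardless of how many training samples it has.

```python
import numpy as np

def class_prototypes(embeddings, labels, num_classes):
    """Mean embedding per class; returns shape (num_classes, dim)."""
    return np.stack([embeddings[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def predict_by_prototype(embeddings, prototypes):
    """Assign each sample to its nearest prototype (Euclidean distance)."""
    dists = np.linalg.norm(embeddings[:, None, :] - prototypes[None, :, :],
                           axis=-1)             # shape (N, num_classes)
    return dists.argmin(axis=1)

# Toy usage: 2-D embeddings with imbalanced classes (40 vs. 4 samples).
rng = np.random.default_rng(1)
emb = np.vstack([rng.normal(0.0, 0.3, size=(40, 2)),
                 rng.normal(2.0, 0.3, size=(4, 2))])
lab = np.array([0] * 40 + [1] * 4)
protos = class_prototypes(emb, lab, num_classes=2)
print(predict_by_prototype(emb, protos))
```

Because distances are measured against exactly one prototype per class, the resulting predictions stay comparable across classes even when the training set is heavily imbalanced.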