Minority Class Oriented Active Learning for Imbalanced Datasets
- URL: http://arxiv.org/abs/2202.00390v1
- Date: Tue, 1 Feb 2022 13:13:41 GMT
- Title: Minority Class Oriented Active Learning for Imbalanced Datasets
- Authors: Umang Aggarwal, Adrian Popescu, and Céline Hudelot
- Abstract summary: We introduce a new active learning method which is designed for imbalanced datasets.
It favors samples likely to be in minority classes so as to reduce the imbalance of the labeled subset.
We also compare two training schemes for active learning.
- Score: 6.009262446889319
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active learning aims to optimize the dataset annotation process when
resources are constrained. Most existing methods are designed for balanced
datasets. Their practical applicability is limited by the fact that a majority
of real-life datasets are actually imbalanced. Here, we introduce a new active
learning method which is designed for imbalanced datasets. It favors samples
likely to be in minority classes so as to reduce the imbalance of the labeled
subset and create a better representation for these classes. We also compare
two training schemes for active learning: (1) the scheme commonly deployed in
deep active learning, which fine-tunes the model at each iteration, and (2) a
scheme inspired by transfer learning, which exploits generic pre-trained models
and trains a shallow classifier at each iteration. Evaluation is run on three
imbalanced datasets. Results show that the proposed active learning method
outperforms competitive baselines. Equally interesting, they also indicate that
the transfer-learning training scheme outperforms model fine-tuning if features
are transferable from the generic dataset to the unlabeled one. This last
result is surprising and should encourage the community to explore the design
of deep active learning methods.
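To make the two ideas above concrete, the sketch below shows one hypothetical iteration of the transfer-learning training scheme: a generic pre-trained backbone stays frozen, a shallow classifier is retrained on the labeled features, and acquisition favors unlabeled samples whose predicted classes are under-represented in the labeled subset. The rarity-weighted score, the function names, and the use of scikit-learn are assumptions made for illustration, not the authors' exact method.

# Minimal sketch, assuming precomputed features from a frozen pre-trained backbone.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_batch(clf, unlabeled_feats, labeled_y, n_classes, budget):
    """Pick `budget` unlabeled samples, biased toward minority classes."""
    probs = clf.predict_proba(unlabeled_feats)                 # (N, |seen classes|)
    counts = np.bincount(labeled_y, minlength=n_classes).astype(float)
    rarity = 1.0 / (counts + 1.0)                              # larger for rarer classes
    scores = probs @ rarity[clf.classes_]                      # mass placed on rare classes
    return np.argsort(-scores)[:budget]                        # indices to send for labeling

def al_iteration(labeled_feats, labeled_y, unlabeled_feats, n_classes, budget):
    # Transfer-learning scheme: only a shallow classifier is retrained each iteration,
    # the pre-trained feature extractor is never updated.
    clf = LogisticRegression(max_iter=1000).fit(labeled_feats, labeled_y)
    return clf, select_batch(clf, unlabeled_feats, labeled_y, n_classes, budget)

In practice the selected indices would be annotated, appended to the labeled pool, and the loop repeated until the budget is exhausted; under the fine-tuning scheme, the shallow-classifier step would instead be replaced by retraining the full network at each iteration.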
Related papers
- Simplifying Neural Network Training Under Class Imbalance [77.39968702907817]
Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models.
The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures.
We demonstrate that simply tuning existing components of standard deep learning pipelines, such as the batch size, data augmentation, and label smoothing, can achieve state-of-the-art performance without any such specialized class imbalance methods.
arXiv Detail & Related papers (2023-12-05T05:52:44Z) - One-bit Supervision for Image Classification: Problem, Solution, and
Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
On multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z) - Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
This has made sampled data difficult to transfer to new models and has led to issues with sampling bias.
We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z) - Iterative Loop Learning Combining Self-Training and Active Learning for
Domain Adaptive Semantic Segmentation [1.827510863075184]
Self-training and active learning have been proposed to alleviate this problem.
This paper proposes an iterative loop learning method combining Self-Training and Active Learning.
arXiv Detail & Related papers (2023-01-31T01:31:43Z) - Imbalanced Classification via Explicit Gradient Learning From Augmented
Data [0.0]
We propose a novel deep meta-learning technique to augment a given imbalanced dataset with new minority instances.
The advantage of the proposed method is demonstrated on synthetic and real-world datasets with various imbalance ratios.
arXiv Detail & Related papers (2022-02-21T22:16:50Z) - CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - Class-Balanced Active Learning for Image Classification [29.5211685759702]
We propose a general optimization framework that explicitly takes class-balancing into account.
Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods.
arXiv Detail & Related papers (2021-10-09T11:30:26Z) - Class Balancing GAN with a Classifier in the Loop [58.29090045399214]
We introduce a novel theoretically motivated Class Balancing regularizer for training GANs.
Our regularizer makes use of the knowledge from a pre-trained classifier to ensure balanced learning of all the classes in the dataset.
We demonstrate the utility of our regularizer in learning representations for long-tailed distributions, achieving better performance than existing approaches on multiple datasets.
arXiv Detail & Related papers (2021-06-17T11:41:30Z) - Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in reality is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning to automatically balance the representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z) - Sequential Targeting: an incremental learning approach for data
imbalance in text classification [7.455546102930911]
Methods to handle imbalanced datasets are crucial for alleviating distributional skews.
We propose a novel training method, Sequential Targeting (ST), which is independent of the effectiveness of the representation method.
We demonstrate the effectiveness of our method through experiments on simulated benchmark datasets (IMDB) and data collected from NAVER.
arXiv Detail & Related papers (2020-11-20T04:54:00Z)