Active Class Incremental Learning for Imbalanced Datasets
- URL: http://arxiv.org/abs/2008.10968v1
- Date: Tue, 25 Aug 2020 12:47:09 GMT
- Title: Active Class Incremental Learning for Imbalanced Datasets
- Authors: Eden Belouadah, Adrian Popescu, Umang Aggarwal, Léo Saci
- Abstract summary: Incremental Learning (IL) allows AI systems to adapt to streamed data.
Most existing algorithms make two strong hypotheses which reduce the realism of the incremental scenario.
We introduce sample acquisition functions which tackle imbalance and are compatible with IL constraints.
- Score: 10.680349952226935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incremental Learning (IL) allows AI systems to adapt to streamed data. Most
existing algorithms make two strong hypotheses which reduce the realism of the
incremental scenario: (1) new data are assumed to be readily annotated when
streamed and (2) tests are run with balanced datasets while most real-life
datasets are actually imbalanced. These hypotheses are discarded and the
resulting challenges are tackled with a combination of active and imbalanced
learning. We introduce sample acquisition functions which tackle imbalance and
are compatible with IL constraints. We also consider IL as an imbalanced
learning problem, rather than relying on the established use of knowledge
distillation against catastrophic forgetting. Here, imbalance effects are reduced during
inference through class prediction scaling. Evaluation is done with four visual
datasets and compares existing and proposed sample acquisition functions.
Results indicate that the proposed contributions have a positive effect and
reduce the gap between active and standard IL performance.
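The abstract names class prediction scaling as the inference-time mechanism for reducing imbalance effects, but does not spell out the formula here. The sketch below assumes one common form of prediction scaling, rescaling scores by the empirical class priors of the training stream; the function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def scale_predictions(logits: np.ndarray, class_counts: np.ndarray) -> np.ndarray:
    """Rescale raw class scores to counter training-set imbalance.

    A minimal sketch: subtracting the log class priors from the logits
    is equivalent to dividing the softmax probabilities by the priors,
    so rare classes are boosted at inference time. This is one common
    form of prediction scaling, assumed here for illustration.
    """
    priors = class_counts / class_counts.sum()  # empirical class priors
    return logits - np.log(priors)

# Usage: three classes with heavy imbalance (1000 vs. 50 vs. 10 samples).
counts = np.array([1000.0, 50.0, 10.0])
raw = np.array([2.0, 1.8, 1.7])                 # raw scores favor the head class
print(scale_predictions(raw, counts).argmax())  # a minority class now wins
```

The log-prior shift leaves same-class score rankings untouched and only moves the decision boundary toward minority classes, which is why it can be applied purely at inference.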
Related papers
- FILM: Framework for Imbalanced Learning Machines based on a new unbiased performance measure and a new ensemble-based technique [37.94431794242543]
This research addresses the challenges of handling unbalanced datasets for binary classification tasks.
Standard evaluation metrics are often biased by the disproportionate representation of the minority class.
We propose a novel metric, the Unbiased Integration Coefficients, which exhibits significantly reduced bias.
arXiv Detail & Related papers (2025-03-06T12:15:56Z)
- Exploring Imbalanced Annotations for Effective In-Context Learning [41.618125904839424]
We show that imbalanced class distributions in annotated datasets significantly degrade the performance of in-context learning (ICL).
Our method is motivated by decomposing the distributional differences between annotated and test datasets into two-component weights.
Our approach can prevent selecting too many demonstrations from a single class while preserving the effectiveness of the original selection methods (a selection sketch follows below).
arXiv Detail & Related papers (2025-02-06T12:57:50Z)
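The entry above states only the effect of the method: no single class should dominate the selected demonstrations, while the original (e.g., similarity-based) ranking is preserved. The paper's two-component weighting is not reproduced in this summary, so the sketch below shows just the simpler per-class-cap reading of that idea; all names are illustrative.

```python
from collections import defaultdict

def select_balanced_demos(candidates, scores, labels, k, per_class_cap):
    """Pick k in-context demonstrations, capping each class's share.

    A minimal sketch: candidates are visited in descending score order
    (the 'original' selection criterion) and skipped once their class
    has hit the cap. This is an assumed simplification, not the
    paper's weighting scheme.
    """
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    taken = defaultdict(int)
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if taken[labels[i]] < per_class_cap:
            chosen.append(candidates[i])
            taken[labels[i]] += 1
    return chosen

# Usage: similarity scores favor class "a", but the cap keeps the set mixed.
demos = ["x1", "x2", "x3", "x4"]
print(select_balanced_demos(demos, [0.9, 0.8, 0.7, 0.6], ["a", "a", "b", "a"],
                            k=3, per_class_cap=2))  # -> ['x1', 'x2', 'x3']
```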
- Conformal-in-the-Loop for Learning with Imbalanced Noisy Data [5.69777817429044]
Class imbalance and label noise are pervasive in large-scale datasets.
Much of machine learning research assumes well-labeled, balanced data, which rarely reflects real-world conditions.
We propose Conformal-in-the-Loop (CitL), a novel training framework that addresses both challenges with a conformal prediction-based approach (a sketch of the conformal ingredient follows below).
arXiv Detail & Related papers (2024-11-04T17:09:58Z)
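CitL's actual training loop is not described in the summary above; the sketch below shows only the conformal-prediction building block such a framework rests on: a calibration set yields a score threshold, and examples whose resulting prediction sets are large (the model is unsure, a typical symptom of noise or minority status) can be down-weighted. The weighting rule and all names here are assumptions for illustration.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold from a held-out calibration set.

    Nonconformity score = 1 - probability of the true class; the
    (1 - alpha) quantile of these scores is the cutoff used to build
    prediction sets.
    """
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    return np.quantile(scores, 1.0 - alpha)

def prediction_set_sizes(probs, qhat):
    """Size of each example's conformal prediction set."""
    return (probs >= 1.0 - qhat).sum(axis=1)

# Usage: down-weight ambiguous training examples (a CitL-like idea,
# sketched here; not the paper's exact rule).
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(3), size=100)  # stand-in softmax outputs
cal_labels = rng.integers(0, 3, size=100)
qhat = conformal_threshold(cal_probs, cal_labels)
train_probs = rng.dirichlet(np.ones(3), size=5)
sizes = prediction_set_sizes(train_probs, qhat)
weights = 1.0 / np.maximum(sizes, 1)             # large sets -> low weight
print(sizes, weights)
```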
- Ultra-imbalanced classification guided by statistical information [24.969543903532664]
We take a population-level approach to imbalanced learning by proposing a new formulation called ultra-imbalanced classification (UIC).
Under UIC, loss functions behave differently even if an infinite amount of training samples is available.
A novel learning objective termed Tunable Boosting Loss is developed which is provably resistant against data imbalance under UIC.
arXiv Detail & Related papers (2024-09-06T08:07:09Z)
- Gradient Reweighting: Towards Imbalanced Class-Incremental Learning [8.438092346233054]
Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data.
A major challenge of CIL arises when it is applied to real-world data characterized by non-uniform distributions.
We show that this dual imbalance issue (imbalance both within the incoming data and between stored old-class exemplars and new data) causes skewed gradient updates with biased weights in FC layers, thus inducing over/under-fitting and catastrophic forgetting in CIL (a reweighting sketch follows below).
arXiv Detail & Related papers (2024-02-28T18:08:03Z)
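The summary above points at skewed FC-layer gradients as the failure mode but does not reproduce the paper's reweighting scheme. As a generic stand-in, the sketch below weights the cross-entropy loss by inverse class frequency so that rarely seen (e.g., exemplar-buffered old) classes contribute comparable gradients; this is an assumed, simplified form, not the authors' method.

```python
import torch
import torch.nn as nn

def class_balanced_ce(class_counts):
    """Cross-entropy with inverse-frequency class weights.

    A minimal sketch of gradient reweighting for imbalanced CIL:
    classes with few samples (such as old classes kept only via a
    small exemplar buffer) receive larger weights, so their FC-layer
    gradients are not drowned out by plentiful new classes.
    """
    counts = torch.as_tensor(class_counts, dtype=torch.float32)
    weights = counts.sum() / (len(counts) * counts)  # inverse frequency
    return nn.CrossEntropyLoss(weight=weights)

# Usage: 20 stored exemplars per old class vs. 500 samples per new class.
criterion = class_balanced_ce([20, 20, 500, 500])
logits = torch.randn(8, 4, requires_grad=True)
labels = torch.randint(0, 4, (8,))
loss = criterion(logits, labels)
loss.backward()   # gradients now reflect the class weights
print(loss.item())
```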
- Class-Imbalanced Graph Learning without Class Rebalancing [62.1368829847041]
Class imbalance is prevalent in real-world node classification tasks and poses great challenges for graph learning models.
In this work, we approach the root cause of class-imbalance bias from a topological paradigm.
We devise a lightweight topological augmentation framework, BAT, to mitigate the class-imbalance bias without class rebalancing.
arXiv Detail & Related papers (2023-08-27T19:01:29Z)
- How To Overcome Confirmation Bias in Semi-Supervised Image Classification By Active Learning [2.1805442504863506]
We present three data challenges common in real-world applications: between-class imbalance, within-class imbalance, and between-class similarity.
We find that random sampling does not mitigate confirmation bias and, in some cases, leads to worse performance than supervised learning.
Our results provide insights into the potential of combining active and semi-supervised learning in the presence of common real-world challenges.
arXiv Detail & Related papers (2023-08-16T08:52:49Z)
- An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline, SimiS, which tackles data imbalance by simply supplementing the labeled data with pseudo-labels (a sketch follows below).
arXiv Detail & Related papers (2022-11-20T21:18:41Z)
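SimiS is summarized above only as "supplementing labeled data with pseudo-labels". The sketch below illustrates that bare mechanism: confident predictions on unlabeled data are promoted to labels, filling under-represented classes first. The confidence threshold and the fill-to-the-largest-class rule are assumptions, not the paper's exact recipe.

```python
import numpy as np
from collections import Counter

def supplement_with_pseudo_labels(labeled_y, unlab_probs, threshold=0.95):
    """Pick confident pseudo-labels, topping up minority classes.

    A minimal sketch: each class is filled toward the size of the
    largest labeled class using the most confident predictions on the
    unlabeled pool. Returns indices into the unlabeled pool.
    """
    counts = Counter(labeled_y)
    target = max(counts.values())
    preds = unlab_probs.argmax(axis=1)
    conf = unlab_probs.max(axis=1)
    selected = []
    for c in counts:
        need = target - counts[c]
        pool = np.where((preds == c) & (conf >= threshold))[0]
        pool = pool[np.argsort(-conf[pool])]  # most confident first
        selected.extend(pool[:need].tolist())
    return selected

# Usage with toy data: class 1 is under-represented in the labeled set.
rng = np.random.default_rng(1)
probs = rng.dirichlet([1.0, 1.0], size=200)   # stand-in model outputs
idx = supplement_with_pseudo_labels([0] * 50 + [1] * 5, probs, threshold=0.9)
print(len(idx), "pseudo-labeled samples added for class 1")
```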
- Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, where a teacher model generates hard pseudo-labels on unlabeled data as supervisory signals.
We analyze the challenges these methods face through empirical experiments.
We introduce a novel approach, Scale-Equivalent Distillation (SED), a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
- Minority Class Oriented Active Learning for Imbalanced Datasets [6.009262446889319]
We introduce a new active learning method which is designed for imbalanced datasets.
It favors samples likely to belong to minority classes so as to reduce the imbalance of the labeled subset (a sketch of such an acquisition function follows below).
We also compare two training schemes for active learning.
arXiv Detail & Related papers (2022-02-01T13:13:41Z)
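The entry above, by the same group as the main paper, describes an acquisition function that favors likely minority-class samples but does not give its form in this summary. One plausible reading is sketched below: rank unlabeled samples by the predicted probability mass they place on classes that are currently rare in the labeled subset. Names are illustrative.

```python
import numpy as np

def minority_oriented_acquisition(unlab_probs, labeled_counts, budget):
    """Select unlabeled samples that look like minority-class members.

    A minimal sketch: each sample is scored by its predicted probability
    mass on the classes rarest in the current labeled subset, and the
    top-'budget' samples are queried for annotation. An assumed form of
    the idea, not the paper's exact function.
    """
    counts = np.asarray(labeled_counts, dtype=np.float64)
    rarity = counts.sum() / np.maximum(counts, 1.0)  # rare classes score high
    rarity /= rarity.sum()
    sample_scores = unlab_probs @ rarity             # expected rarity per sample
    return np.argsort(-sample_scores)[:budget]

# Usage: class 2 is nearly absent from the labeled set, so samples whose
# predictions lean toward class 2 are queried first.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.3, 0.4, 0.3]])
print(minority_oriented_acquisition(probs, labeled_counts=[100, 80, 3], budget=2))
```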
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning are fairer in their decisions with respect to a protected class.
We also explore the interaction between algorithmic fairness methods such as gradient reversal (GRAD) and acquisition functions such as BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
- Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strengths of diverse classifiers (a sketch follows below).
Our ensemble of class-balanced experts reaches results close to the state of the art, and an extended ensemble establishes a new state of the art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
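How the experts are formed and fused is not detailed in the summary above. The sketch below assumes one simple scheme: each expert is trained on a class-balanced subsample (every class downsampled to the smallest class's size) and expert probabilities are averaged; scikit-learn components are used purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

def train_balanced_experts(X, y, n_experts=3, seed=0):
    """Train each expert on a class-balanced subsample.

    A minimal sketch of the class-balanced-experts idea: every class is
    resampled to the size of the smallest class, so each expert sees
    balanced data; diversity comes from the differing subsamples. The
    fusion rule below (plain averaging) is an assumption.
    """
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    experts = []
    for e in range(n_experts):
        Xb, yb = [], []
        for c in classes:
            Xs = resample(X[y == c], n_samples=n_min, random_state=seed + e)
            Xb.append(Xs)
            yb.append(np.full(n_min, c))
        experts.append(LogisticRegression().fit(np.vstack(Xb), np.concatenate(yb)))
    return experts

def ensemble_predict(experts, X):
    """Average the experts' class probabilities, then take the argmax."""
    return np.mean([m.predict_proba(X) for m in experts], axis=0).argmax(axis=1)

# Usage with a long-tailed toy set: 200 head vs. 10 tail samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 200 + [1] * 10)
print(ensemble_predict(train_balanced_experts(X, y), X[:5]))
```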
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.