Related papers: Active Learning under Label Shift

Active Learning under Label Shift

URL: http://arxiv.org/abs/2007.08479v3
Date: Thu, 25 Feb 2021 20:38:03 GMT
Title: Active Learning under Label Shift
Authors: Eric Zhao, Anqi Liu, Animashree Anandkumar, Yisong Yue
Abstract summary: We introduce a "medial distribution" to incorporate a tradeoff between importance and class-balanced sampling. We prove sample complexity and generalization guarantees for Mediated Active Learning under Label Shift (MALLS) We empirically demonstrate MALLS scales to high-dimensional datasets and can reduce the sample complexity of active learning by 60% in deep active learning tasks.
Score: 80.65643075952639
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We address the problem of active learning under label shift: when the class proportions of source and target domains differ. We introduce a "medial distribution" to incorporate a tradeoff between importance weighting and class-balanced sampling and propose their combined usage in active learning. Our method is known as Mediated Active Learning under Label Shift (MALLS). It balances the bias from class-balanced sampling and the variance from importance weighting. We prove sample complexity and generalization guarantees for MALLS which show active learning reduces asymptotic sample complexity even under arbitrary label shift. We empirically demonstrate MALLS scales to high-dimensional datasets and can reduce the sample complexity of active learning by 60% in deep active learning tasks.

Related papers

Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning [51.177789437682954]
Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones. Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown. We propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration.
arXiv Detail & Related papers (2025-02-11T13:57:30Z)
Class Balance Matters to Active Class-Incremental Learning [61.11786214164405]
We aim to start from a pool of large-scale unlabeled data and then annotate the most informative samples for incremental learning. We propose Class-Balanced Selection (CBS) strategy to achieve both class balance and informativeness in chosen samples. Our CBS can be plugged and played into those CIL methods which are based on pretrained models with prompts tunning technique.
arXiv Detail & Related papers (2024-12-09T16:37:27Z)
Multi-Label Bayesian Active Learning with Inter-Label Relationships [3.88369051454137]
We propose a new multi-label active learning strategy to address both challenges. Our method incorporates progressively updated positive and negative correlation matrices to capture co-occurrence and disjoint relationships. Our strategy consistently achieves more reliable and superior performance, compared to several established methods.
arXiv Detail & Related papers (2024-11-26T23:28:54Z)
CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning [40.5117833362268]
Semi-supervised learning and contrastive learning have been progressively combined to achieve better performances in popular applications. One common manner is assigning pseudo-labels to unlabeled samples and selecting positive and negative samples from pseudo-labeled samples to apply contrastive learning. We propose Contrastive Learning with Augmented Features (CLAF) to alleviate the scarcity of minority class samples in contrastive learning.
arXiv Detail & Related papers (2023-12-15T08:27:52Z)
DIRECT: Deep Active Learning under Imbalance and Label Noise [15.571923343398657]
We conduct the first study of active learning under both class imbalance and label noise. We propose a novel algorithm that robustly identifies the class separation threshold and annotates the most uncertain examples. Our results demonstrate that DIRECT can save more than 60% of the annotation budget compared to state-of-art active learning algorithms.
arXiv Detail & Related papers (2023-12-14T18:18:34Z)
Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction. A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation. Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label. Current methods are effective in reducing the burden of data labeling but are heavily model-reliant. This has led to the inability of sampled data to be transferred to new models as well as issues with sampling bias. We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z)
One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label. A novel method named proposed, i.e., Single-positive MultI-label learning with Label Enhancement, is proposed. Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z)
CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance. Sample re-weighting methods are popularly used to alleviate this data bias issue. We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
An analysis of over-sampling labeled data in semi-supervised learning with FixMatch [66.34968300128631]
Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches. This paper studies whether this common practice improves learning and how. We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not.
arXiv Detail & Related papers (2022-01-03T12:22:26Z)
Reducing Label Effort: Self-Supervised meets Active Learning [32.4747118398236]
Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach to the point where almost half of the dataset is labeled.
arXiv Detail & Related papers (2021-08-25T20:04:44Z)
Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels. We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.