Active Learning under Label Shift
- URL: http://arxiv.org/abs/2007.08479v3
- Date: Thu, 25 Feb 2021 20:38:03 GMT
- Title: Active Learning under Label Shift
- Authors: Eric Zhao, Anqi Liu, Animashree Anandkumar, Yisong Yue
- Abstract summary: We introduce a "medial distribution" to incorporate a tradeoff between importance and class-balanced sampling.
We prove sample complexity and generalization guarantees for Mediated Active Learning under Label Shift (MALLS).
We empirically demonstrate MALLS scales to high-dimensional datasets and can reduce the sample complexity of active learning by 60% in deep active learning tasks.
- Score: 80.65643075952639
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of active learning under label shift: when the class
proportions of source and target domains differ. We introduce a "medial
distribution" to incorporate a tradeoff between importance weighting and
class-balanced sampling and propose their combined usage in active learning.
Our method is known as Mediated Active Learning under Label Shift (MALLS). It
balances the bias from class-balanced sampling and the variance from importance
weighting. We prove sample complexity and generalization guarantees for MALLS
which show active learning reduces asymptotic sample complexity even under
arbitrary label shift. We empirically demonstrate MALLS scales to
high-dimensional datasets and can reduce the sample complexity of active
learning by 60% in deep active learning tasks.
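The tradeoff described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes the source and target label marginals are known, and it defines a hypothetical medial distribution as a linear interpolation between the source marginal and the uniform (class-balanced) distribution. The parameter `alpha`, the function names, and the example marginals are all assumptions made for illustration. Sampling moves the labeled pool from the source toward the medial distribution, and importance weights correct the remaining gap to the target.

```python
import numpy as np


def medial_distribution(source, alpha):
    """Hypothetical medial distribution (an assumption, not the paper's definition).

    Interpolates linearly between the source label marginal (alpha=0)
    and the uniform, class-balanced distribution (alpha=1).
    """
    source = np.asarray(source, dtype=float)
    uniform = np.full_like(source, 1.0 / source.size)
    return (1.0 - alpha) * source + alpha * uniform


def importance_weights(target, sampling):
    """Per-class importance weights w(y) = p_target(y) / p_sampling(y)."""
    return np.asarray(target, dtype=float) / np.asarray(sampling, dtype=float)


# Illustrative label marginals over three classes (made-up numbers).
source = np.array([0.90, 0.05, 0.05])
target = np.array([0.20, 0.40, 0.40])

for alpha in (0.0, 0.5, 1.0):
    medial = medial_distribution(source, alpha)
    weights = importance_weights(target, medial)
    print(f"alpha={alpha}: medial={medial.round(3)}, weights={weights.round(3)}")
```

With these example marginals, the largest importance weight falls from 8 at `alpha=0` (pure importance weighting from the source) to 1.2 at `alpha=1` (fully class-balanced sampling). Smaller weights mean lower variance in the weighted estimate, at the cost of relying more on the sampling step.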
Related papers
- CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning [40.5117833362268]
Semi-supervised learning and contrastive learning have been progressively combined to achieve better performances in popular applications.
One common manner is assigning pseudo-labels to unlabeled samples and selecting positive and negative samples from pseudo-labeled samples to apply contrastive learning.
We propose Contrastive Learning with Augmented Features (CLAF) to alleviate the scarcity of minority class samples in contrastive learning.
arXiv Detail & Related papers (2023-12-15T08:27:52Z)
- DIRECT: Deep Active Learning under Imbalance and Label Noise [15.571923343398657]
We conduct the first study of active learning under both class imbalance and label noise.
We propose a novel algorithm that robustly identifies the class separation threshold and annotates the most uncertain examples.
Our results demonstrate that DIRECT can save more than 60% of the annotation budget compared to state-of-the-art active learning algorithms.
arXiv Detail & Related papers (2023-12-14T18:18:34Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Active Learning with Combinatorial Coverage [0.0]
Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
This has led to the inability of sampled data to be transferred to new models as well as issues with sampling bias.
We propose active learning methods utilizing coverage to overcome these issues.
arXiv Detail & Related papers (2023-02-28T13:43:23Z)
- One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label.
A novel method named SMILE, i.e., Single-positive MultI-label learning with Label Enhancement, is proposed.
Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- An analysis of over-sampling labeled data in semi-supervised learning with FixMatch [66.34968300128631]
Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches.
This paper studies whether this common practice improves learning and how.
We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not.
arXiv Detail & Related papers (2022-01-03T12:22:26Z)
- Reducing Label Effort: Self-Supervised meets Active Learning [32.4747118398236]
Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets.
Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort.
The performance gap between active learning trained either with self-training or from scratch diminishes as we approach the point where almost half of the dataset is labeled.
arXiv Detail & Related papers (2021-08-25T20:04:44Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.