Deep Active Learning via Open Set Recognition
- URL: http://arxiv.org/abs/2007.02196v4
- Date: Mon, 5 Apr 2021 18:47:17 GMT
- Title: Deep Active Learning via Open Set Recognition
- Authors: Jaya Krishna Mandivarapu, Blake Camp, Rolando Estrada
- Abstract summary: In many applications, data is easy to acquire but expensive and time-consuming to label.
We formulate active learning as an open-set recognition problem.
Unlike current active learning methods, our algorithm can learn tasks without the need for task labels.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications, data is easy to acquire but expensive and
time-consuming to label; prominent examples include medical imaging and NLP.
This disparity has only grown in recent years as our ability to collect data
improves. Under these constraints, it makes sense to select only the most
informative instances from the unlabeled pool and request an oracle (e.g., a
human expert) to provide labels for those samples. The goal of active learning
is to infer the informativeness of unlabeled samples so as to minimize the
number of requests to the oracle. Here, we formulate active learning as an
open-set recognition problem. In this paradigm, only some of the inputs belong
to known classes; the classifier must identify the rest as unknown. More
specifically, we leverage variational neural networks (VNNs), which produce
high-confidence (i.e., low-entropy) predictions only for inputs that closely
resemble the training data. We use the inverse of this confidence measure to
select the samples that the oracle should label. Intuitively, unlabeled samples
that the VNN is uncertain about are more informative for future training. We
carried out an extensive evaluation of our novel, probabilistic formulation of
active learning, achieving state-of-the-art results on MNIST, CIFAR-10, and
CIFAR-100. Additionally, unlike current active learning methods, our algorithm
can learn tasks without the need for task labels. As our experiments show, when
the unlabeled pool consists of a mixture of samples from multiple datasets, our
approach can automatically distinguish between samples from seen vs. unseen
tasks.
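The selection step described in the abstract (rank unlabeled samples by the inverse of prediction confidence and send the least confident ones to the oracle) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes class probabilities are already available for each pool sample (hard-coded stand-ins here rather than outputs of a VNN), uses Shannon entropy as the uncertainty score, and the names `predictive_entropy`, `select_for_labeling`, and `budget` are hypothetical.

```python
import numpy as np


def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of each row of class probabilities."""
    probs = np.asarray(probs, dtype=float)
    # eps keeps log() finite when a class probability is exactly zero.
    return -np.sum(probs * np.log(probs + eps), axis=1)


def select_for_labeling(probs, budget):
    """Return indices of the `budget` highest-entropy pool samples."""
    entropy = predictive_entropy(probs)
    # Stable sort on negated entropy: most uncertain first, ties in pool order.
    order = np.argsort(-entropy, kind="stable")
    return order[:budget]


# Stand-in predictions for four unlabeled samples over three classes
# (assumed values; a real pipeline would take these from the trained model).
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident: resembles the training data
    [0.34, 0.33, 0.33],  # near-uniform: highly uncertain
    [0.70, 0.20, 0.10],  # moderately uncertain
    [0.50, 0.50, 0.00],  # torn between two classes
])

query_indices = select_for_labeling(pool_probs, budget=2)
print(query_indices)
```

With these stand-in values, the near-uniform sample is ranked first and the confident sample last, which matches the intuition that samples the model is uncertain about are the ones worth labeling.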
Related papers
- CUAL: Continual Uncertainty-aware Active Learner [5.678185894553588]
A deployed AI agent is continuously provided with unlabeled data that may contain not only unseen samples of known classes but also samples from novel (unknown) classes.
We present a comprehensive solution to this complex problem with our model "CUAL" (Continual Uncertainty-aware Active Learner).
CUAL leverages an uncertainty estimation algorithm to prioritize active labeling of ambiguous (uncertain) predicted novel class samples while also simultaneously pseudo-labeling the most certain predictions of each class.
arXiv Detail & Related papers (2024-12-12T19:49:09Z) - Class Balance Matters to Active Class-Incremental Learning [61.11786214164405]
We aim to start from a pool of large-scale unlabeled data and then annotate the most informative samples for incremental learning.
We propose a Class-Balanced Selection (CBS) strategy to achieve both class balance and informativeness in the chosen samples.
Our CBS can be plugged into CIL methods that are based on pretrained models with prompt-tuning techniques.
arXiv Detail & Related papers (2024-12-09T16:37:27Z) - Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback--where only positive examples are observed during training--is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z) - Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling [6.771578432805963]
Active learning aims to reduce the amount of labor involved in data labeling by automating the selection of unlabeled samples.
We introduce novel techniques that significantly improve the use of abundant unlabeled data during training.
We demonstrate the superior performance of our approach over the state of the art on various image classification and segmentation benchmark datasets.
arXiv Detail & Related papers (2024-08-23T00:35:07Z) - Virtual Category Learning: A Semi-Supervised Learning Method for Dense
Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z) - MyriadAL: Active Few Shot Learning for Histopathology [10.652626309100889]
We introduce an active few-shot learning framework, Myriad Active Learning (MAL).
MAL includes a contrastive-learning encoder, pseudo-label generation, and novel query sample selection in the loop.
Experiments on two public histopathology datasets show that MAL has superior test accuracy, macro F1-score, and label efficiency compared to prior works.
arXiv Detail & Related papers (2023-10-24T20:08:15Z) - Deep Active Learning with Contrastive Learning Under Realistic Data Pool
Assumptions [2.578242050187029]
Active learning aims to identify the most informative data from an unlabeled data pool that enables a model to reach the desired accuracy rapidly.
Most existing active learning methods have been evaluated in an ideal setting where only samples relevant to the target task exist in an unlabeled data pool.
We introduce new active learning benchmarks that include ambiguous, task-irrelevant out-of-distribution as well as in-distribution samples.
arXiv Detail & Related papers (2023-03-25T10:46:10Z) - MoBYv2AL: Self-supervised Active Learning for Image Classification [57.4372176671293]
We present MoBYv2AL, a novel self-supervised active learning framework for image classification.
Our contribution lies in lifting MoBY, one of the most successful self-supervised learning algorithms, to the AL pipeline.
We achieve state-of-the-art results when compared to recent AL methods.
arXiv Detail & Related papers (2023-01-04T10:52:02Z) - Learning to Imagine: Diversify Memory for Incremental Learning using
Unlabeled Data [69.30452751012568]
We develop a learnable feature generator to diversify exemplars by adaptively generating diverse counterparts of exemplars.
We introduce semantic contrastive learning to enforce that the generated samples are semantically consistent with exemplars.
Our method does not bring any extra inference cost and outperforms state-of-the-art methods on two benchmarks.
arXiv Detail & Related papers (2022-04-19T15:15:18Z) - Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z) - PAL : Pretext-based Active Learning [2.869739951301252]
We propose an active learning technique for deep neural networks that is more robust to mislabeling than the previously proposed techniques.
We use a separate network to score the unlabeled samples for selection.
The resultant technique also produces competitive accuracy in the absence of label noise.
arXiv Detail & Related papers (2020-10-29T21:16:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.