Bidirectional Uncertainty-Based Active Learning for Open Set Annotation
- URL: http://arxiv.org/abs/2402.15198v2
- Date: Sun, 7 Jul 2024 03:48:33 GMT
- Title: Bidirectional Uncertainty-Based Active Learning for Open Set Annotation
- Authors: Chen-Chen Zong, Ye-Wen Wang, Kun-Peng Ning, Hai-Bo Ye, Sheng-Jun Huang
- Abstract summary: Active learning in open set scenarios presents a novel challenge of identifying the most valuable examples.
Traditional methods prioritize selecting informative examples with low confidence.
Recent methods favor the most probable known-class examples.
We propose a Bidirectional Uncertainty-based Active Learning framework.
- Score: 28.624321743629825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active learning (AL) in open set scenarios presents a novel challenge of identifying the most valuable examples in an unlabeled data pool that comprises data from both known and unknown classes. Traditional methods prioritize selecting informative examples with low confidence, with the risk of mistakenly selecting unknown-class examples with similarly low confidence. Recent methods favor the most probable known-class examples, with the risk of picking simple already mastered examples. In this paper, we attempt to query examples that are both likely from known classes and highly informative, and propose a Bidirectional Uncertainty-based Active Learning (BUAL) framework. Specifically, we achieve this by first pushing the unknown class examples toward regions with high-confidence predictions, i.e., the proposed Random Label Negative Learning method. Then, we propose a Bidirectional Uncertainty sampling strategy by jointly estimating uncertainty posed by both positive and negative learning to perform consistent and stable sampling. BUAL successfully extends existing uncertainty-based AL methods to complex open-set scenarios. Extensive experiments on multiple datasets with varying openness demonstrate that BUAL achieves state-of-the-art performance. The code is available at https://github.com/chenchenzong/BUAL.
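The abstract names two components: Random Label Negative Learning, which trains the model not to predict a randomly drawn complementary label for unlabeled examples, and a bidirectional uncertainty score that fuses the uncertainties of the positively and negatively trained predictors. The sketch below is only a rough illustration of those two ideas, not the authors' implementation (see the linked repository for that); the function names, the entropy-based uncertainty, and the `alpha` weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def random_label_negative_loss(logits: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Negative-learning loss with randomly drawn complementary labels.

    Each unlabeled example is assigned a random class k, and the model is
    penalized for assigning probability to k, i.e. we minimize -log(1 - p_k).
    """
    probs = F.softmax(logits, dim=1)
    rand_labels = torch.randint(0, num_classes, (logits.size(0),), device=logits.device)
    p_k = probs.gather(1, rand_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_k + 1e-8).mean()

def bidirectional_uncertainty(pos_logits: torch.Tensor,
                              neg_logits: torch.Tensor,
                              alpha: float = 0.5) -> torch.Tensor:
    """Combine uncertainty from the positively and negatively trained predictors.

    Entropy is used here as a stand-in uncertainty measure; the framework is
    described as compatible with other uncertainty-based criteria.
    """
    def entropy(logits):
        p = F.softmax(logits, dim=1)
        return -(p * torch.log(p + 1e-8)).sum(dim=1)
    return alpha * entropy(pos_logits) + (1.0 - alpha) * entropy(neg_logits)

# Example query step: pick the `budget` examples with the largest combined score.
# scores = bidirectional_uncertainty(pos_logits, neg_logits)
# query_idx = scores.topk(budget).indices
```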
Related papers
- Dirichlet-Based Coarse-to-Fine Example Selection For Open-Set Annotation [37.33424244520009]
We propose a Dirichlet-based Coarse-to-Fine Example Selection (DCFS) strategy.
Our method introduces simplex-based evidential deep learning (EDL) to break translation invariance.
Experiments on datasets with varying openness ratios demonstrate that DCFS achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-09-26T07:47:50Z)
- Entropic Open-set Active Learning [30.91182106918793]
Active Learning (AL) aims to enhance the performance of deep models by selecting the most informative samples for annotation from a pool of unlabeled data.
Despite impressive performance in closed-set settings, most AL methods fail in real-world scenarios where the unlabeled data contains unknown categories.
We propose an Entropic Open-set AL framework which leverages both known and unknown distributions effectively to select informative samples during AL rounds.
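The summary above does not spell out the exact acquisition rule, so the following is only a hedged illustration of one way to combine closed-set entropy with an unknown-class detector; `p_known` is a hypothetical detector output, not a quantity defined in the paper.

```python
import torch
import torch.nn.functional as F

def entropic_open_set_score(known_logits: torch.Tensor,
                            p_known: torch.Tensor) -> torch.Tensor:
    """Toy acquisition score: closed-set entropy, down-weighted for samples
    that a separate detector believes come from unknown classes.

    known_logits: [N, C] logits over the known classes.
    p_known:      [N] estimated probability each sample is from a known class.
    """
    p = F.softmax(known_logits, dim=1)
    entropy = -(p * torch.log(p + 1e-8)).sum(dim=1)
    return p_known * entropy  # informative AND likely known-class
```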
arXiv Detail & Related papers (2023-12-21T18:47:12Z)
- Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
- Overcoming Overconfidence for Active Learning [1.2776312584227847]
We present two novel methods to address the problem of overconfidence that arises in the active learning scenario.
The first is an augmentation strategy named Cross-Mix-and-Mix (CMaM), which aims to calibrate the model by expanding the limited training distribution.
The second is a selection strategy named Ranked Margin Sampling (RankedMS), which prevents choosing data that leads to overly confident predictions.
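The exact RankedMS criterion is not given in this summary; as a stand-in, the sketch below uses the standard top-2 probability margin and queries the smallest margins, which by construction avoids spending the annotation budget on overconfident predictions.

```python
import torch
import torch.nn.functional as F

def margin_scores(logits: torch.Tensor) -> torch.Tensor:
    """Margin between the top-2 class probabilities; a small margin means the
    model is torn between two classes rather than overconfident in one."""
    probs = F.softmax(logits, dim=1)
    top2 = probs.topk(2, dim=1).values
    return top2[:, 0] - top2[:, 1]

def select_least_confident(logits: torch.Tensor, budget: int) -> torch.Tensor:
    """Rank unlabeled samples by margin and query the smallest ones."""
    return margin_scores(logits).argsort()[:budget]
```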
arXiv Detail & Related papers (2023-08-21T09:04:54Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
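As a toy illustration of that selection principle (not the paper's algorithm), one can score candidate demonstrations by similarity to the test example minus a penalty on model uncertainty; the embeddings, the uncertainty estimates, and the weight `lam` are all assumed to be supplied.

```python
import torch
import torch.nn.functional as F

def demo_scores(cand_embs: torch.Tensor,        # [N, d] candidate demo embeddings
                test_emb: torch.Tensor,         # [d]    test example embedding
                cand_uncertainty: torch.Tensor, # [N]    model uncertainty per demo
                lam: float = 1.0) -> torch.Tensor:
    """Prefer demonstrations similar to the test input and with low uncertainty."""
    similarity = F.cosine_similarity(cand_embs, test_emb.unsqueeze(0), dim=1)
    return similarity - lam * cand_uncertainty

# top-k demonstrations: demo_scores(...).topk(k).indices
```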
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data [36.54640096189285]
Bootstrap learning is motivated by the intuition that the higher the uncertainty of an instance, the more likely the model's confidence is inconsistent with the ground truth.
We first explore instance-level data uncertainty to create an initial set of high-confidence examples.
During bootstrap learning, we propose self-ensembling as a regularizer to alleviate inter-model uncertainty produced by noisy labels.
arXiv Detail & Related papers (2023-05-05T20:06:11Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
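For context, a common EDL outlier score treats the network outputs as evidence parameterizing a Dirichlet distribution and uses its vacuity as uncertainty; the sketch below shows only that standard quantity, not this paper's adaptive negative optimization.

```python
import torch
import torch.nn.functional as F

def dirichlet_uncertainty(logits: torch.Tensor) -> torch.Tensor:
    """Standard EDL-style uncertainty: evidence e = softplus(logits),
    Dirichlet parameters alpha = e + 1, and vacuity u = K / sum(alpha).
    Higher u suggests the sample is a likely outlier (unknown class)."""
    evidence = F.softplus(logits)   # non-negative evidence per class
    alpha = evidence + 1.0          # Dirichlet concentration parameters
    strength = alpha.sum(dim=1)     # Dirichlet strength S
    num_classes = logits.size(1)
    return num_classes / strength   # vacuity in (0, 1]
```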
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data while encouraging them to disagree on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose Dash, a semi-supervised learning (SSL) approach that selects which unlabeled examples to train on using a dynamically adjusted threshold.
Dash is adaptive in its selection of unlabeled data during training.
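A minimal sketch of the dynamic-thresholding idea, assuming a per-example loss on pseudo-labeled data and an exponentially decaying threshold; the decay schedule here is an assumption, not the paper's exact rule.

```python
import torch

def dash_style_mask(unlabeled_losses: torch.Tensor,
                    step: int,
                    init_threshold: float = 2.0,
                    decay: float = 0.99,
                    min_threshold: float = 0.05) -> torch.Tensor:
    """Select unlabeled examples whose loss is below a decaying threshold.

    Early in training the threshold is loose (most examples are used); later
    it tightens so only confidently-fit examples contribute to the SSL loss.
    """
    threshold = max(init_threshold * (decay ** step), min_threshold)
    return unlabeled_losses < threshold  # boolean mask over the batch
```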
arXiv Detail & Related papers (2021-09-01T23:52:29Z)