Making Your First Choice: To Address Cold Start Problem in Vision Active
Learning
- URL: http://arxiv.org/abs/2210.02442v1
- Date: Wed, 5 Oct 2022 17:59:50 GMT
- Title: Making Your First Choice: To Address Cold Start Problem in Vision Active
Learning
- Authors: Liangyu Chen, Yutong Bai, Siyu Huang, Yongyi Lu, Bihan Wen, Alan L.
Yuille, Zongwei Zhou
- Abstract summary: Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first.
We identify this as the cold start problem in vision active learning, caused by a biased and outlier initial query.
This paper seeks to address the cold start problem by exploiting the three advantages of contrastive learning.
- Score: 90.24315238412407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning promises to improve annotation efficiency by iteratively
selecting the most important data to be annotated first. However, we uncover a
striking contradiction to this promise: active learning fails to select data as
efficiently as random selection at the first few choices. We identify this as
the cold start problem in vision active learning, caused by a biased and
outlier initial query. This paper seeks to address the cold start problem by
exploiting the three advantages of contrastive learning: (1) no annotation is
required; (2) label diversity is ensured by pseudo-labels to mitigate bias; (3)
typical data is determined by contrastive features to reduce outliers.
Experiments are conducted on CIFAR-10-LT and three medical imaging datasets
(i.e. Colon Pathology, Abdominal CT, and Blood Cell Microscope). Our initial
query not only significantly outperforms existing active querying strategies
but also surpasses random selection by a large margin. We foresee our solution
to the cold start problem as a simple yet strong baseline to choose the initial
query for vision active learning. Code is available:
https://github.com/c-liangyu/CSVAL
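
As a rough illustration of the three ingredients above, here is a minimal sketch of a contrastive cold-start query. It assumes contrastive embeddings are already computed (e.g., by SimCLR or MoCo) and uses k-means pseudo-labels for diversity and centroid distance for typicality; the paper's exact criteria may differ, so refer to the linked repository for the authors' implementation.

```python
# A minimal sketch of a contrastive cold-start initial query, assuming
# precomputed contrastive embeddings. The paper's exact typicality and
# diversity criteria may differ; see the linked repo for the real method.
import numpy as np
from sklearn.cluster import KMeans

def cold_start_query(embeddings: np.ndarray, budget: int, n_clusters: int = 10):
    """Select an initial query that is label-diverse and typical.

    embeddings: (N, D) contrastive features of the unlabeled pool.
    budget:     number of samples to annotate first.
    """
    # (1) Pseudo-labels from clustering stand in for the unknown class
    #     labels, so the query covers all modes of the data (mitigates bias).
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    per_cluster = budget // n_clusters
    selected = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        # (2) Typicality: prefer points nearest the cluster centroid,
        #     which keeps outliers out of the initial query.
        d = np.linalg.norm(embeddings[idx] - km.cluster_centers_[c], axis=1)
        selected.extend(idx[np.argsort(d)[:per_cluster]].tolist())
    return selected

# Toy usage with random features standing in for contrastive embeddings.
feats = np.random.randn(1000, 128).astype(np.float32)
print(len(cold_start_query(feats, budget=100)))
```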
Related papers
- IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models [66.32043210237768]
This paper introduces an influence-driven selective annotation method.
It aims to minimize annotation costs while improving the quality of in-context examples.
Experiments confirm the superiority of the proposed method on various benchmarks.
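
The summary above gives no algorithmic detail, so the following is only a loose, hypothetical illustration of influence-style subset selection over an embedding similarity graph, not IDEAL's actual method: pick examples greedily so the chosen set "covers" as much of the pool as possible.

```python
# A loose, hedged illustration of influence-style selective annotation
# (NOT IDEAL's actual algorithm): greedy coverage on a similarity graph.
import numpy as np

def greedy_influence_select(emb: np.ndarray, budget: int, sim_threshold: float = 0.7):
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # cosine geometry
    sim = emb @ emb.T
    covered = np.zeros(len(emb), dtype=bool)
    chosen = []
    for _ in range(budget):
        # Influence proxy: how many still-uncovered examples each candidate
        # is similar to (i.e., could serve as an in-context example for).
        gain = ((sim >= sim_threshold) & ~covered[None, :]).sum(axis=1)
        gain[chosen] = -1  # never re-pick an already chosen example
        best = int(np.argmax(gain))
        chosen.append(best)
        covered |= sim[best] >= sim_threshold
    return chosen

emb = np.random.randn(500, 64)
print(greedy_influence_select(emb, budget=10)[:5])
```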
arXiv Detail & Related papers (2023-10-16T22:53:54Z)
- BaSAL: Size-Balanced Warm Start Active Learning for LiDAR Semantic Segmentation [2.9290232815049926]
Existing active learning methods overlook the severe class imbalance inherent in LiDAR semantic segmentation datasets.
We propose BaSAL, a size-balanced warm start active learning model, based on the observation that each object class has a characteristic size.
Results show that we are able to improve the performance of the initial model by a large margin.
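
A loose sketch of the size-balancing observation follows: without labels, object size acts as a proxy for class, so sampling evenly across size bins approximates a class-balanced initial pool. BaSAL's actual partitioning and weighting may differ; the size measure here (e.g., bounding-box volume) is an assumption.

```python
# A loose sketch of size-balanced sampling: object size stands in for the
# unknown class label. BaSAL's actual partitioning/weighting may differ.
import numpy as np

def size_balanced_sample(object_sizes: np.ndarray, budget: int, n_bins: int = 8):
    """object_sizes: per-sample characteristic size (e.g., bounding-box volume)."""
    # Quantile bins so every bin holds roughly the same share of the pool.
    edges = np.quantile(object_sizes, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, object_sizes) - 1, 0, n_bins - 1)
    per_bin = budget // n_bins
    rng = np.random.default_rng(0)
    picks = []
    for b in range(n_bins):
        idx = np.where(bins == b)[0]
        picks.extend(rng.choice(idx, size=min(per_bin, len(idx)), replace=False))
    return picks

sizes = np.random.lognormal(mean=1.0, sigma=0.8, size=2000)
print(len(size_balanced_sample(sizes, budget=160)))
```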
arXiv Detail & Related papers (2023-10-12T05:03:19Z)
- DiffusAL: Coupling Active Learning with Graph Diffusion for Label-Efficient Node Classification [1.0602247913671219]
We introduce a novel active graph learning approach called DiffusAL, showing significant robustness in diverse settings.
Most of our calculations for acquisition and training can be pre-processed, making DiffusAL more efficient compared to approaches combining diverse selection criteria.
Our experiments on various benchmark datasets show that, unlike previous methods, our approach significantly outperforms random selection on every dataset and labeling budget tested.
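
A hedged sketch of the "precompute once" idea follows: an approximate Personalized PageRank diffusion over the graph is computed up front, and node scores derived from it can be reused across acquisition rounds. DiffusAL's actual scoring combines several criteria; only the precomputation pattern is shown.

```python
# Hedged sketch: precompute a (dense, toy-scale) Personalized PageRank
# diffusion once, then derive reusable node importance scores from it.
# DiffusAL's real acquisition combines more criteria than shown here.
import numpy as np

def diffusion_matrix(adj: np.ndarray, alpha: float = 0.15, iters: int = 50):
    deg = adj.sum(1, keepdims=True).clip(min=1)
    P = adj / deg                      # row-stochastic transition matrix
    D = np.eye(len(adj))
    for _ in range(iters):             # power iteration for PPR
        D = alpha * np.eye(len(adj)) + (1 - alpha) * D @ P
    return D

def importance_scores(adj: np.ndarray):
    D = diffusion_matrix(adj)          # precomputed once, reused every round
    return D.sum(axis=0)               # diffusion mass a node receives

adj = (np.random.rand(100, 100) < 0.05).astype(float)
adj = np.maximum(adj, adj.T)           # undirected toy graph
print(np.argsort(-importance_scores(adj))[:10])
```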
arXiv Detail & Related papers (2023-07-31T20:30:13Z)
- COLosSAL: A Benchmark for Cold-start Active Learning for 3D Medical Image Segmentation [10.80144764655265]
Active learning (AL) is a promising solution for efficient annotation but requires an initial set of labeled samples to start active selection.
This is also known as cold-start AL, which permits only one chance to request annotations from experts without access to previously annotated data.
We present a benchmark named COLosSAL by evaluating six cold-start AL strategies on five 3D medical segmentation tasks from the public Medical Segmentation Decathlon collection.
arXiv Detail & Related papers (2023-07-22T07:19:15Z)
- Improving Selective Visual Question Answering by Learning from Your Peers [74.20167944693424]
Visual Question Answering (VQA) models can have difficulties abstaining from answering when they are wrong.
We propose the Learning from Your Peers (LYP) approach for training multimodal selection functions that make abstention decisions.
Our approach uses predictions from models trained on distinct subsets of the training data as targets for optimizing a Selective VQA model.
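
A minimal sketch of that target construction follows: each example's abstention target comes from a "peer" model that never saw it during training (an out-of-fold prediction). The models and features below are placeholders, not the paper's VQA architecture.

```python
# Minimal sketch of out-of-fold peer targets for a selection function.
# Logistic regression and random features are placeholders, not the
# paper's multimodal VQA models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X = np.random.randn(600, 20)
y = (X[:, 0] + 0.5 * np.random.randn(600) > 0).astype(int)

peer_correct = np.zeros(len(X))
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    peer = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    peer_correct[te] = (peer.predict(X[te]) == y[te])  # out-of-fold targets

# The selection function learns to predict whether the task model will be
# right, and abstains when that probability is low.
selector = LogisticRegression(max_iter=1000).fit(X, peer_correct)
abstain = selector.predict_proba(X)[:, 1] < 0.5
print("abstention rate:", abstain.mean())
```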
arXiv Detail & Related papers (2023-06-14T21:22:01Z)
- Warm Start Active Learning with Proxy Labels & Selection via Semi-Supervised Fine-Tuning [10.086685855244664]
We propose two novel strategies for active learning (AL) specifically for 3D image segmentation.
First, we tackle the cold start problem by proposing a proxy task and then utilizing uncertainty generated from the proxy task to rank the unlabeled data to be annotated.
Second, we craft a two-stage learning framework for each active iteration where the unlabeled data is also used in the second stage as a semi-supervised fine-tuning strategy.
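
A minimal sketch of the rank-by-proxy-uncertainty step follows, assuming a proxy model that outputs class logits; the paper's actual proxy task for 3D segmentation is more involved.

```python
# Minimal sketch: rank the unlabeled pool by predictive entropy of a
# proxy-task head. The proxy task itself is abstracted away here.
import numpy as np

def entropy_rank(proxy_logits: np.ndarray):
    """proxy_logits: (N, C) logits from the self-supervised proxy task head."""
    z = proxy_logits - proxy_logits.max(axis=1, keepdims=True)  # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    h = -(p * np.log(p + 1e-12)).sum(axis=1)   # predictive entropy
    return np.argsort(-h)                       # most uncertain first

logits = np.random.randn(1000, 4)
to_annotate = entropy_rank(logits)[:50]        # top-50 query for annotation
print(to_annotate[:10])
```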
arXiv Detail & Related papers (2022-09-13T20:21:40Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
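
A hedged sketch of a local-sensitivity score follows: examples whose predictions change most under a small input perturbation are queried first. The paper also folds in a hardness term; only the sensitivity part is shown, and the Gaussian noise here stands in for whatever perturbation the paper actually uses.

```python
# Hedged sketch of a local-sensitivity acquisition score: KL divergence
# between predictions on an input and a perturbed copy of it.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def local_sensitivity(model, X: np.ndarray, noise: float = 0.1, seed: int = 0):
    rng = np.random.default_rng(seed)
    p = softmax(model(X))                                          # p(y | x)
    q = softmax(model(X + noise * rng.standard_normal(X.shape)))   # p(y | x')
    return (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=1)  # KL(p||q)

W = np.random.randn(20, 3)
model = lambda X: X @ W                          # toy linear "model"
X = np.random.randn(500, 20)
query = np.argsort(-local_sensitivity(model, X))[:25]
print(query[:10])
```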
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- Using Self-Supervised Pretext Tasks for Active Learning [7.214674613451605]
We propose a novel active learning approach that utilizes self-supervised pretext tasks and a unique data sampler to select data that are both difficult and representative.
The pretext task learner is trained on the unlabeled set, and the unlabeled data are sorted and grouped into batches by their pretext task losses.
In each iteration, the main task model is used to sample the most uncertain data in a batch to be annotated.
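
A minimal sketch of the sampler just described follows: sort the pool by pretext-task loss, split it into difficulty batches, then let the main model pick its most uncertain examples within one batch per iteration. The losses and uncertainties below are random stand-ins for real model outputs.

```python
# Minimal sketch of the described sampler: difficulty batches from pretext
# losses, then uncertainty-based selection by the main task model.
import numpy as np

def pretext_batched_query(pretext_loss, main_uncertainty, n_batches=10, k=10, it=0):
    order = np.argsort(-pretext_loss)                 # hardest first
    batches = np.array_split(order, n_batches)        # difficulty-grouped pool
    batch = batches[it % n_batches]                   # batch for this iteration
    # Within the batch, the *main task* model chooses what to annotate.
    return batch[np.argsort(-main_uncertainty[batch])[:k]]

pretext_loss = np.random.rand(1000)       # e.g., rotation-prediction loss
main_uncertainty = np.random.rand(1000)   # e.g., predictive entropy
print(pretext_batched_query(pretext_loss, main_uncertainty, it=0))
```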
arXiv Detail & Related papers (2022-01-19T07:58:06Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- Confident Coreset for Active Learning in Medical Image Analysis [57.436224561482966]
We propose a novel active learning method, confident coreset, which considers both uncertainty and distribution for effectively selecting informative samples.
By comparative experiments on two medical image analysis tasks, we show that our method outperforms other active learning methods.
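
A hedged sketch of combining the two criteria follows: a greedy k-center (coreset) selection in which each candidate's coverage gain is weighted by its uncertainty. The paper's exact "confident coreset" weighting may differ.

```python
# Hedged sketch: greedy k-center selection weighted by uncertainty, so the
# query is both spread out over the pool and informative.
import numpy as np

def confident_coreset(emb: np.ndarray, uncertainty: np.ndarray, budget: int):
    n = len(emb)
    min_dist = np.full(n, np.inf)
    chosen = [int(np.argmax(uncertainty))]           # seed with most uncertain
    for _ in range(budget - 1):
        d = np.linalg.norm(emb - emb[chosen[-1]], axis=1)
        min_dist = np.minimum(min_dist, d)           # distance to chosen set
        score = min_dist * uncertainty               # far from set AND uncertain
        score[chosen] = -np.inf
        chosen.append(int(np.argmax(score)))
    return chosen

emb = np.random.randn(800, 32)
unc = np.random.rand(800)
print(confident_coreset(emb, unc, budget=20)[:10])
```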
arXiv Detail & Related papers (2020-04-05T13:46:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.