A Simple Baseline for Low-Budget Active Learning
- URL: http://arxiv.org/abs/2110.12033v1
- Date: Fri, 22 Oct 2021 19:36:56 GMT
- Title: A Simple Baseline for Low-Budget Active Learning
- Authors: Kossar Pourahmadi, Parsa Nooralinejad, Hamed Pirsiavash
- Abstract summary: We show that a simple k-means clustering algorithm can outperform state-of-the-art active learning methods on low budgets.
This method can be used as a simple baseline for low-budget active learning on image classification.
- Score: 15.54250249254414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active learning focuses on choosing a subset of unlabeled data to be labeled.
However, most such methods assume that a large subset of the data can be
annotated. We are interested in low-budget active learning where only a small
subset (e.g., 0.2% of ImageNet) can be annotated. Instead of proposing a new
query strategy to iteratively sample batches of unlabeled data given an initial
pool, we learn rich features by an off-the-shelf self-supervised learning
method only once and then study the effectiveness of different sampling
strategies given a low budget on a variety of datasets as well as the ImageNet
dataset. We show that although the state-of-the-art active learning methods
work well given a large budget of data labeling, a simple k-means clustering
algorithm can outperform them on low budgets. We believe this method can be
used as a simple baseline for low-budget active learning on image
classification. Code is available at:
https://github.com/UCDvision/low-budget-al
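The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code (which lives at the repository above). The abstract only states that a simple k-means clustering is run on self-supervised features, so several details here are assumptions: one cluster per unit of budget, labeling the sample nearest to each centroid, and a deterministic farthest-point initialization. The function names (`kmeans`, `select_for_labeling`, and the helpers) are hypothetical, and the small 2-D vectors stand in for real self-supervised embeddings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(features, k):
    """Deterministic initialization (an assumption, chosen for reproducibility):
    start from the first sample, then repeatedly add the sample that is
    farthest from all centroids chosen so far."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features,
                  key=lambda f: min(squared_distance(f, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(features, k, iterations=20):
    """Plain Lloyd's k-means; returns the list of k centroids."""
    centroids = farthest_point_init(features, k)
    for _ in range(iterations):
        # Assignment step: attach every sample to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k),
                          key=lambda c: squared_distance(f, centroids[c]))
            clusters[nearest].append(f)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[c] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids


def select_for_labeling(features, budget):
    """Return `budget` distinct sample indices, one per k-means cluster:
    the not-yet-selected sample closest to each centroid (an assumption)."""
    if budget > len(features):
        raise ValueError("budget cannot exceed the number of samples")
    selected = []
    for centroid in kmeans(features, budget):
        candidates = [i for i in range(len(features)) if i not in selected]
        best = min(candidates,
                   key=lambda i: squared_distance(features[i], centroid))
        selected.append(best)
    return selected


# Toy usage: three well-separated groups and a budget of three labels.
toy_features = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.3),
    (10.0, 10.0), (10.2, 9.9), (9.8, 10.1),
    (-10.0, 10.0), (-10.1, 10.2), (-9.9, 9.8),
]
print(select_for_labeling(toy_features, budget=3))
```

With a tiny budget, picking one representative per cluster spreads the labels across the feature space, which is the intuition behind using clustering as a baseline. In practice a library implementation of k-means over real embeddings would replace the pure-Python loop.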
Related papers
- Zero-shot Active Learning Using Self Supervised Learning [11.28415437676582]
We propose a new Active Learning approach that is model-agnostic and does not require an iterative process.
We aim to leverage self-supervised learnt features for the task of Active Learning.
arXiv Detail & Related papers (2024-01-03T11:49:07Z)
- Deep Active Learning with Contrastive Learning Under Realistic Data Pool Assumptions [2.578242050187029]
Active learning aims to identify the most informative data from an unlabeled data pool that enables a model to reach the desired accuracy rapidly.
Most existing active learning methods have been evaluated in an ideal setting where only samples relevant to the target task exist in an unlabeled data pool.
We introduce new active learning benchmarks that include ambiguous, task-irrelevant out-of-distribution as well as in-distribution samples.
arXiv Detail & Related papers (2023-03-25T10:46:10Z)
- MoBYv2AL: Self-supervised Active Learning for Image Classification [57.4372176671293]
We present MoBYv2AL, a novel self-supervised active learning framework for image classification.
Our contribution lies in lifting MoBY, one of the most successful self-supervised learning algorithms, to the AL pipeline.
We achieve state-of-the-art results when compared to recent AL methods.
arXiv Detail & Related papers (2023-01-04T10:52:02Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
- Budget-aware Few-shot Learning via Graph Convolutional Network [56.41899553037247]
This paper tackles the problem of few-shot learning, which aims to learn new visual concepts from a few examples.
A common problem setting in few-shot classification assumes a random sampling strategy for acquiring data labels.
We introduce a new budget-aware few-shot learning problem that aims to learn novel object categories.
arXiv Detail & Related papers (2022-01-07T02:46:35Z)
- Active Learning at the ImageNet Scale [43.595076693347835]
In this work, we study a combination of active learning (AL) and self-supervised pretraining (SSP) on ImageNet.
We find that performance on small toy datasets is not representative of performance on ImageNet because the samples selected by an active learner are class-imbalanced.
We propose Balanced Selection (BASE), a simple, scalable AL algorithm that outperforms random sampling consistently.
arXiv Detail & Related papers (2021-11-25T02:48:51Z)
- Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach [81.19737119343438]
Active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label.
We propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool.
Our strategy requires high-quality latent features which we obtain by unsupervised learning on the unlabeled pool.
arXiv Detail & Related papers (2021-06-05T21:25:03Z)
- How to distribute data across tasks for meta-learning? [59.608652082495624]
We show that the optimal number of data points per task depends on the budget, but it converges to a unique constant value for large budgets.
Our results suggest a simple and efficient procedure for data collection.
arXiv Detail & Related papers (2021-03-15T15:38:47Z)
- Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
- Learning to Rank for Active Learning: A Listwise Approach [36.72443179449176]
Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications.
In this work, we rethink the structure of the loss prediction module, using a simple but effective listwise approach.
Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.
arXiv Detail & Related papers (2020-07-31T21:05:16Z)