Batch Active Learning at Scale
- URL: http://arxiv.org/abs/2107.14263v1
- Date: Thu, 29 Jul 2021 18:14:05 GMT
- Title: Batch Active Learning at Scale
- Authors: Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand
Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar
- Abstract summary: Batch active learning, which adaptively issues batched queries to a labeling oracle, is a common approach for addressing this problem.
In this work, we analyze an efficient active learning algorithm, which focuses on the large batch setting.
We show that our sampling method, which combines notions of uncertainty and diversity, easily scales to batch sizes (100K-1M) several orders of magnitude larger than those used in previous studies.
- Score: 39.26441165274027
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The ability to train complex and highly effective models often requires an
abundance of training data, which can easily become a bottleneck in cost, time,
and computational resources. Batch active learning, which adaptively issues
batched queries to a labeling oracle, is a common approach for addressing this
problem. The practical benefits of batch sampling come with the downside of
less adaptivity and the risk of sampling redundant examples within a batch -- a
risk that grows with the batch size. In this work, we analyze an efficient
active learning algorithm, which focuses on the large batch setting. In
particular, we show that our sampling method, which combines notions of
uncertainty and diversity, easily scales to batch sizes (100K-1M) several
orders of magnitude larger than those used in previous studies and provides
significant improvements in model training efficiency compared to recent
baselines. Finally, we provide an initial theoretical analysis, proving label
complexity guarantees for a related sampling method, which we show is
approximately equivalent to our sampling method in specific settings.
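As an illustration of the uncertainty-plus-diversity idea described in the abstract, below is a minimal sketch of a batch selector that keeps the lowest-margin (most uncertain) candidates and then spreads the batch across clusters of their embeddings. The function name, the use of scikit-learn's AgglomerativeClustering, and the 10x candidate pool are assumptions for illustration only; this is not the paper's exact algorithm.

```python
# Minimal sketch (not the paper's exact algorithm): combine margin-based
# uncertainty with clustering-based diversity when selecting a large batch.
# select_batch, the 10x candidate pool, and the clustering choice are assumptions.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def select_batch(margins, embeddings, batch_size, candidate_factor=10):
    """margins: (N,) scores, smaller = more uncertain; embeddings: (N, d).
    Returns indices of a batch mixing uncertain and diverse examples."""
    n = len(margins)
    # 1) Uncertainty: keep the most uncertain candidates (smallest margins).
    k = min(n, candidate_factor * batch_size)
    candidates = np.argsort(margins)[:k]
    # 2) Diversity: cluster the candidates' embeddings.
    n_clusters = min(batch_size, len(candidates))
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(
        embeddings[candidates])
    # 3) Round-robin over clusters (smallest first) until the batch is full,
    #    which avoids filling the batch with near-duplicate uncertain points.
    clusters = sorted(
        (list(candidates[labels == c]) for c in range(n_clusters)), key=len)
    batch, i = [], 0
    while len(batch) < batch_size and any(clusters):
        cluster = clusters[i % len(clusters)]
        if cluster:
            batch.append(int(cluster.pop()))
        i += 1
    return batch
```

For a softmax classifier, one natural choice of margins is the gap between the top two predicted class probabilities for each unlabeled example.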
Related papers
- On Speeding Up Language Model Evaluation [48.51924035873411]
Development of prompt-based methods with Large Language Models (LLMs) requires making numerous decisions.
We propose a novel method to address this challenge.
We show that it can identify the top-performing method using only 5-15% of the resources typically required.
arXiv Detail & Related papers (2024-07-08T17:48:42Z) - BatchGFN: Generative Flow Networks for Batch Active Learning [80.73649229919454]
BatchGFN is a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
We show that our approach enables principled sampling of near-optimal-utility batches at inference time, with a single forward pass per point in the batch, on toy regression problems.
arXiv Detail & Related papers (2023-06-26T20:41:36Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Scalable Batch Acquisition for Deep Bayesian Active Learning [70.68403899432198]
In deep active learning, it is important to choose multiple examples to annotate at each step.
Existing solutions to this problem, such as BatchBALD, have significant limitations in selecting a large number of examples.
We present the Large BatchBALD algorithm, which aims to achieve comparable quality while being more computationally efficient.
arXiv Detail & Related papers (2023-01-13T11:45:17Z) - Achieving Minimax Rates in Pool-Based Batch Active Learning [26.12124106759262]
We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle.
In this paper, we propose a solution that carefully trades off the informativeness of the queried points against their diversity.
arXiv Detail & Related papers (2022-02-11T04:55:45Z) - One Backward from Ten Forward, Subsampling for Large-Scale Deep Learning [35.0157090322113]
Large-scale machine learning systems are often continuously trained with enormous data from production environments.
The sheer volume of streaming data poses a significant challenge to real-time training subsystems and ad-hoc sampling is the standard practice.
We propose recording a constant amount of information per instance from the forward passes performed during training. This extra information measurably improves the selection of which data instances should participate in forward and backward passes.
arXiv Detail & Related papers (2021-04-27T11:29:02Z) - Active Learning for Sequence Tagging with Deep Pre-trained Models and
Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing, combined with active learning, open the possibility of significantly reducing the required annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
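The last entry above relies on Monte Carlo dropout for uncertainty estimation; below is a minimal sketch of a BALD-style acquisition score computed with dropout kept active at inference. The function name, the number of stochastic passes, and the scoring rule are assumptions for illustration, not the cited paper's exact setup.

```python
# Illustrative sketch only: Monte Carlo dropout uncertainty for active learning
# acquisition. The number of passes and BALD-style score are assumptions.
import torch

def mc_dropout_uncertainty(model, x, n_passes=10):
    """model: a torch.nn.Module containing dropout layers; x: a batch of inputs.
    Returns a per-example mutual-information (BALD-style) score."""
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_passes)])  # (T, B, C)
    mean = probs.mean(dim=0)                                             # (B, C)
    entropy_of_mean = -(mean * mean.clamp_min(1e-12).log()).sum(-1)
    mean_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    return entropy_of_mean - mean_entropy  # higher = more informative to label
```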