Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach
- URL: http://arxiv.org/abs/2306.05843v2
- Date: Wed, 21 Feb 2024 22:07:52 GMT
- Title: Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach
- Authors: Masaki Adachi, Satoshi Hayakawa, Martin Jørgensen, Xingchen Wan, Vu
Nguyen, Harald Oberhauser, Michael A. Osborne
- Abstract summary: Active learning parallelization is widely used, but typically relies on fixing the batch size throughout experimentation.
This fixed approach is inefficient because of a dynamic trade-off between cost and speed.
We propose a novel Probabilistic Numerics framework that adaptively changes batch sizes.
- Score: 28.815294991377645
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Active learning parallelization is widely used, but typically relies on
fixing the batch size throughout experimentation. This fixed approach is
inefficient because of a dynamic trade-off between cost and speed -- larger
batches are more costly, smaller batches lead to slower wall-clock run-times --
and the trade-off may change over the run (larger batches are often preferable
earlier). To address this trade-off, we propose a novel Probabilistic Numerics
framework that adaptively changes batch sizes. By framing batch selection as a
quadrature task, our integration-error-aware algorithm facilitates the
automatic tuning of batch sizes to meet predefined quadrature precision
objectives, akin to how typical optimizers terminate based on convergence
thresholds. This approach obviates the necessity for exhaustive searches across
all potential batch sizes. We also extend this to scenarios with constrained
active learning and constrained optimization, interpreting constraint
violations as reductions in the precision requirement, to subsequently adapt
batch construction. Through extensive experiments, we demonstrate that our
approach significantly enhances learning efficiency and flexibility in diverse
Bayesian batch active learning and Bayesian optimization applications.
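Illustrative sketch (not the authors' implementation): the control loop described in the abstract can be pictured as greedily growing a batch until an estimated quadrature error meets a predefined precision objective, so the batch size is a by-product of the precision requirement rather than a fixed hyperparameter. The greedy selection rule, the Gaussian-process integral-variance proxy for the quadrature error, and all names below are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def integral_variance(pool, batch_idx, noise=1e-6):
    """Toy proxy for the quadrature error: the posterior variance of the
    integral of a GP over the candidate pool (uniform measure), conditioned
    on the points already in the batch. Smaller means the batch 'covers'
    the integrand better."""
    K_pp = rbf_kernel(pool, pool)
    prior = K_pp.mean()                      # variance of the integral under the prior
    if not batch_idx:
        return prior
    X = pool[batch_idx]
    K_xx = rbf_kernel(X, X) + noise * np.eye(len(batch_idx))
    k_bar = rbf_kernel(pool, X).mean(axis=0) # kernel mean embedding at batch points
    reduction = k_bar @ np.linalg.solve(K_xx, k_bar)
    return prior - reduction

def adaptive_batch(pool, precision=1e-2, max_batch=16):
    """Grow the batch greedily until the estimated error meets `precision`;
    the final batch size is therefore chosen adaptively, not fixed."""
    batch, remaining = [], list(range(len(pool)))
    err = integral_variance(pool, batch)
    while remaining and len(batch) < max_batch and err > precision:
        errs = [integral_variance(pool, batch + [j]) for j in remaining]
        best = int(np.argmin(errs))
        batch.append(remaining.pop(best))
        err = errs[best]
    return batch, err

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool = rng.uniform(-3, 3, size=(64, 2))  # unlabelled candidate pool
    batch, err = adaptive_batch(pool)
    print(f"selected batch size: {len(batch)}, estimated error: {err:.2e}")
```

The sketch only conveys the precision-based stopping logic; the paper's actual batch construction and quadrature error estimate differ.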
Related papers
- Unraveling Batch Normalization for Realistic Test-Time Adaptation [22.126177142716188]
This paper delves into the problem of mini-batch degradation.
By unraveling batch normalization, we discover that the inexact target statistics largely stem from the substantially reduced class diversity within each batch.
We introduce a straightforward tool, Test-time Exponential Moving Average (TEMA), to bridge the class diversity gap between training and testing batches.
arXiv Detail & Related papers (2023-12-15T01:52:35Z) - BatchGFN: Generative Flow Networks for Batch Active Learning [80.73649229919454]
BatchGFN is a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
We show our approach enables principled sampling of near-optimal-utility batches at inference time, with a single forward pass per point in the batch, on toy regression problems.
arXiv Detail & Related papers (2023-06-26T20:41:36Z) - Scalable Batch Acquisition for Deep Bayesian Active Learning [70.68403899432198]
In deep active learning, it is important to choose multiple examples to mark up at each step.
Existing solutions to this problem, such as BatchBALD, have significant limitations in selecting a large number of examples.
We present the Large BatchBALD algorithm, which aims to achieve comparable quality while being more computationally efficient.
arXiv Detail & Related papers (2023-01-13T11:45:17Z) - Efficiently Controlling Multiple Risks with Pareto Testing [34.83506056862348]
We propose a two-stage process which combines multi-objective optimization with multiple hypothesis testing.
We demonstrate the effectiveness of our approach to reliably accelerate the execution of large-scale Transformer models in natural language processing (NLP) applications.
arXiv Detail & Related papers (2022-10-14T15:54:39Z) - A penalisation method for batch multi-objective Bayesian optimisation
with application in heat exchanger design [3.867356784754811]
We present a batch acquisition function that enables multi-objective Bayesian optimisation methods to efficiently exploit parallel processing resources.
We show that by encouraging batch diversity through penalising evaluations with similar predicted objective values, HIPPO is able to cheaply build large batches of informative points (an illustrative sketch of this penalisation idea appears after this list).
arXiv Detail & Related papers (2022-06-27T14:16:54Z) - Batch Active Learning at Scale [39.26441165274027]
Batch active learning, which adaptively issues batched queries to a labeling oracle, is a common approach for improving label efficiency.
In this work, we analyze an efficient active learning algorithm, which focuses on the large batch setting.
We show that our sampling method, which combines notions of uncertainty and diversity, easily scales to batch sizes (100K-1M) several orders of magnitude larger than those used in previous studies.
arXiv Detail & Related papers (2021-07-29T18:14:05Z) - Active Learning for Sequence Tagging with Deep Pre-trained Models and
Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Adaptive Learning of the Optimal Batch Size of SGD [52.50880550357175]
We propose a method capable of learning the optimal batch size adaptively throughout its iterations for strongly convex and smooth functions.
Our method does this provably, and in our experiments with synthetic and real data robustly exhibits nearly optimal behaviour.
We generalize our method to several new batch strategies not considered in the literature before, including a sampling suitable for distributed implementations.
arXiv Detail & Related papers (2020-05-03T14:28:32Z) - Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)