Speeding Up BatchBALD: A k-BALD Family of Approximations for Active
Learning
- URL: http://arxiv.org/abs/2301.09490v1
- Date: Mon, 23 Jan 2023 15:38:58 GMT
- Title: Speeding Up BatchBALD: A k-BALD Family of Approximations for Active
Learning
- Authors: Andreas Kirsch
- Abstract summary: BatchBALD is an active-learning acquisition technique for selecting informative batches of points to label when training machine learning models with limited labeled data.
In this paper, we propose a new approximation, k-BALD, which uses k-wise mutual information terms to approximate BatchBALD.
Results on the MNIST dataset show that k-BALD is significantly faster than BatchBALD while maintaining similar performance.
- Score: 1.52292571922932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is a powerful method for training machine learning models
with limited labeled data. One commonly used technique for active learning is
BatchBALD, which uses Bayesian neural networks to find the most informative
points to label in a pool set. However, BatchBALD can be very slow to compute,
especially for larger datasets. In this paper, we propose a new approximation,
k-BALD, which uses k-wise mutual information terms to approximate BatchBALD,
making it much less expensive to compute. Results on the MNIST dataset show
that k-BALD is significantly faster than BatchBALD while maintaining similar
performance. Additionally, we also propose a dynamic approach for choosing k
based on the quality of the approximation, making it more efficient for larger
datasets.
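The core idea in the abstract — replacing BatchBALD's full joint mutual information with lower-order k-wise terms — can be sketched as follows. This is an illustrative reconstruction from the abstract alone, not the paper's reference implementation: the function names are hypothetical, and the batch score shown is the k = 2 truncation (per-point BALD scores minus pairwise mutual-information terms), built greedily from Monte Carlo posterior samples (e.g. MC dropout).

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (nats) along the given axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def bald_scores(probs):
    """Per-point BALD score: H[mean predictive] - mean of per-sample H.

    probs: array of shape (S, N, C) -- S posterior samples of class
    probabilities for N pool points over C classes.
    """
    mean_p = probs.mean(axis=0)                      # (N, C)
    return entropy(mean_p) - entropy(probs).mean(axis=0)

def pairwise_mi(probs, i, j):
    """Approximate I(y_i; y_j): the 2-wise mutual-information term,
    with the joint over class pairs averaged across posterior samples."""
    joint = np.einsum('sc,sd->cd', probs[:, i], probs[:, j]) / probs.shape[0]
    return (entropy(probs[:, i].mean(0)) + entropy(probs[:, j].mean(0))
            - entropy(joint.ravel()))

def greedy_2bald(probs, batch_size):
    """Greedy batch construction under the 2-BALD approximation:
    score(x) = BALD(x) - sum of pairwise MI with already-chosen points."""
    scores = bald_scores(probs)
    chosen = []
    for _ in range(batch_size):
        best, best_val = None, -np.inf
        for i in range(probs.shape[1]):
            if i in chosen:
                continue
            val = scores[i] - sum(pairwise_mi(probs, i, j) for j in chosen)
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen
```

Truncating at pairwise terms makes each greedy step cost O(N·B·C²) rather than requiring the exponentially large joint entropy over the whole batch, which is the source of the claimed speedup over BatchBALD.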
Related papers
- Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification [1.8567173419246403]
Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks.
BAIT, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets.
This paper introduces two methods to enhance BAIT's computational efficiency and scalability.
arXiv Detail & Related papers (2024-04-13T12:09:37Z) - A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor-3 approximation algorithm to compute subsets based on the weighted sum of the k-center and uncertainty-sampling objective functions.
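The weighted combined objective is not fully specified by this summary, but the classical greedy farthest-point heuristic — the standard 2-approximation for the plain k-center objective, and a common building block for such subset-selection methods — can be sketched as follows. Function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def greedy_k_center(X, k, seed=0):
    """Greedy farthest-point traversal: a 2-approximation for the
    k-center objective (minimize the max distance from any point
    to its nearest chosen center)."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(X)))]
    # distance of every point to its nearest chosen center so far
    d = np.linalg.norm(X - X[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))          # farthest point becomes next center
        centers.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return centers
```

Each iteration is O(n·dim), so selecting k centers from n points costs O(n·k·dim); hybrid methods typically add a per-point uncertainty term to the farthest-distance criterion.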
arXiv Detail & Related papers (2023-12-17T04:41:07Z) - Novel Batch Active Learning Approach and Its Application to Synthetic
Aperture Radar Datasets [7.381841249558068]
Recent gains have been made using sequential active learning for synthetic aperture radar (SAR) data (arXiv:2204.00005).
We developed a novel, two-part approach for batch active learning: Dijkstra's Annulus Core-Set (DAC) for core-set generation and LocalMax for batch sampling.
The batch active learning process that combines DAC and LocalMax achieves nearly identical accuracy to sequential active learning while being more efficient, with a speedup proportional to the batch size.
arXiv Detail & Related papers (2023-07-19T23:25:21Z) - BatchGFN: Generative Flow Networks for Batch Active Learning [80.73649229919454]
BatchGFN is a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
We show that our approach enables principled sampling of near-optimal-utility batches at inference time, with a single forward pass per point in the batch, on toy regression problems.
arXiv Detail & Related papers (2023-06-26T20:41:36Z) - Scalable Batch Acquisition for Deep Bayesian Active Learning [70.68403899432198]
In deep active learning, it is important to choose multiple examples to annotate at each step.
Existing solutions to this problem, such as BatchBALD, have significant limitations in selecting a large number of examples.
We present the Large BatchBALD algorithm, which aims to achieve comparable quality while being more computationally efficient.
arXiv Detail & Related papers (2023-01-13T11:45:17Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
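RFAD's 100-fold speedup comes from replacing an exact kernel with a random feature approximation. The NNGP kernel requires its own feature construction, which this summary does not specify; as a hedged illustration of the general random-feature idea, the well-known random Fourier features for an RBF kernel can be sketched as:

```python
import numpy as np

def rff_features(X, num_features=256, gamma=1.0, seed=0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2). For that kernel the
    frequency matrix W is drawn from N(0, 2*gamma); the NNGP
    kernel used by RFAD needs a different construction."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

# Z = rff_features(X); then Z @ Z.T approximates the exact n x n kernel
# matrix in O(n * d * m) time instead of the O(n^2 * d) exact computation.
```

The approximation error shrinks as O(1/sqrt(m)) in the number of random features m, which is what lets kernel-based distillation scale to larger datasets on a single GPU.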
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are better suited to active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Benchmarking Learning Efficiency in Deep Reservoir Computing [23.753943709362794]
We introduce a benchmark of increasingly difficult tasks together with a data efficiency metric to measure how quickly machine learning models learn from training data.
We compare the learning speed of some established sequential supervised models, such as RNNs, LSTMs, or Transformers, with relatively less known alternative models based on reservoir computing.
arXiv Detail & Related papers (2022-09-29T08:16:52Z) - Data Shapley Valuation for Efficient Batch Active Learning [21.76249748709411]
Active Data Shapley (ADS) is a filtering layer for batch active learning.
We show that ADS is particularly effective when the pool of unlabeled data exhibits real-world caveats.
arXiv Detail & Related papers (2021-04-16T18:53:42Z) - Continual Learning using a Bayesian Nonparametric Dictionary of Weight
Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings.
We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity.
We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.