BatchGFN: Generative Flow Networks for Batch Active Learning
- URL: http://arxiv.org/abs/2306.15058v1
- Date: Mon, 26 Jun 2023 20:41:36 GMT
- Title: BatchGFN: Generative Flow Networks for Batch Active Learning
- Authors: Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay
Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal
- Abstract summary: BatchGFN is a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
We show, on toy regression problems, that our approach enables principled sampling of near-optimal-utility batches at inference time with a single forward pass per point in the batch.
- Score: 80.73649229919454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce BatchGFN -- a novel approach for pool-based active learning that
uses generative flow networks to sample sets of data points proportional to a
batch reward. With an appropriate reward function to quantify the utility of
acquiring a batch, such as the joint mutual information between the batch and
the model parameters, BatchGFN is able to construct highly informative batches
for active learning in a principled way. We show our approach enables sampling
near-optimal utility batches at inference time with a single forward pass per
point in the batch in toy regression problems. This alleviates the
computational complexity of batch-aware algorithms and removes the need for
greedy approximations to find maximizers for the batch reward. We also present
early results for amortizing training across acquisition steps, which will
enable scaling to real-world tasks.
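To make the mechanism in the abstract concrete, below is a minimal, illustrative sketch (not the authors' implementation) of how a GFlowNet-style sampler could build an acquisition batch point by point and be trained with a trajectory-balance objective so that complete batches are drawn with probability proportional to a batch reward. The `batch_reward` here is an assumed log-determinant proxy for how much the batch jointly informs the model parameters; all names (`BatchSampler`, `batch_reward`) and hyperparameters are hypothetical.

```python
# Hedged sketch, not the paper's code: sequential batch construction with a
# GFlowNet-style forward policy, trained with a trajectory-balance loss.
import torch
import torch.nn as nn

def batch_reward(X_batch, prior_var=1.0, noise_var=0.1):
    """Toy batch utility: log det of a prior-predictive covariance over the batch
    (an assumed proxy for the joint information the batch carries about parameters)."""
    K = prior_var * X_batch @ X_batch.T + noise_var * torch.eye(len(X_batch))
    return torch.logdet(K)  # treated as log R(batch)

class BatchSampler(nn.Module):
    def __init__(self, n_pool, feat_dim, hidden=64):
        super().__init__()
        # Forward policy: maps a summary of the partial batch to logits over pool points.
        self.policy = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(), nn.Linear(hidden, n_pool)
        )
        self.log_Z = nn.Parameter(torch.zeros(1))  # learned log-partition estimate

    def forward(self, pool_feats, batch_size):
        """Sample a batch sequentially; return indices and log P_F of the trajectory."""
        chosen, log_pf = [], torch.zeros(1)
        state = torch.zeros(pool_feats.shape[1] + 1)  # sum of chosen features + count
        for t in range(batch_size):
            logits = self.policy(state)
            mask = torch.zeros_like(logits)
            if chosen:
                mask[chosen] = float("-inf")          # forbid repeated points
            probs = torch.softmax(logits + mask, dim=-1)
            idx = torch.multinomial(probs, 1).item()
            log_pf = log_pf + torch.log(probs[idx])
            chosen.append(idx)
            state = torch.cat([state[:-1] + pool_feats[idx], torch.tensor([t + 1.0])])
        return chosen, log_pf

# Trajectory balance: (log Z + log P_F - log R)^2; a uniform backward policy over
# batch orderings adds only a constant for a fixed batch size and is absorbed here.
pool = torch.randn(50, 4)                      # hypothetical unlabeled pool features
sampler = BatchSampler(n_pool=50, feat_dim=4)
opt = torch.optim.Adam(sampler.parameters(), lr=1e-3)
for step in range(200):
    idxs, log_pf = sampler(pool, batch_size=5)
    loss = (sampler.log_Z + log_pf - batch_reward(pool[idxs])) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
```

In a full active learning loop, the reward would be recomputed from the current model posterior after each acquisition, which is the setting the abstract's amortization across acquisition steps targets.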
Related papers
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach [28.815294991377645]
Active learning parallelization is widely used, but typically relies on fixing the batch size throughout experimentation.
This fixed approach is inefficient because of a dynamic trade-off between cost and speed.
We propose a novel Probabilistic Numerics framework that adaptively changes batch sizes.
arXiv Detail & Related papers (2023-06-09T12:17:18Z) - Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a policy for generating complex structure through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
arXiv Detail & Related papers (2023-02-11T22:06:17Z) - Scalable Batch Acquisition for Deep Bayesian Active Learning [70.68403899432198]
In deep active learning, it is important to choose multiple examples to mark up at each step.
Existing solutions to this problem, such as BatchBALD, have significant limitations in selecting a large number of examples.
We present the Large BatchBALD algorithm, which aims to achieve comparable quality while being more computationally efficient.
arXiv Detail & Related papers (2023-01-13T11:45:17Z) - Batch Active Learning from the Perspective of Sparse Approximation [12.51958241746014]
Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators.
We study and propose a novel framework that formulates batch active learning from the perspective of sparse approximation.
Our active learning method aims to find an informative subset from the unlabeled data pool such that the corresponding training loss function approximates its full data pool counterpart.
arXiv Detail & Related papers (2022-11-01T03:20:28Z) - Achieving Minimax Rates in Pool-Based Batch Active Learning [26.12124106759262]
We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle.
In this paper we propose a solution which requires a careful trade-off between the informativeness of the queried points and their diversity.
arXiv Detail & Related papers (2022-02-11T04:55:45Z) - Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning and large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z) - Batch Active Learning at Scale [39.26441165274027]
Batch active learning, which adaptively issues batched queries to a labeling oracle, is a common approach for reducing labeling costs.
In this work, we analyze an efficient active learning algorithm, which focuses on the large batch setting.
We show that our sampling method, which combines notions of uncertainty and diversity, easily scales to batch sizes (100K-1M) several orders of magnitude larger than used in previous studies.
arXiv Detail & Related papers (2021-07-29T18:14:05Z) - Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning [48.19646855997791]
We examine a simple strategy for adapting well-known single-point acquisition functions to allow batch active learning.
This strategy can perform just as well as compute-intensive state-of-the-art batch acquisition functions, like BatchBALD or BADGE, while using orders of magnitude less compute.
In addition to providing a practical option for machine learning practitioners, the surprising success of the proposed method in a wide range of experimental settings raises a difficult question for the field.
arXiv Detail & Related papers (2021-06-22T21:07:50Z)