Multi-Fidelity Active Learning with GFlowNets
- URL: http://arxiv.org/abs/2306.11715v1
- Date: Tue, 20 Jun 2023 17:43:42 GMT
- Title: Multi-Fidelity Active Learning with GFlowNets
- Authors: Alex Hernandez-Garcia, Nikita Saxena, Moksh Jain, Cheng-Hao Liu, and Yoshua Bengio
- Abstract summary: We propose the use of GFlowNets for multi-fidelity active learning, where multiple approximations of the black-box function are available at lower fidelity and cost.
Our results show that multi-fidelity active learning with GFlowNets can efficiently leverage the availability of multiple oracles with different costs and fidelities.
- Score: 77.01923839831899
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the last decades, the capacity to generate large amounts of data in
science and engineering applications has been growing steadily. Meanwhile, the
progress in machine learning has turned it into a suitable tool to process and
utilise the available data. Nonetheless, many relevant scientific and
engineering problems present challenges where current machine learning methods
cannot yet efficiently leverage the available data and resources. For example,
in scientific discovery, we are often faced with the problem of exploring very
large, high-dimensional spaces, where querying a high fidelity, black-box
objective function is very expensive. Progress in machine learning methods that
can efficiently tackle such problems would help accelerate currently crucial
areas such as drug and materials discovery. In this paper, we propose the use
of GFlowNets for multi-fidelity active learning, where multiple approximations
of the black-box function are available at lower fidelity and cost. GFlowNets
are recently proposed methods for amortised probabilistic inference that have
proven efficient for exploring large, high-dimensional spaces and can hence be
practical in the multi-fidelity setting too. Here, we describe our algorithm
for multi-fidelity active learning with GFlowNets and evaluate its performance
in both well-studied synthetic tasks and practically relevant applications of
molecular discovery. Our results show that multi-fidelity active learning with
GFlowNets can efficiently leverage the availability of multiple oracles with
different costs and fidelities to accelerate scientific discovery and
engineering design.
Related papers
- Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability [0.2517763905487249]
Integrated data analysis plays a crucial role in scientific discovery, especially in the current AI era.
We propose MIDA: an approach for lightweight runtime Multi-workflow Integrated Data Analysis.
We show near-zero overhead running up to 100,000 tasks on 1,680 CPU cores on the Summit supercomputer.
(2023-08-17T14:20:29Z)
- Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets [86.43523688236077]
Combinatorial optimization (CO) problems are often NP-hard and out of reach for exact algorithms.
GFlowNets have emerged as a powerful machinery to efficiently sample from composite unnormalized densities sequentially.
In this paper, we design Markov decision processes (MDPs) for different problems and propose to train conditional GFlowNets to sample from the solution space.
(2023-05-26T15:13:09Z)
- GFlowNets for AI-Driven Scientific Discovery [74.27219800878304]
We present a new probabilistic machine learning framework called GFlowNets.
GFlowNets can be applied in the modeling, hypotheses generation and experimental design stages of the experimental science loop.
We argue that GFlowNets can become a valuable tool for AI-driven scientific discovery.
(2023-02-01T17:29:43Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
(2022-02-12T06:10:15Z)
- Understanding the World Through Action [91.3755431537592]
I will argue that a general, principled, and powerful framework for utilizing unlabeled data can be derived from reinforcement learning.
I will discuss how such a procedure is more closely aligned with potential downstream tasks.
(2021-10-24T22:33:52Z)
- Deep Multi-Fidelity Active Learning of High-dimensional Outputs [17.370056935194786]
We develop a deep neural network-based multi-fidelity model for learning with high-dimensional outputs.
We then propose a mutual information-based acquisition function that extends the predictive entropy principle.
We show the advantage of our method in several applications of computational physics and engineering design.
(2020-12-02T00:02:31Z)
- Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization [12.617078020344618]
We introduce an approach that combines rare events sampling techniques with neural network optimization to optimize objective functions dominated by rare events.
We show that importance sampling reduces the variance of the solution to a learning problem, suggesting benefits for generalization.
Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimensional and rare data.
(2020-08-11T23:38:09Z)
- Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size.
(2020-06-17T14:51:11Z)