Reliable algorithm selection for machine learning-guided design
- URL: http://arxiv.org/abs/2503.20767v1
- Date: Wed, 26 Mar 2025 17:52:19 GMT
- Title: Reliable algorithm selection for machine learning-guided design
- Authors: Clara Fannjiang, Ji Won Park,
- Abstract summary: This paper proposes a method for design algorithm selection. It aims to select design algorithms that will produce a distribution of design labels satisfying a user-specified success criterion. We demonstrate the method's effectiveness in simulated protein and RNA design tasks.
- Score: 2.9158689853305693
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Algorithms for machine learning-guided design, or design algorithms, use machine learning-based predictions to propose novel objects with desired property values. Given a new design task -- for example, to design novel proteins with high binding affinity to a therapeutic target -- one must choose a design algorithm and specify any hyperparameters and predictive and/or generative models involved. How can these decisions be made such that the resulting designs are successful? This paper proposes a method for design algorithm selection, which aims to select design algorithms that will produce a distribution of design labels satisfying a user-specified success criterion -- for example, that at least ten percent of designs' labels exceed a threshold. It does so by combining designs' predicted property values with held-out labeled data to reliably forecast characteristics of the label distributions produced by different design algorithms, building upon techniques from prediction-powered inference. The method is guaranteed with high probability to return design algorithms that yield successful label distributions (or the null set if none exist), if the density ratios between the design and labeled data distributions are known. We demonstrate the method's effectiveness in simulated protein and RNA design tasks, in settings with either known or estimated density ratios.
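The idea in the abstract of combining designs' predicted property values with held-out labeled data can be illustrated with a small sketch. It is not the paper's procedure, which also involves statistical testing to obtain its high-probability guarantee. It only shows a prediction-powered style point estimate of one success criterion, the fraction of designs whose labels exceed a threshold. The function name `pp_success_rate`, its arguments, and the assumption that the density ratios are known and supplied by the caller are all hypothetical choices made for illustration.

```python
import numpy as np


def pp_success_rate(design_preds, labeled_preds, labeled_labels,
                    density_ratios, threshold):
    """Estimate the fraction of design labels above `threshold`.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    design_preds:   model predictions for the unlabeled designs
    labeled_preds:  model predictions for the held-out labeled points
    labeled_labels: true labels for the held-out labeled points
    density_ratios: assumed-known ratios p_design(x) / p_labeled(x),
                    one per held-out labeled point
    """
    design_preds = np.asarray(design_preds, dtype=float)
    labeled_preds = np.asarray(labeled_preds, dtype=float)
    labeled_labels = np.asarray(labeled_labels, dtype=float)
    density_ratios = np.asarray(density_ratios, dtype=float)

    # Plug-in estimate: fraction of designs *predicted* to exceed the threshold.
    plug_in = np.mean(design_preds > threshold)

    # Correction term: importance-weighted average of the prediction error
    # in the success indicator, measured on the held-out labeled data.
    true_hit = (labeled_labels > threshold).astype(float)
    pred_hit = (labeled_preds > threshold).astype(float)
    correction = np.mean(density_ratios * (true_hit - pred_hit))

    return plug_in + correction


if __name__ == "__main__":
    estimate = pp_success_rate(
        design_preds=[0.2, 0.8, 0.9, 0.4],
        labeled_preds=[0.6, 0.3],
        labeled_labels=[0.4, 0.7],
        density_ratios=[1.0, 1.0],
        threshold=0.5,
    )
    print(estimate)
```

Because the correction term can be negative or positive, this point estimate is not constrained to lie in [0, 1]. A selection rule such as "at least ten percent of designs' labels exceed a threshold" would additionally need an uncertainty estimate around this value, which this sketch omits.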
Related papers
- An Uncertainty-aware Deep Learning Framework-based Robust Design Optimization of Metamaterial Units [14.660705962826718]
We propose a novel uncertainty-aware deep learning framework-based robust design approach for the design of metamaterial units.
We demonstrate that the proposed design approach is capable of designing high-performance metamaterial units with high reliability.
arXiv Detail & Related papers (2024-07-19T22:21:27Z) - Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient [52.2669490431145]
PropEn is inspired by 'matching', which enables implicit guidance without training a discriminator.
We show that training with a matched dataset approximates the gradient of the property of interest while remaining within the data distribution.
arXiv Detail & Related papers (2024-05-28T11:30:19Z) - Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization.
We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons.
Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z) - Compositional Generative Inverse Design [69.22782875567547]
Inverse design, where we seek to design input variables in order to optimize an underlying objective function, is an important problem.
We show that by instead optimizing over the learned energy function captured by the diffusion model, we can avoid such adversarial examples.
In an N-body interaction task and a challenging 2D multi-airfoil design task, we demonstrate that by composing the learned diffusion model at test time, our method allows us to design initial states and boundary shapes.
arXiv Detail & Related papers (2024-01-24T01:33:39Z) - Beyond the training set: an intuitive method for detecting distribution
shift in model-based optimization [0.4188114563181614]
A common scenario involves using a fixed training set to train models, with the goal of designing new samples that outperform those present in the training data.
A major challenge in this setting is distribution shift, where the distributions of training and design samples are different.
We propose a straightforward method for design practitioners that detects distribution shifts.
arXiv Detail & Related papers (2023-11-09T13:44:28Z) - Efficient Automatic Machine Learning via Design Graphs [72.85976749396745]
We propose FALCON, an efficient sample-based method to search for the optimal model design.
FALCON features 1) a task-agnostic module, which performs message passing on the design graph via a Graph Neural Network (GNN), and 2) a task-specific module, which conducts label propagation of the known model performance information.
We empirically show that FALCON can efficiently obtain well-performing designs for each task using only 30 explored nodes.
arXiv Detail & Related papers (2022-10-21T21:25:59Z) - Targeted Adaptive Design [0.0]
Modern manufacturing and advanced materials design often require searches of relatively high-dimensional process control parameter spaces.
We describe targeted adaptive design (TAD), a new algorithm that performs this sampling task efficiently.
TAD embodies the exploration-exploitation tension in a manner that recalls, but is essentially different from, Bayesian optimization and optimal experimental design.
arXiv Detail & Related papers (2022-05-27T19:29:24Z) - An adaptive artificial neural network-based generative design method for layout designs [17.377351418260577]
An adaptive artificial neural network-based generative design approach is proposed and developed.
A novel adaptive learning and optimization strategy is proposed, which allows the design space to be effectively explored.
The performance of the proposed design method is demonstrated on two heat source layout design problems.
arXiv Detail & Related papers (2021-01-29T05:32:17Z) - An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z) - Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel classifier framework that is flexible in the choice of model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z) - Application of Deep Learning in Generating Desired Design Options: Experiments Using Synthetic Training Dataset [5.564299196293697]
This study applies a method using Deep Learning (DL) algorithms towards generating demanded design options.
An object recognition problem is first investigated to predict the labels of unseen sample images based on a training dataset consisting of different types of synthetic 2D shapes.
In the next step, the algorithm is trained to generate a window/wall pattern for desired light/shadow performance based on the spatial daylight autonomy (sDA) metrics.
arXiv Detail & Related papers (2019-12-28T01:26:20Z)