Beam Enumeration: Probabilistic Explainability For Sample Efficient
Self-conditioned Molecular Design
- URL: http://arxiv.org/abs/2309.13957v2
- Date: Sun, 3 Mar 2024 16:23:00 GMT
- Title: Beam Enumeration: Probabilistic Explainability For Sample Efficient
Self-conditioned Molecular Design
- Authors: Jeff Guo, Philippe Schwaller
- Abstract summary: Generative molecular design has moved from proof-of-concept to real-world applicability.
Key challenges in explainability and sample efficiency present opportunities to enhance generative design.
Beam Enumeration is generally applicable to any language-based molecular generative model.
- Score: 0.4769602527256662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative molecular design has moved from proof-of-concept to real-world
applicability, as marked by the surge in very recent papers reporting
experimental validation. Key challenges in explainability and sample efficiency
present opportunities to enhance generative design to directly optimize
expensive high-fidelity oracles and provide actionable insights to domain
experts. Here, we propose Beam Enumeration to exhaustively enumerate the most
probable sub-sequences from language-based molecular generative models and show
that molecular substructures can be extracted. When coupled with reinforcement
learning, extracted substructures become meaningful, providing a source of
explainability and improving sample efficiency through self-conditioned
generation. Beam Enumeration is generally applicable to any language-based
molecular generative model and notably further improves the performance of the
recently reported Augmented Memory algorithm, which achieved the new
state-of-the-art on the Practical Molecular Optimization benchmark for sample
efficiency. The combined algorithm generates more high-reward molecules, and generates
them faster, given a fixed oracle budget. Beam Enumeration shows that improvements
to explainability and sample efficiency for molecular design can be made
synergistic.
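The abstract describes the mechanism only in prose. The following is a minimal Python sketch of the general idea, not the authors' implementation: it assumes a toy autoregressive SMILES-token model exposing a hypothetical next_token_logprobs interface and user-supplied chemistry helpers (is_valid_fragment, contains), and it enumerates the most probable sub-sequences, keeps those that parse as valid fragments, and uses them to filter sampled molecules before calling an expensive oracle.

```python
# Hedged sketch of beam-style enumeration of probable sub-sequences from an
# autoregressive molecular language model, plus a self-conditioning filter.
# The model interface and helper predicates are assumptions for illustration.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubSequence:
    tokens: List[str]   # partial SMILES token sequence
    logprob: float      # cumulative log-probability


def beam_enumerate(
    next_token_logprobs: Callable[[List[str]], Dict[str, float]],
    start_token: str = "^",
    depth: int = 3,
    top_k: int = 2,
) -> List[SubSequence]:
    """Exhaustively expand the top_k most probable next tokens at each of
    `depth` steps, yielding up to top_k ** depth probable sub-sequences."""
    frontier = [SubSequence(tokens=[start_token], logprob=0.0)]
    for _ in range(depth):
        expanded: List[SubSequence] = []
        for seq in frontier:
            logprobs = next_token_logprobs(seq.tokens)
            best = sorted(logprobs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
            for token, lp in best:
                expanded.append(SubSequence(seq.tokens + [token], seq.logprob + lp))
        frontier = expanded
    return sorted(frontier, key=lambda s: s.logprob, reverse=True)


def extract_substructures(
    subsequences: List[SubSequence],
    is_valid_fragment: Callable[[str], bool],
) -> List[str]:
    """Keep enumerated sub-sequences that parse as chemically valid fragments
    (e.g. checked with RDKit); these serve as the human-readable explanations."""
    fragments = []
    for seq in subsequences:
        smiles = "".join(seq.tokens[1:])  # drop the start token
        if is_valid_fragment(smiles):
            fragments.append(smiles)
    return fragments


def self_conditioned_filter(
    sampled_smiles: List[str],
    substructures: List[str],
    contains: Callable[[str, str], bool],
) -> List[str]:
    """During reinforcement learning, only molecules containing at least one
    extracted substructure are passed on to the expensive oracle."""
    return [s for s in sampled_smiles if any(contains(s, sub) for sub in substructures)]
```

Per the abstract, the extracted substructures become meaningful once the model is coupled with reinforcement learning; the filter above is what makes generation self-conditioned, since sampled molecules lacking the extracted substructures are discarded before an oracle call, which is where the sample-efficiency gain comes from.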
Related papers
- Pathway-Guided Optimization of Deep Generative Molecular Design Models for Cancer Therapy [1.8210200978176423]
The junction tree variational autoencoder (JTVAE) has been shown to be an efficient generative model.
We show how a pharmacodynamic model, assessing the therapeutic efficacy of a drug-like small molecule, can be incorporated for effective latent space optimization.
arXiv Detail & Related papers (2024-11-05T19:20:30Z)
- Cliqueformer: Model-Based Optimization with Structured Transformers [102.55764949282906]
Large neural networks excel at prediction tasks, but their application to design problems, such as protein engineering or materials discovery, requires solving offline model-based optimization (MBO) problems.
We present Cliqueformer, a transformer-based architecture that learns the black-box function's structure through functional graphical models (FGM).
Across various domains, including chemical and genetic design tasks, Cliqueformer demonstrates superior performance compared to existing methods.
arXiv Detail & Related papers (2024-10-17T00:35:47Z)
- MING: A Functional Approach to Learning Molecular Generative Models [46.189683355768736]
This paper introduces a novel paradigm for learning molecule generative models based on functional representations.
We propose Molecular Implicit Neural Generation (MING), a diffusion-based model that learns molecular distributions in the function space.
arXiv Detail & Related papers (2024-10-16T13:02:02Z)
- Chemistry-Inspired Diffusion with Non-Differentiable Guidance [10.573577157257564]
Recent advances in diffusion models have shown remarkable potential in the conditional generation of novel molecules.
We propose a novel approach that leverages domain knowledge from quantum chemistry as a non-differentiable oracle to guide an unconditional diffusion model.
Instead of relying on neural networks, the oracle provides accurate guidance in the form of estimated gradients, allowing the diffusion process to sample from a conditional distribution specified by quantum chemistry.
arXiv Detail & Related papers (2024-10-09T03:10:21Z)
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties.
It can generate molecules with state-of-the-art binding energies of up to -7.07 Avg. Vina Score.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
- DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization [49.85944390503957]
DecompOpt is a structure-based molecular optimization method built on a controllable and decomposed diffusion model.
We show that DecompOpt can efficiently generate molecules with improved properties compared to strong de novo baselines.
arXiv Detail & Related papers (2024-03-07T02:53:40Z)
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
- Augmented Memory: Capitalizing on Experience Replay to Accelerate De Novo Molecular Design [0.0]
Molecular generative models should learn to satisfy a desired objective under minimal oracle evaluations.
We propose a novel algorithm called Augmented Memory that combines data augmentation with experience replay.
We show that scores obtained from oracle calls can be reused to update the model multiple times.
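(A minimal, hypothetical sketch of this replay-with-reuse idea appears after this list.)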
arXiv Detail & Related papers (2023-05-10T14:00:50Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Learning Neural Generative Dynamics for Molecular Conformation Generation [89.03173504444415]
We study how to generate molecule conformations (i.e., 3D structures) from a molecular graph.
We propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph.
arXiv Detail & Related papers (2021-02-20T03:17:58Z)
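As promised in the Augmented Memory entry above, here is a minimal sketch of the experience-replay idea it summarizes: scores from earlier oracle calls are stored once and reused for several model updates, with a user-supplied randomize_smiles function standing in for data augmentation. All names (ReplayBuffer, policy_update, randomize_smiles) are illustrative assumptions, not the Augmented Memory codebase.

```python
# Hedged sketch: reuse stored oracle scores across multiple policy updates,
# pairing each reuse with a freshly augmented (randomized) SMILES string.

import heapq
import random
from typing import Callable, List, Tuple


class ReplayBuffer:
    """Keeps the top-scoring (score, SMILES) pairs seen so far."""

    def __init__(self, max_size: int = 64):
        self.max_size = max_size
        self._heap: List[Tuple[float, str]] = []

    def add(self, smiles: str, score: float) -> None:
        heapq.heappush(self._heap, (score, smiles))
        if len(self._heap) > self.max_size:
            heapq.heappop(self._heap)  # discard the lowest-scoring entry

    def sample(self, n: int) -> List[Tuple[float, str]]:
        return random.sample(self._heap, min(n, len(self._heap)))


def augmented_memory_round(
    buffer: ReplayBuffer,
    scored_batch: List[Tuple[str, float]],                    # freshly scored (SMILES, score)
    policy_update: Callable[[List[Tuple[str, float]]], None], # one gradient step on (SMILES, score) pairs
    randomize_smiles: Callable[[str], str],                   # data augmentation, e.g. randomized SMILES
    inner_updates: int = 2,
) -> None:
    """One outer round: store newly scored molecules, then reuse the stored
    oracle scores for several policy updates without new oracle calls."""
    for smiles, score in scored_batch:
        buffer.add(smiles, score)
    for _ in range(inner_updates):
        replay = [(randomize_smiles(smiles), score) for score, smiles in buffer.sample(16)]
        policy_update(replay)
```

The design point, as summarized above, is that each expensive oracle evaluation is amortized over several updates, which is the sample-efficiency mechanism Beam Enumeration further improves in the main paper.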
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.