Efficient 3D Molecular Generation with Flow Matching and Scale Optimal Transport
- URL: http://arxiv.org/abs/2406.07266v2
- Date: Tue, 25 Jun 2024 11:42:09 GMT
- Title: Efficient 3D Molecular Generation with Flow Matching and Scale Optimal Transport
- Authors: Ross Irwin, Alessandro Tibo, Jon Paul Janet, Simon Olsson
- Abstract summary: Semla is a scalable E(3)-equivariant message passing architecture.
SemlaFlow is trained using flow matching along with scale optimal transport.
Our model produces state-of-the-art results on benchmark datasets with just 100 sampling steps.
- Score: 43.56824843205882
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Generative models for 3D drug design have gained prominence recently for their potential to design ligands directly within protein pockets. Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable E(3)-equivariant message passing architecture. We further introduce a molecular generation model, SemlaFlow, which is trained using flow matching along with scale optimal transport, a novel extension of equivariant optimal transport. Our model produces state-of-the-art results on benchmark datasets with just 100 sampling steps. Crucially, SemlaFlow samples high quality molecules with as few as 20 steps, corresponding to a two order-of-magnitude speed-up compared to state-of-the-art, without sacrificing performance. Furthermore, we highlight limitations of current evaluation methods for 3D generation and propose new benchmark metrics for unconditional molecular generators. Finally, using these new metrics, we compare our model's ability to generate high quality samples against current approaches and further demonstrate SemlaFlow's strong performance.
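The training recipe the abstract describes, flow matching combined with an optimal-transport pairing between noise and data samples, can be sketched in a few lines. The sketch below shows generic minibatch OT-paired conditional flow matching, not SemlaFlow's equivariant scale optimal transport; the function names `ot_pairing` and `flow_matching_targets` are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_pairing(noise, data):
    """Re-pair a minibatch of noise and data points to minimize total
    squared transport cost (minibatch optimal transport coupling)."""
    # cost[i, j] = squared distance between noise i and data point j
    cost = ((noise[:, None, :] - data[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)
    return noise[rows], data[cols]

def flow_matching_targets(x0, x1, t):
    """Conditional flow matching on a linear path:
    x_t = (1 - t) x0 + t x1, with target velocity x1 - x0.
    The model is trained to regress this velocity at (x_t, t)."""
    xt = (1 - t)[:, None] * x0 + t[:, None] * x1
    v = x1 - x0
    return xt, v

# Example: pair a batch, then build regression targets at random times.
rng = np.random.default_rng(0)
noise = rng.normal(size=(8, 3))   # source samples (prior)
data = rng.normal(size=(8, 3))    # target samples (e.g. atom coordinates)
x0, x1 = ot_pairing(noise, data)
t = rng.uniform(size=8)
xt, v = flow_matching_targets(x0, x1, t)
```

The OT pairing shortens the average path each sample travels, which is what makes low-step sampling viable; SemlaFlow's scale optimal transport extends this idea, but that extension is not reproduced here.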
Related papers
- Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation [0.0]
Flow matching is a recently proposed generative modeling framework that generalizes diffusion models.
We extend the flow matching framework to categorical data by constructing flows that are constrained to exist on a continuous representation of categorical data known as the probability simplex.
We find that, in practice, a simpler approach that makes no accommodations for the categorical nature of the data yields equivalent or superior performance.
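The simplex-constrained construction this entry describes can be illustrated concisely: because the probability simplex is convex, a linear interpolation between any source point on it and a one-hot target stays on it. This is a minimal sketch of that idea, not the paper's implementation; `uniform_simplex_source` and `simplex_interpolant` are illustrative names.

```python
import numpy as np

def uniform_simplex_source(n, k, seed=0):
    """Draw n uniform 'noise' points on the (k-1)-simplex via Dirichlet(1,...,1)."""
    return np.random.default_rng(seed).dirichlet(np.ones(k), size=n)

def simplex_interpolant(p0, p1, t):
    """Convex combination of two simplex points; the result always lies
    on the simplex, so the flow never leaves the valid probability region."""
    return (1 - t)[:, None] * p0 + t[:, None] * p1

# Example: flow from uniform simplex noise toward one-hot atom types.
labels = np.array([0, 2, 1, 2])          # hypothetical categorical targets
p1 = np.eye(3)[labels]                    # one-hot encodings on the simplex
p0 = uniform_simplex_source(4, 3)
pt = simplex_interpolant(p0, p1, np.full(4, 0.5))
```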
arXiv Detail & Related papers (2024-04-30T17:37:21Z) - Unified Generative Modeling of 3D Molecules via Bayesian Flow Networks [19.351562908683334]
GeoBFN naturally fits molecule geometry by modeling diverse modalities in the differentiable parameter space of distributions.
We demonstrate that GeoBFN achieves state-of-the-art performance on multiple 3D molecule generation benchmarks in terms of generation quality.
GeoBFN can also conduct sampling with any number of steps to reach an optimal trade-off between efficiency and quality.
arXiv Detail & Related papers (2024-03-17T08:40:06Z) - Equivariant Flow Matching with Hybrid Probability Transport [69.11915545210393]
Diffusion Models (DMs) have demonstrated effectiveness in generating feature-rich geometries.
DMs typically suffer from unstable probability dynamics with inefficient sampling speed.
We introduce geometric flow matching, which enjoys the advantages of both equivariant modeling and stabilized probability dynamics.
arXiv Detail & Related papers (2023-12-12T11:13:13Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a substantial speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference [60.32804641276217]
We propose Latent Consistency Models (LCMs), enabling swift inference with minimal steps on any pre-trained LDMs.
A high-quality 768 x 768 24-step LCM takes only 32 A100 GPU hours for training.
We also introduce Latent Consistency Fine-tuning (LCF), a novel method that is tailored for fine-tuning LCMs on customized image datasets.
arXiv Detail & Related papers (2023-10-06T17:11:58Z) - SE(3)-Stochastic Flow Matching for Protein Backbone Generation [54.951832422425454]
We introduce FoldFlow, a series of novel generative models of increasing modeling power based on the flow-matching paradigm over $3\mathrm{D}$ rigid motions.
Our family of FoldFlow generative models offers several advantages over previous approaches to the generative modeling of proteins.
arXiv Detail & Related papers (2023-10-03T19:24:24Z) - 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction [9.67574543046801]
The inclusion of 3D structures during targeted drug design shows superior performance to other target-free models.
We develop a 3D equivariant diffusion model to solve the above challenges.
Our model could generate molecules with more realistic 3D structures and better affinities towards the protein targets, and improve binding affinity ranking and prediction without retraining.
arXiv Detail & Related papers (2023-03-06T23:01:43Z) - Manifold Interpolating Optimal-Transport Flows for Trajectory Inference [64.94020639760026]
We present a method called Manifold Interpolating Optimal-Transport Flow (MIOFlow).
MIOFlow learns continuous population dynamics from static snapshot samples taken at sporadic timepoints.
We evaluate our method on simulated data with bifurcations and merges, as well as scRNA-seq data from embryoid body differentiation, and acute myeloid leukemia treatment.
arXiv Detail & Related papers (2022-06-29T22:19:03Z) - FastFlows: Flow-Based Models for Molecular Graph Generation [4.9252608053969675]
FastFlows generates thousands of chemically valid molecules in seconds.
Our model is significantly simpler and easier to train than autoregressive molecular generative models.
arXiv Detail & Related papers (2022-01-28T21:08:31Z) - Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models [11.206144910991481]
We propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.
We demonstrate state-of-the-art performance on standard benchmarks of flow-based generative modeling.
arXiv Detail & Related papers (2020-02-17T17:45:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.