Alignment is Key for Applying Diffusion Models to Retrosynthesis
- URL: http://arxiv.org/abs/2405.17656v1
- Date: Mon, 27 May 2024 20:57:19 GMT
- Title: Alignment is Key for Applying Diffusion Models to Retrosynthesis
- Authors: Najwa Laabid, Severi Rissanen, Markus Heinonen, Arno Solin, Vikas Garg
- Abstract summary: Diffusion models are a promising modelling approach, enabling post-hoc conditioning and trading off quality for speed during generation.
We show mathematically that permutation equivariant denoisers severely limit the expressiveness of graph diffusion models and thus their adaptation to retrosynthesis.
Our new denoiser achieves the highest top-$1$ accuracy ($54.7$%) across template-free and template-based methods on USPTO-50k.
- Score: 24.912841472542322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrosynthesis, the task of identifying precursors for a given molecule, can be naturally framed as a conditional graph generation task. Diffusion models are a particularly promising modelling approach, enabling post-hoc conditioning and trading off quality for speed during generation. We show mathematically that permutation equivariant denoisers severely limit the expressiveness of graph diffusion models and thus their adaptation to retrosynthesis. To address this limitation, we relax the equivariance requirement such that it only applies to aligned permutations of the conditioning and the generated graphs obtained through atom mapping. Our new denoiser achieves the highest top-$1$ accuracy ($54.7$\%) across template-free and template-based methods on USPTO-50k. We also demonstrate the ability for flexible post-training conditioning and good sample quality with small diffusion step counts, highlighting the potential for interactive applications and additional controls for multi-step planning.
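The relaxed requirement described in the abstract can be illustrated with a toy example. The sketch below is not the paper's denoiser: it is a hypothetical one-layer message-passing function (`aligned_denoiser`, with random weights `W_self` and `W_nbr`) that assumes node i of the noisy reactant graph is atom-mapped to node i of the product graph. Under that assumption, the output is equivariant when both graphs are permuted together, but not when only the generated graph is permuted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 4  # number of atoms and feature width (illustrative values)

# Hypothetical weights for a single message-passing layer.
W_self = rng.normal(size=(2 * d, d))
W_nbr = rng.normal(size=(2 * d, d))

def aligned_denoiser(x, adj, y):
    """Toy layer whose node input is [x_i, y_i].

    Assumes row i of x (noisy reactant atoms) is atom-mapped to
    row i of y (product atoms).
    """
    h = np.concatenate([x, y], axis=1)   # pair each node with its mapped atom
    return h @ W_self + adj @ h @ W_nbr  # self term plus neighbour aggregation

x = rng.normal(size=(n, d))              # noisy reactant node features
y = rng.normal(size=(n, d))              # product (conditioning) node features
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
adj = (upper + upper.T).astype(float)    # symmetric adjacency, no self-loops

perm = np.roll(np.arange(n), 1)          # a fixed non-identity permutation
P = np.eye(n)[perm]                      # its permutation matrix

out = aligned_denoiser(x, adj, y)
# Aligned case: permute the generated graph and the conditioning together.
out_joint = aligned_denoiser(P @ x, P @ adj @ P.T, P @ y)
# Unaligned case: permute only the generated graph.
out_unaligned = aligned_denoiser(P @ x, P @ adj @ P.T, y)

joint_ok = np.allclose(out_joint, P @ out)
unaligned_ok = np.allclose(out_unaligned, P @ out)
print("equivariant under aligned permutation:", joint_ok)
print("equivariant under unaligned permutation:", unaligned_ok)
```

Because the conditioning features enter through the atom-mapped pairing, the function can distinguish otherwise symmetric nodes, which is the extra expressiveness that a fully permutation equivariant denoiser lacks.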
Related papers
- Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models [13.053266613831447]
We present a recipe for designing graph foundation models for node-level tasks from first principles. The key ingredient underpinning our study is a systematic investigation of the symmetries that a graph foundation model must respect. We validate our approach through extensive experiments on 29 real-world node classification datasets.
arXiv Detail & Related papers (2025-06-17T08:05:08Z) - Learning-Order Autoregressive Models with Application to Molecular Graph Generation [52.44913282062524]
We introduce a variant of ARM that generates high-dimensional data using a probabilistic ordering that is sequentially inferred from data. We demonstrate experimentally that our method can learn meaningful autoregressive orderings in image and graph generation.
arXiv Detail & Related papers (2025-03-07T23:24:24Z) - Graph Counterfactual Explainable AI via Latent Space Traversal [4.337339380445765]
Counterfactual explanations aim to explain predictions by finding the "nearest" in-distribution alternative input. We propose a method to generate counterfactual explanations for any differentiable black-box graph classifier. We empirically validate the approach on three graph datasets, showing that our model is consistently high-performing and more robust than the baselines.
arXiv Detail & Related papers (2025-01-15T15:04:10Z) - Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z) - Chemistry-Inspired Diffusion with Non-Differentiable Guidance [10.573577157257564]
Recent advances in diffusion models have shown remarkable potential in the conditional generation of novel molecules.
We propose a novel approach that leverages domain knowledge from quantum chemistry as a non-differentiable oracle to guide an unconditional diffusion model.
Instead of relying on neural networks, the oracle provides accurate guidance in the form of estimated gradients, allowing the diffusion process to sample from a conditional distribution specified by quantum chemistry.
arXiv Detail & Related papers (2024-10-09T03:10:21Z) - Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization [97.35427957922714]
We present an algorithm named pairwise sample optimization (PSO), which enables the direct fine-tuning of an arbitrary timestep-distilled diffusion model.
PSO introduces additional reference images sampled from the current time-step distilled model, and increases the relative likelihood margin between the training images and reference images.
We show that PSO can directly adapt distilled models to human-preferred generation with both offline and online-generated pairwise preference image data.
arXiv Detail & Related papers (2024-10-04T07:05:16Z) - IFH: a Diffusion Framework for Flexible Design of Graph Generative Models [53.219279193440734]
Graph generative models can be classified into two prominent families: one-shot models, which generate a graph in one go, and sequential models, which generate a graph by successive additions of nodes and edges.
This paper proposes a graph generative model, called Insert-Fill-Halt (IFH), that supports the specification of a sequentiality degree.
arXiv Detail & Related papers (2024-08-23T16:24:40Z) - Advancing Graph Generation through Beta Diffusion [49.49740940068255]
Graph Beta Diffusion (GBD) is a generative model specifically designed to handle the diverse nature of graph data.
We propose a modulation technique that enhances the realism of generated graphs by stabilizing critical graph topology.
arXiv Detail & Related papers (2024-06-13T17:42:57Z) - Generating Graphs via Spectral Diffusion [48.70458395826864]
We present GGSD, a novel graph generative model based on 1) the spectral decomposition of the graph Laplacian matrix and 2) a diffusion process. An extensive set of experiments on both synthetic and real-world graphs demonstrates the strengths of our model against state-of-the-art alternatives.
arXiv Detail & Related papers (2024-02-29T09:26:46Z) - Graph Neural Networks with a Distribution of Parametrized Graphs [27.40566674759208]
We introduce latent variables to parameterize and generate multiple graphs.
We obtain the maximum likelihood estimate of the network parameters in an Expectation-Maximization framework.
arXiv Detail & Related papers (2023-10-25T06:38:24Z) - Graph Mixup with Soft Alignments [49.61520432554505]
We study graph data augmentation by mixup, which has been used successfully on images.
We propose S-Mixup, a simple yet effective mixup method for graph classification by soft alignments.
arXiv Detail & Related papers (2023-06-11T22:04:28Z) - ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories [144.03939123870416]
We propose a novel conditional diffusion model by introducing conditions into the forward process.
We use extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules.
We formulate our method, which we call ShiftDDPMs, and provide a unified point of view on existing related methods.
arXiv Detail & Related papers (2023-02-05T12:48:21Z) - Modular Flows: Differential Molecular Generation [18.41106104201439]
Flows can generate molecules effectively by inverting the encoding process.
Existing flow models require artifactual dequantization or specific node/edge orderings.
We develop continuous normalizing E(3)-equivariant flows, based on a system of node ODEs and a graph PDE.
Our models can be cast as message-passing temporal networks, and result in superlative performance on the tasks of density estimation and molecular generation.
arXiv Detail & Related papers (2022-10-12T09:08:35Z) - DiGress: Discrete Denoising diffusion for graph generation [79.13904438217592]
DiGress is a discrete denoising diffusion model for generating graphs with categorical node and edge attributes.
It achieves state-of-the-art performance on molecular and non-molecular datasets, with up to 3x validity improvement.
It is also the first model to scale to the large GuacaMol dataset containing 1.3M drug-like molecules.
arXiv Detail & Related papers (2022-09-29T12:55:03Z) - Node Copying: A Random Graph Model for Effective Graph Sampling [35.957719744856696]
We introduce the node copying model for constructing a distribution over graphs.
We show the usefulness of the copying model in three tasks.
We employ our proposed model to mitigate the effect of adversarial attacks on the graph topology.
arXiv Detail & Related papers (2022-08-04T04:04:49Z) - GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation [102.85440102147267]
We propose a novel generative model named GeoDiff for molecular conformation prediction.
We show that GeoDiff is superior or comparable to existing state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-06T09:47:01Z) - Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations [57.15855198512551]
We propose a novel score-based generative model for graphs with a continuous-time framework.
We show that our method is able to generate molecules that lie close to the training distribution yet do not violate the chemical valency rule.
arXiv Detail & Related papers (2022-02-05T08:21:04Z) - Permutation Invariant Graph Generation via Score-Based Generative Modeling [114.12935776726606]
We propose a permutation invariant approach to modeling graphs, using the recent framework of score-based generative modeling.
In particular, we design a permutation equivariant, multi-channel graph neural network to model the gradient of the data distribution at the input graph.
For graph generation, we find that our learning approach achieves better or comparable results to existing models on benchmark datasets.
arXiv Detail & Related papers (2020-03-02T03:06:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.