Electron flow matching for generative reaction mechanism prediction obeying conservation laws
- URL: http://arxiv.org/abs/2502.12979v1
- Date: Tue, 18 Feb 2025 16:01:17 GMT
- Title: Electron flow matching for generative reaction mechanism prediction obeying conservation laws
- Authors: Joonyoung F. Joung, Mun Hong Fong, Nicholas Casetti, Jordan P. Liles, Ne S. Dassanayake, Connor W. Coley
- Abstract summary: This work recasts the problem of reaction prediction as a problem of electron redistribution using the modern deep generative framework of flow matching.
Our model, FlowER, overcomes limitations by enforcing exact mass conservation, thereby resolving hallucinatory failure modes.
FlowER additionally enables estimation of thermodynamic or kinetic feasibility and manifests a degree of chemical intuition in reaction prediction tasks.
- Score: 8.136277960071032
- Abstract: Central to our understanding of chemical reactivity is the principle of mass conservation, which is fundamental for ensuring physical consistency, balancing equations, and guiding reaction design. However, data-driven computational models for tasks such as reaction product prediction rarely abide by this most basic constraint. In this work, we recast the problem of reaction prediction as a problem of electron redistribution using the modern deep generative framework of flow matching. Our model, FlowER, overcomes limitations inherent in previous approaches by enforcing exact mass conservation, thereby resolving hallucinatory failure modes, recovering mechanistic reaction sequences for unseen substrate scaffolds, and generalizing effectively to out-of-domain reaction classes with extremely data-efficient fine-tuning. FlowER additionally enables estimation of thermodynamic or kinetic feasibility and manifests a degree of chemical intuition in reaction prediction tasks. This inherently interpretable framework represents a significant step in bridging the gap between predictive accuracy and mechanistic understanding in data-driven reaction outcome prediction.
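The idea of treating a reaction as electron redistribution under conservation constraints can be illustrated with a toy bond-electron matrix. This is a minimal sketch and not the FlowER implementation: the matrix encoding (diagonal entries hold non-bonding valence electrons, off-diagonal entries hold bond orders), the helper names, and the H+ + OH- -> H2O example are all assumptions chosen for illustration. Because the atom list is fixed, atoms are conserved by construction, and a redistribution conserves electrons when its entries sum to zero.

```python
# Toy bond-electron (BE) matrix sketch; all names here are hypothetical.
# Atom order is fixed: index 0 = H (the proton), 1 = O, 2 = H (already bonded).
# Diagonal entry  = non-bonding valence electrons on that atom.
# Off-diagonal ij = bond order between atoms i and j (matrix is symmetric).


def total_electrons(be):
    """Sum of all entries: lone electrons plus 2 electrons per unit bond order."""
    return sum(sum(row) for row in be)


def apply_flow(be, delta):
    """Add an electron-redistribution matrix to a BE matrix, entry by entry."""
    n = len(be)
    return [[be[i][j] + delta[i][j] for j in range(n)] for i in range(n)]


def conserves_electrons(delta):
    """A redistribution is valid here if it is symmetric and sums to zero."""
    n = len(delta)
    symmetric = all(delta[i][j] == delta[j][i] for i in range(n) for j in range(n))
    return symmetric and total_electrons(delta) == 0


# Reactants: H+ (no electrons) and OH- (three lone pairs on O, one O-H bond).
reactants = [
    [0, 0, 0],
    [0, 6, 1],
    [0, 1, 0],
]

# Electron flow: one lone pair on O becomes a new O-H bond to the proton.
delta = [
    [0, 1, 0],
    [1, -2, 0],
    [0, 0, 0],
]

products = apply_flow(reactants, delta)  # BE matrix of H2O

# An invalid "flow" that creates two electrons on O out of nothing.
bad_delta = [
    [0, 0, 0],
    [0, 2, 0],
    [0, 0, 0],
]
```

In this encoding, a model that only ever outputs zero-sum symmetric updates cannot create or destroy atoms or electrons, which is the kind of constraint the abstract describes as missing from many data-driven predictors.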
Related papers
- Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
RAlign is a novel chemical reaction representation learning model for various organic reaction-related tasks.
By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction.
We introduce a reaction-center-aware attention mechanism that enables the model to concentrate on key functional groups.
arXiv Detail & Related papers (2024-11-26T17:41:44Z)
- ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots [4.362338454684645]
We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification.
Our model adeptly identifies key atom(s) even from out-of-distribution classes.
This generalizability allows new reaction types to be included in a modular fashion and will thus be of value to experts seeking to understand the reactivity of new molecules.
arXiv Detail & Related papers (2024-07-14T05:53:18Z)
- Beyond Major Product Prediction: Reproducing Reaction Mechanisms with Machine Learning Models Trained on a Large-Scale Mechanistic Dataset [10.968137261042715]
Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and in principle, reaction discovery.
While several machine learning models have sought to address the task of predicting reaction products, their extension to predicting reaction mechanisms has been impeded by the lack of a corresponding mechanistic dataset.
We construct such a dataset by imputing intermediates between experimentally reported reactants and products using expert reaction templates and train several machine learning models on the resulting dataset of 5,184,184 elementary steps.
arXiv Detail & Related papers (2024-03-07T15:26:23Z)
- Towards out-of-distribution generalizable predictions of chemical kinetics properties [61.15970601264632]
Out-of-distribution (OOD) kinetic property prediction must be generalizable.
In this paper, we categorize OOD kinetic property prediction into three levels (structure, condition, and mechanism).
We create comprehensive datasets to benchmark the state-of-the-art ML approaches for reaction prediction in the OOD setting and the state-of-the-art graph OOD methods in kinetics property prediction problems.
arXiv Detail & Related papers (2023-10-04T20:36:41Z)
- Doubly Stochastic Graph-based Non-autoregressive Reaction Prediction [59.41636061300571]
We propose a new framework that combines two doubly stochastic self-attention mappings to obtain electron redistribution predictions.
We show that our approach consistently improves the predictive performance of non-autoregressive models.
arXiv Detail & Related papers (2023-06-05T14:15:39Z)
- Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize this goal and provide estimation procedures for practical applications.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)
- Non-Autoregressive Electron Redistribution Modeling for Reaction Prediction [26.007965383304864]
We devise a non-autoregressive learning paradigm that predicts the reaction in one shot.
We formulate a reaction as an arbitrary electron flow and predict it with a novel multi-pointer decoding network.
Experiments on the USPTO-MIT dataset show that our approach has established a new state-of-the-art top-1 accuracy.
arXiv Detail & Related papers (2021-06-08T16:39:08Z)
- Non-autoregressive electron flow generation for reaction prediction [15.98143959075733]
We devise a novel decoder that avoids such sequential generation and predicts the reaction in a non-autoregressive manner.
Inspired by physical-chemistry insights, we represent edge edits in a molecule graph as electron flows, which can then be predicted in parallel.
Our model achieves an order-of-magnitude lower inference latency, along with state-of-the-art top-1 accuracy and comparable performance on top-K sampling.
arXiv Detail & Related papers (2020-12-16T10:01:26Z)
- Physics-Informed Gaussian Process Regression for Probabilistic States Estimation and Forecasting in Power Grids [67.72249211312723]
Real-time state estimation and forecasting is critical for efficient operation of power grids.
PhI-GPR is presented and used for forecasting and estimating the phase angle, angular speed, and wind mechanical power of a three-generator power grid system.
We demonstrate that the proposed PhI-GPR method can accurately forecast and estimate both observed and unobserved states.
arXiv Detail & Related papers (2020-10-09T14:18:31Z)
- Retrosynthesis Prediction with Conditional Graph Logic Network [118.70437805407728]
Computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities.
We propose a new approach to this task using the Conditional Graph Logic Network, a conditional graphical model built upon graph neural networks.
arXiv Detail & Related papers (2020-01-06T05:36:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.