Related papers: ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots

ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots

URL: http://arxiv.org/abs/2407.10090v1
Date: Sun, 14 Jul 2024 05:53:18 GMT
Title: ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots
Authors: Ajnabiul Hoque, Manajit Das, Mayank Baranwal, Raghavan B. Sunoj,
Abstract summary: We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification. Our model adeptly identifies key atom(s) even from out-of-distribution classes. This generalizabilty allows for the inclusion of new reaction types in a modular fashion, thus will be of value to experts for understanding the reactivity of new molecules.
Score: 4.362338454684645
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A chemical reaction mechanism (CRM) is a sequence of molecular-level events involving bond-breaking/forming processes, generating transient intermediates along the reaction pathway as reactants transform into products. Understanding such mechanisms is crucial for designing and discovering new reactions. One of the currently available methods to probe CRMs is quantum mechanical (QM) computations. The resource-intensive nature of QM methods and the scarcity of mechanism-based datasets motivated us to develop reliable ML models for predicting mechanisms. In this study, we created a comprehensive dataset with seven distinct classes, each representing uniquely characterized elementary steps. Subsequently, we developed an interpretable attention-based GNN that achieved near-unity and 96% accuracy, respectively for reaction step classification and the prediction of reactive atoms in each such step, capturing interactions between the broader reaction context and local active regions. The near-perfect classification enables accurate prediction of both individual events and the entire CRM, mitigating potential drawbacks of Seq2Seq approaches, where a wrongly predicted character leads to incoherent CRM identification. In addition to interpretability, our model adeptly identifies key atom(s) even from out-of-distribution classes. This generalizabilty allows for the inclusion of new reaction types in a modular fashion, thus will be of value to experts for understanding the reactivity of new molecules.

Related papers

Interpretable Deep Learning for Polar Mechanistic Reaction Prediction [43.95903801494905]
We introduce PMechRP (Polar Mechanistic Reaction Predictor), a system that trains machine learning models on the PMechDB dataset. We train compare a range of machine learning models, including transformer-based, graph-based and two-step siamese architectures. Our best-performing approach was a hybrid model, which combines a 5-ensemble of Chemformer models with a two-step Siamese framework.
arXiv Detail & Related papers (2025-04-22T02:31:23Z)
Transferable Learning of Reaction Pathways from Geometric Priors [1.3170830344441016]
MEPIN is a scalable machine-learning method for efficiently predicting MEPs from reactant and product geometries. Our method enables the exploration of large chemical reaction spaces with efficient, data-driven predictions of reaction pathways.
arXiv Detail & Related papers (2025-04-21T18:20:53Z)
Chemical knowledge-informed framework for privacy-aware retrosynthesis learning [60.93245342663455]
Current machine learning-based retrosynthesis gathers reaction data from multiple sources into one single edge to train prediction models. This paradigm poses considerable privacy risks as it necessitates broad data availability across organizational boundaries. In the present study, we introduce the chemical knowledge-informed framework (CKIF), a privacy-preserving approach for learning retrosynthesis models.
arXiv Detail & Related papers (2025-02-26T13:13:24Z)
Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
This paper introduces modelname, a novel chemical reaction representation learning model tailored for a variety of organic-reaction-related tasks. By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction, thereby enhancing the comprehension of the reaction mechanism. We have designed an adapter structure to incorporate reaction conditions into the chemical reaction representation, allowing the model to handle diverse reaction conditions and adapt to various datasets and downstream tasks, e.g., reaction performance prediction.
arXiv Detail & Related papers (2024-11-26T17:41:44Z)
log-RRIM: Yield Prediction via Local-to-global Reaction Representation Learning and Interaction Modeling [6.310759215182946]
log-RRIM is an innovative graph transformer-based framework designed for predicting chemical reaction yields. Our approach implements a unique local-to-global reaction representation learning strategy. Its advanced modeling of reactant-reagent interactions and sensitivity to small molecular fragments make it a valuable tool for reaction planning and optimization in chemical synthesis.
arXiv Detail & Related papers (2024-10-20T18:35:56Z)
Beyond Major Product Prediction: Reproducing Reaction Mechanisms with Machine Learning Models Trained on a Large-Scale Mechanistic Dataset [10.968137261042715]
Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and in principle, reaction discovery. While several machine learning models have sought to address the task of predicting reaction products, their extension to predicting reaction mechanisms has been impeded by the lack of a corresponding mechanistic dataset. We construct such a dataset by imputing intermediates between experimentally reported reactants and products using expert reaction templates and train several machine learning models on the resulting dataset of 5,184,184 elementary steps.
arXiv Detail & Related papers (2024-03-07T15:26:23Z)
Contextual Molecule Representation Learning from Chemical Reaction Knowledge [24.501564702095937]
We introduce REMO, a self-supervised learning framework that takes advantage of well-defined atom-combination rules in common chemistry. REMO pre-trains graph/Transformer encoders on 1.7 million known chemical reactions in the literature.
arXiv Detail & Related papers (2024-02-21T12:58:40Z)
Towards out-of-distribution generalizable predictions of chemical kinetics properties [61.15970601264632]
Out-Of-Distribution (OOD) kinetic property prediction is required to be generalizable. In this paper, we categorize the OOD kinetic property prediction into three levels (structure, condition, and mechanism) We create comprehensive datasets to benchmark the state-of-the-art ML approaches for reaction prediction in the OOD setting and the state-of-the-art graph OOD methods in kinetics property prediction problems.
arXiv Detail & Related papers (2023-10-04T20:36:41Z)
Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems. DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
arXiv Detail & Related papers (2023-06-08T17:12:08Z)
Multi-level Protocol for Mechanistic Reaction Studies Using Semi-local Fitted Potential Energy Surfaces [0.0]
We propose a multi-scale protocol for routine theoretical studies of chemical reaction mechanisms. The key aspect of the method's performance is its multi-scale nature, which not only saves computational effort but also allows extracting meaningful information.
arXiv Detail & Related papers (2023-04-03T12:55:29Z)
Sensing of magnetic field effects in radical-pair reactions using a quantum sensor [50.591267188664666]
Magnetic field effects (MFE) in certain chemical reactions have been well established in the last five decades. We employ elaborate and realistic models of radical-pairs, considering its coupling to the local spin environment and the sensor. For two model systems, we derive signals of MFE detectable even in the weak coupling regime between radical-pair and NV quantum sensor.
arXiv Detail & Related papers (2022-09-28T12:56:15Z)
Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems. MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize this goal and provide estimation procedure for practical application. We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.