Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
- URL: http://arxiv.org/abs/2511.06356v2
- Date: Sun, 16 Nov 2025 04:36:05 GMT
- Title: Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
- Authors: Runhan Shi, Letian Chen, Gufeng Yu, Yang Yang
- Abstract summary: We propose ReaDISH, a novel machine learning model for chemical reaction prediction. It learns permutation-invariant representations while incorporating interaction-aware features. It shows enhanced robustness with an average improvement of 8.76% on R$^2$ under permutations.
- Score: 4.597922051722059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chemical reaction prediction remains a fundamental challenge in organic chemistry, where existing machine learning models face two critical limitations: sensitivity to input permutations (molecule/atom orderings) and inadequate modeling of substructural interactions governing reactivity. These shortcomings lead to inconsistent predictions and poor generalization to real-world scenarios. To address these challenges, we propose ReaDISH, a novel reaction prediction model that learns permutation-invariant representations while incorporating interaction-aware features. It introduces two innovations: (1) symmetric difference shingle encoding, which extends the differential reaction fingerprint (DRFP) by representing shingles as continuous high-dimensional embeddings, capturing structural changes while eliminating order sensitivity; and (2) geometry-structure interaction attention, a mechanism that models intra- and inter-molecular interactions at the shingle level. Extensive experiments demonstrate that ReaDISH improves reaction prediction performance across diverse benchmarks. It shows enhanced robustness with an average improvement of 8.76% on R$^2$ under permutation perturbations.
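The set-level idea behind the symmetric difference encoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the ReaDISH implementation: ReaDISH uses continuous shingle embeddings and an attention mechanism, while the toy below only shows why a symmetric difference of shingle sets is insensitive to molecule order. The function names are hypothetical, and character n-grams of SMILES strings stand in for real substructure shingles (an assumption made to keep the example dependency-free; character n-grams are not invariant to atom ordering within a SMILES string).

```python
def shingles(smiles, n=3):
    """Toy shingles: character n-grams of one SMILES string.

    Assumption: a stand-in for the circular substructure shingles
    that a chemistry toolkit would extract in practice.
    """
    if len(smiles) < n:
        return {smiles}
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def side_shingles(molecules, n=3):
    """Union of shingles over all molecules on one side of a reaction."""
    out = set()
    for mol in molecules:
        out |= shingles(mol, n)
    return out


def symmetric_difference_shingles(reactants, products, n=3):
    """Shingles present on exactly one side of the reaction."""
    return side_shingles(reactants, n) ^ side_shingles(products, n)


# Esterification-like toy example: acetic acid + ethanol -> ethyl acetate + water.
reactants = ["CC(=O)O", "OCC"]
products = ["CC(=O)OCC", "O"]

diff = symmetric_difference_shingles(reactants, products)
print(sorted(diff))  # → [')OC', 'O']

# Reordering the molecules on either side leaves the result unchanged.
assert symmetric_difference_shingles(reactants[::-1], products) == diff
assert symmetric_difference_shingles(reactants, products[::-1]) == diff
```

Because each side is reduced to a set before the difference is taken, the result depends only on which shingles appear, not on the order in which molecules are listed. Shingles shared by both sides cancel, leaving only those tied to the structural change.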
Related papers
- A Pre-trained Reaction Embedding Descriptor Capturing Bond Transformation Patterns [4.8838428804671326]
This study introduces RXNEmb, a novel reaction-level descriptor derived from RXNGraphormer. We demonstrate its utility by data-driven re-clustering of the USPTO-50k dataset. RXNEmb serves as a powerful, interpretable tool for reaction fingerprinting and analysis.
arXiv Detail & Related papers (2026-01-07T08:24:08Z)
- Electron flow matching for generative reaction mechanism prediction obeying conservation laws [8.136277960071032]
This work recasts the problem of reaction prediction as a problem of electron redistribution using the modern deep generative framework of flow matching. Our model, FlowER, overcomes limitations by enforcing exact mass conservation, thereby resolving hallucinatory failure modes. FlowER additionally enables estimation of thermodynamic or kinetic feasibility and manifests a degree of chemical intuition in reaction prediction tasks.
arXiv Detail & Related papers (2025-02-18T16:01:17Z)
- Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
RAlign is a novel chemical reaction representation learning model for various organic reaction-related tasks. By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction. We introduce a reaction-center-aware attention mechanism that enables the model to concentrate on key functional groups.
arXiv Detail & Related papers (2024-11-26T17:41:44Z)
- ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots [4.362338454684645]
We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification.
Our model adeptly identifies key atom(s) even from out-of-distribution classes.
This generalizability allows new reaction types to be included in a modular fashion, which will be of value to experts seeking to understand the reactivity of new molecules.
arXiv Detail & Related papers (2024-07-14T05:53:18Z)
- Neural Interaction Energy for Multi-Agent Trajectory Prediction [55.098754835213995]
We introduce a framework called Multi-Agent Trajectory prediction via neural interaction Energy (MATE).
MATE assesses the interactive motion of agents by employing neural interaction energy.
To bolster temporal stability, we introduce two constraints: inter-agent interaction constraint and intra-agent motion constraint.
arXiv Detail & Related papers (2024-04-25T12:47:47Z)
- Beyond Major Product Prediction: Reproducing Reaction Mechanisms with Machine Learning Models Trained on a Large-Scale Mechanistic Dataset [10.968137261042715]
Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and in principle, reaction discovery.
While several machine learning models have sought to address the task of predicting reaction products, their extension to predicting reaction mechanisms has been impeded by the lack of a corresponding mechanistic dataset.
We construct such a dataset by imputing intermediates between experimentally reported reactants and products using expert reaction templates and train several machine learning models on the resulting dataset of 5,184,184 elementary steps.
arXiv Detail & Related papers (2024-03-07T15:26:23Z)
- 3DReact: Geometric deep learning for chemical reactions [35.38031930589095]
We introduce 3DReact, a deep geometric learning model to predict reaction properties from three-dimensional structures of reactants and products.
We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information.
It performs systematically well across different datasets, atom-mapping regimes, as well as both geometries and extrapolation tasks.
arXiv Detail & Related papers (2023-12-13T17:26:54Z)
- Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts [42.9784548283531]
Generative models like Transformer and VAE have typically been employed to predict the reaction product.
We propose organizing the mapping space between reactants and electron redistribution patterns in a divide-and-conquer manner.
arXiv Detail & Related papers (2023-10-07T03:18:26Z)
- Doubly Stochastic Graph-based Non-autoregressive Reaction Prediction [59.41636061300571]
We propose a new framework that combines two doubly stochastic self-attention mappings to obtain electron redistribution predictions.
We show that our approach consistently improves the predictive performance of non-autoregressive models.
arXiv Detail & Related papers (2023-06-05T14:15:39Z)
- Differentiable Programming of Chemical Reaction Networks [63.948465205530916]
Chemical reaction networks are one of the most fundamental computational substrates used by nature.
We study well-mixed single-chamber systems, as well as systems with multiple chambers separated by membranes.
We demonstrate that differentiable optimisation, combined with proper regularisation, can discover non-trivial sparse reaction networks.
arXiv Detail & Related papers (2023-02-06T11:41:14Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods that formalize this goal and provide estimation procedures for practical applications.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)
- Energy-based View of Retrosynthesis [70.66156081030766]
We propose a framework that unifies sequence- and graph-based methods as energy-based models.
We present a novel dual variant within the framework that performs consistent training over Bayesian forward- and backward-prediction.
This model improves state-of-the-art performance by 9.6% for template-free approaches where the reaction type is unknown.
arXiv Detail & Related papers (2020-07-14T18:51:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.