Quantum Mechanics and Machine Learning Synergies: Graph Attention Neural
Networks to Predict Chemical Reactivity
- URL: http://arxiv.org/abs/2103.14536v1
- Date: Wed, 24 Mar 2021 19:05:02 GMT
- Title: Quantum Mechanics and Machine Learning Synergies: Graph Attention Neural
Networks to Predict Chemical Reactivity
- Authors: Mohammadamin Tavakoli, Aaron Mood, David Van Vranken, Pierre Baldi
- Abstract summary: There is a lack of scalable quantitative measures of reactivity for functional groups in organic chemistry.
In previous quantum chemistry studies, we have introduced Methyl Cation Affinities (MCA*) and Methyl Anion Affinities (MAA*).
Although MCA* and MAA* offer good estimates of reactivity parameters, their calculation through Density Functional Theory (DFT) simulations is time-consuming.
- Score: 7.799648230758492
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a lack of scalable quantitative measures of reactivity for
functional groups in organic chemistry. Measuring reactivity experimentally is
costly and time-consuming and does not scale to the astronomical size of
chemical space. In previous quantum chemistry studies, we have introduced
Methyl Cation Affinities (MCA*) and Methyl Anion Affinities (MAA*), using a
solvation model, as quantitative measures of reactivity for organic functional
groups over the broadest range. Although MCA* and MAA* offer good estimates of
reactivity parameters, their calculation through Density Functional Theory
(DFT) simulations is time-consuming. To circumvent this problem, we first use
DFT to calculate MCA* and MAA* for more than 2,400 organic molecules thereby
establishing a large dataset of chemical reactivity scores. We then design deep
learning methods to predict the reactivity of molecular structures and train
them using this curated dataset in combination with different representations
of molecular structures. Using ten-fold cross-validation, we show that graph
attention neural networks applied to informative input fingerprints produce the
most accurate estimates of reactivity, achieving over 91% test accuracy for
predicting the MCA* $\pm$ 3.0 or MAA* $\pm$ 3.0, over 50 orders of
magnitude. Finally, we demonstrate the application of these reactivity scores
to two tasks: (1) chemical reaction prediction; (2) combinatorial generation of
reaction mechanisms. The curated dataset of MCA* and MAA* scores is available
through the ChemDB chemoinformatics web portal at www.cdb.ics.uci.edu.
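The evaluation protocol described in the abstract (ten-fold cross-validation, with a prediction counted as correct when it falls within plus or minus 3.0 of the DFT reference value) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic scores, and the mean-value baseline that stands in for the graph attention network are all assumptions made for the example.

```python
import random
from statistics import mean


def tolerance_accuracy(y_true, y_pred, tol=3.0):
    """Fraction of predictions within +/- tol of the reference value."""
    hits = sum(abs(t - p) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and split it into k nearly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(y, k=10, tol=3.0):
    """Score a placeholder mean-value predictor with k-fold cross-validation."""
    scores = []
    for fold in k_fold_indices(len(y), k):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        baseline = mean(train)  # hypothetical stand-in for a trained model
        test = [y[i] for i in fold]
        scores.append(tolerance_accuracy(test, [baseline] * len(test), tol))
    return mean(scores)


if __name__ == "__main__":
    # Synthetic reactivity scores; real MCA*/MAA* values would come from DFT.
    rng = random.Random(1)
    synthetic_scores = [rng.uniform(-25.0, 25.0) for _ in range(200)]
    print(f"baseline accuracy within 3.0: {cross_validate(synthetic_scores):.2f}")
```

A trained model would replace the `baseline` line; the splitting and scoring logic would stay the same.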
Related papers
- $\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials [35.949502493236146]
This work presents a new dataset and benchmark called $\nabla^2$DFT that is based on $\nabla$DFT.
It contains twice as many molecular structures, three times more conformations, new data types and tasks, and state-of-the-art models.
$\nabla^2$DFT is the first dataset that contains relaxation trajectories for a substantial number of drug-like molecules.
arXiv Detail & Related papers (2024-06-20T14:14:59Z)
- QH9: A Quantum Hamiltonian Prediction Benchmark for QM9 Molecules [69.25826391912368]
We generate a new Quantum Hamiltonian dataset, named QH9, to provide precise Hamiltonian matrices for 999 or 2998 molecular dynamics trajectories.
We show that current machine learning models have the capacity to predict Hamiltonian matrices for arbitrary molecules.
arXiv Detail & Related papers (2023-06-15T23:39:07Z)
- Efficient Chemical Space Exploration Using Active Learning Based on
Marginalized Graph Kernel: an Application for Predicting the Thermodynamic
Properties of Alkanes with Molecular Simulation [10.339394156446982]
We use molecular dynamics simulation to generate data and a graph neural network (GNN) to make predictions.
Specifically, we target 251,728 alkane molecules consisting of 4 to 19 carbon atoms and their liquid physical properties.
Validation shows that only 313 molecules were sufficient to train an accurate GNN model with $\rm R^2 > 0.99$ for computational test sets and $\rm R^2 > 0.94$ for experimental test sets.
arXiv Detail & Related papers (2022-09-01T14:59:13Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for
Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- Improving Molecular Representation Learning with Metric
Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
- Chemical-Reaction-Aware Molecule Representation Learning [88.79052749877334]
We propose using chemical reactions to assist learning molecule representation.
Our approach is proven effective at 1) keeping the embedding space well-organized and 2) improving the generalization ability of molecule embeddings.
Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks.
arXiv Detail & Related papers (2021-09-21T00:08:43Z)
- A Universal Framework for Featurization of Atomistic Systems [0.0]
Reactive force fields based on physics or machine learning can be used to bridge the gap in time and length scales.
We introduce the Gaussian multi-pole (GMP) featurization scheme that utilizes physically-relevant multi-pole expansions of the electron density around atoms.
We demonstrate that GMP-based models can achieve chemical accuracy for the QM9 dataset, and their accuracy remains reasonable even when extrapolating to new elements.
arXiv Detail & Related papers (2021-02-04T03:11:00Z)
- Unassisted Noise Reduction of Chemical Reaction Data Sets [59.127921057012564]
We propose a machine learning-based, unassisted approach to remove chemically wrong entries from data sets.
Our results show an improved prediction quality for models trained on the cleaned and balanced data sets.
arXiv Detail & Related papers (2021-02-02T09:34:34Z)
- End-to-End Differentiable Molecular Mechanics Force Field Construction [0.5269923665485903]
We propose an alternative approach that uses graph neural networks to perceive chemical environments.
The entire process is modular and end-to-end differentiable with respect to model parameters.
We show that this approach is not only sufficient to reproduce legacy atom types, but that it can also learn to accurately reproduce and extend existing molecular mechanics force fields.
arXiv Detail & Related papers (2020-10-02T20:59:46Z)
- ASGN: An Active Semi-supervised Graph Neural Network for Molecular
Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules.
In the teacher model, we propose a novel semi-supervised learning method to learn general representation that jointly exploits information from molecular structure and molecular distribution.
Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout the framework's learning process.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)