Explainable Data-driven Modeling of Adsorption Energy in Heterogeneous Catalysis
- URL: http://arxiv.org/abs/2405.20397v1
- Date: Thu, 30 May 2024 18:06:14 GMT
- Title: Explainable Data-driven Modeling of Adsorption Energy in Heterogeneous Catalysis
- Authors: Tirtha Vinchurkar, Janghoon Ock, Amir Barati Farimani
- Abstract summary: This study aims to bridge the gap between physics-based studies and data-driven methodologies.
We employ two XAI techniques: Post-hoc XAI analysis and Symbolic Regression.
Our work establishes a robust framework that integrates machine learning techniques with XAI.
- Score: 6.349503549199403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing popularity of machine learning (ML) in catalysis has spurred interest in leveraging these techniques to enhance catalyst design. Our study aims to bridge the gap between physics-based studies and data-driven methodologies by integrating ML techniques with eXplainable AI (XAI). Specifically, we employ two XAI techniques: Post-hoc XAI analysis and Symbolic Regression. These techniques help us unravel the correlation between adsorption energy and the properties of the adsorbate-catalyst system. Leveraging a large dataset such as the Open Catalyst Dataset (OC20), we employ a combination of shallow ML techniques and XAI methodologies. Our investigation involves utilizing multiple shallow machine learning techniques to predict adsorption energy, followed by post-hoc analysis for feature importance, inter-feature correlations, and the influence of various feature values on the prediction of adsorption energy. The post-hoc analysis reveals that adsorbate properties exert a greater influence than catalyst properties in our dataset. The top five features based on higher Shapley values are adsorbate electronegativity, the number of adsorbate atoms, catalyst electronegativity, effective coordination number, and the sum of atomic numbers of the adsorbate molecule. There is a positive correlation between catalyst and adsorbate electronegativity with the prediction of adsorption energy. Additionally, symbolic regression yields results consistent with SHAP analysis. It deduces a mathematical relationship indicating that the square of the catalyst electronegativity is directly proportional to the adsorption energy. These consistent correlations resemble those derived from physics-based equations in previous research. Our work establishes a robust framework that integrates ML techniques with XAI, leveraging large datasets like OC20 to enhance catalyst design through model explainability.
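The post-hoc analysis described in the abstract ranks features by Shapley values. As a minimal sketch of what such an attribution computes (not the authors' pipeline, which applies SHAP to models trained on OC20), the Python below enumerates feature coalitions exactly for a tiny hand-written model. The model `toy_model`, its coefficients, the three feature names, and the `BACKGROUND` and `SAMPLE` values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, loosely echoing the abstract's descriptors.
FEATURES = [
    "adsorbate_electronegativity",
    "catalyst_electronegativity",
    "n_adsorbate_atoms",
]


def toy_model(x):
    """Stand-in for a trained regressor; the coefficients are made up."""
    return 1.5 * x[0] + 0.5 * x[1] ** 2 - 0.2 * x[2]


def coalition_value(model, x, background, subset):
    """Mean prediction when features in `subset` take their values from x
    and all remaining features are drawn from the background rows."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (small n only)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical background data and the sample to be explained.
BACKGROUND = [[2.5, 1.8, 2.0], [3.0, 2.2, 3.0], [3.5, 1.6, 1.0], [2.0, 2.0, 4.0]]
SAMPLE = [3.4, 2.3, 2.0]

if __name__ == "__main__":
    phi = shapley_values(toy_model, SAMPLE, BACKGROUND)
    # Rank features by the magnitude of their contribution.
    for name, value in sorted(zip(FEATURES, phi), key=lambda p: -abs(p[1])):
        print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for `SAMPLE` and the mean prediction over `BACKGROUND`, which is the property that makes Shapley values usable for ranking features. Exact enumeration scales exponentially with the number of features, so real analyses rely on approximations such as those in the SHAP library.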
Related papers
- Adsorb-Agent: Autonomous Identification of Stable Adsorption Configurations via Large Language Model Agent [5.812284760539713]
Adsorb-Agent is a Large Language Model (LLM) agent designed to efficiently derive system-specific stable adsorbate-catalyst configurations.
We demonstrate its performance using two example systems, NNH-CuPd3 (111) and NNH-Mo3Pd (111), for the Nitrogen Reduction Reaction (NRR), a sustainable alternative to the Haber-Bosch process.
arXiv Detail & Related papers (2024-10-22T03:19:16Z) - A Machine Learning and Explainable AI Framework Tailored for Unbalanced Experimental Catalyst Discovery [10.92613600218535]
We introduce a robust machine learning and explainable AI (XAI) framework to accurately classify the catalytic yield of various compositions.
This framework combines a series of ML practices designed to handle the scarcity and imbalance of catalyst data.
We believe that such insights can assist chemists in the development and identification of novel catalysts with superior performance.
arXiv Detail & Related papers (2024-07-10T13:09:53Z) - On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions [98.70797778496366]
We investigate whether it is possible to predict a system's relaxed energy in the OC20 dataset while ignoring the relative position of the adsorbate.
We find that while removing binding site information impairs accuracy as expected, modified models are able to predict relaxed energies with remarkably decent MAE.
arXiv Detail & Related papers (2023-10-10T14:57:04Z) - Towards out-of-distribution generalizable predictions of chemical kinetics properties [61.15970601264632]
Out-Of-Distribution (OOD) kinetic property prediction is required to be generalizable.
In this paper, we categorize OOD kinetic property prediction into three levels (structure, condition, and mechanism).
We create comprehensive datasets to benchmark the state-of-the-art ML approaches for reaction prediction in the OOD setting and the state-of-the-art graph OOD methods in kinetics property prediction problems.
arXiv Detail & Related papers (2023-10-04T20:36:41Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z) - Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z) - Clarifying Trust of Materials Property Predictions using Neural Networks with Distribution-Specific Uncertainty Quantification [16.36620228609086]
Uncertainty quantification (UQ) methods allow estimation of the trustworthiness of machine learning (ML) model predictions.
Here, we investigate different UQ methods applied to predict energies of molecules on alloys from the Open Catalyst 2020 dataset.
Evidential regression is demonstrated to be a powerful approach for rapidly obtaining competitively trustworthy UQ estimates.
arXiv Detail & Related papers (2023-02-06T07:03:02Z) - BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning (BIGDML) approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 geometries.
arXiv Detail & Related papers (2021-06-08T10:14:57Z) - Quantitative Prediction on the Enantioselectivity of Multiple Chiral Iodoarene Scaffolds Based on Whole Geometry [4.042350304426974]
We introduce a predictive workflow for the extension of the reaction scope of chiral catalysts across name reactions.
Whole geometry descriptors were encoded from DFT optimized 3D structures of multiple catalyst scaffolds.
For the consensus prediction of ensemble models, this global descriptor can be compared with sterimol parameters and noncovalent interaction.
arXiv Detail & Related papers (2021-03-25T20:08:56Z) - MLSolv-A: A Novel Machine Learning-Based Prediction of Solvation Free Energies from Pairwise Atomistic Interactions [0.0]
We introduce a novel ML-based solvation model, which calculates the solvation energy from pairwise atomistic interactions.
Results on 6,493 experimental measurements achieve outstanding performance and transferability for enlarging training data.
arXiv Detail & Related papers (2020-05-13T06:53:54Z)