Related papers: A KAN-based Interpretable Framework for Process-Informed Prediction of Global Warming Potential

URL: http://arxiv.org/abs/2411.00426v1
Date: Fri, 01 Nov 2024 07:48:05 GMT
Title: A KAN-based Interpretable Framework for Process-Informed Prediction of Global Warming Potential
Authors: Jaewook Lee, Xinyang Sun, Ethan Errington, Miao Guo
Abstract summary: We present an integrative Global Warming Potential (GWP) prediction model that combines molecular descriptors with process information. Using a deep neural network (DNN) model, we achieved an R-squared of 86% on test data with Mordred descriptors, process location, and description information. Our results suggest that integrating both molecular and process-level information in GWP prediction models yields substantial gains in accuracy and interpretability.
Score: 2.8248953889934953
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Accurate prediction of Global Warming Potential (GWP) is essential for assessing the environmental impact of chemical processes and materials. Traditional GWP prediction models rely predominantly on molecular structure, overlooking critical process-related information. In this study, we present an integrative GWP prediction model that combines molecular descriptors (MACCS keys and Mordred descriptors) with process information (process title, description, and location) to improve predictive accuracy and interpretability. Using a deep neural network (DNN) model, we achieved an R-squared of 86% on test data with Mordred descriptors, process location, and description information, representing a 25% improvement over the previous benchmark of 61%; XAI analysis further highlighted the significant role of process title embeddings in enhancing model predictions. To enhance interpretability, we employed a Kolmogorov-Arnold Network (KAN) to derive a symbolic formula for GWP prediction, capturing key molecular and process features and providing a transparent, interpretable alternative to black-box models, enabling users to gain insights into the molecular and process factors influencing GWP. Error analysis showed that the model performs reliably in densely populated data ranges, with increased uncertainty for higher GWP values. This analysis allows users to manage prediction uncertainty effectively, supporting data-driven decision-making in chemical and process design. Our results suggest that integrating both molecular and process-level information in GWP prediction models yields substantial gains in accuracy and interpretability, offering a valuable tool for sustainability assessments. Future work may extend this approach to additional environmental impact categories and refine the model to further enhance its predictive reliability.
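As a minimal sketch of the metric the abstract reports, the snippet below computes R-squared from scratch and checks the arithmetic behind the quoted improvement. The toy `y_true` and `y_pred` values are hypothetical and are not data from the paper; only the 86% and 61% figures come from the abstract. It suggests that the stated 25% improvement corresponds to the absolute difference between the two R-squared values (25 percentage points), which is a different quantity from the relative gain over the benchmark.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values for illustration only (not from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r_squared(y_true, y_pred)

# Figures quoted in the abstract: test R-squared and previous benchmark.
reported_r2, benchmark_r2 = 0.86, 0.61
# Absolute gain, in percentage points.
gain_points = (reported_r2 - benchmark_r2) * 100
# Relative gain over the benchmark, in percent.
relative_gain = (reported_r2 - benchmark_r2) / benchmark_r2 * 100

print(f"toy R-squared: {score:.3f}")
print(f"absolute gain: {gain_points:.1f} points, relative gain: {relative_gain:.1f}%")
```

Reading the improvement as percentage points keeps it consistent with the two R-squared values given in the abstract.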

Related papers

Deep Learning for GWP Prediction: A Framework Using PCA, Quantile Transformation, and Ensemble Modeling [0.0]
This study estimates the 100-year global warming potential (GWP 100) of single-component refrigerants using a fully connected neural network. The RDKit-based model achieved the best performance, with a Root Mean Square Error (RMSE) of 481.9 and an R2 score of 0.918. Factor analysis identified vital molecular features, including molecular weight, lipophilicity, and functional groups, such as nitriles and allylic oxides, as significant contributors to GWP values.
arXiv Detail & Related papers (2024-11-28T13:16:12Z)
Hybrid Gaussian Process Regression with Temporal Feature Extraction for Partially Interpretable Remaining Useful Life Interval Prediction in Aeroengine Prognostics [0.615155791092452]
This paper introduces a modified Gaussian Process Regression (GPR) model for Remaining Useful Life (RUL) interval prediction. The modified GPR predicts confidence intervals by learning from historical data and addresses uncertainty modeling in a more structured way. It effectively captures intricate time-series patterns and dynamic behaviors inherent in modern manufacturing systems.
arXiv Detail & Related papers (2024-11-19T03:00:02Z)
Leveraging data-driven weather models for improving numerical weather prediction skill through large-scale spectral nudging [1.747339718564314]
This study illustrates the relative strengths and weaknesses of physics-based and AI-based approaches to weather prediction. A hybrid NWP-AI system is proposed, wherein GEM-predicted large-scale state variables are spectrally nudged toward GraphCast predictions. Results indicate that this hybrid approach is capable of leveraging the strengths of GraphCast to enhance the prediction skill of the GEM model.
arXiv Detail & Related papers (2024-07-08T16:39:25Z)
F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm. By considering domain similarities through task-specific metadata, our model improves generalization, with the excess risk decreasing as the number of training tasks increases. Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces Cognitive Diffusion Probabilistic Models (CogDPM). CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models and weights the guidance with precision weights estimated from the inherent properties of diffusion models. We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z)
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion. It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space. It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search [2.711812013460678]
We introduce AutoGNNUQ, an automated uncertainty quantification (UQ) approach for molecular property prediction. Our approach employs variance decomposition to separate data (aleatoric) and model (epistemic) uncertainties, providing valuable insights for reducing them. AutoGNNUQ has broad applicability in domains such as drug discovery and materials science, where accurate uncertainty quantification is crucial for decision-making.
arXiv Detail & Related papers (2023-07-19T20:03:42Z)
Learning inducing points and uncertainty on molecular data by scalable variational Gaussian processes [0.0]
We show that variational learning of the inducing points in a molecular descriptor space improves the prediction of energies and atomic forces on two molecular dynamics datasets. We extend our study to a large molecular crystal system, showing that variational GP models perform well for predicting atomic forces by efficiently learning a sparse representation of the dataset.
arXiv Detail & Related papers (2022-07-16T10:41:41Z)
Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction [59.221668173521884]
We propose a novel framework that improves cascade size prediction by enhancing user preference modeling. Our end-to-end method makes the user activation process in information diffusion more adaptive and accurate.
arXiv Detail & Related papers (2022-04-18T09:25:06Z)
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task. The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature. We formulate a novel problem and propose a neural framework, Back2Future, that aims to refine a given model's predictions in real time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in miscalibrated predictions. Recent work on deep neural models for uncertainty-aware time-series forecasting also has several limitations. We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.