Improving Counterfactual Truthfulness for Molecular Property Prediction through Uncertainty Quantification
- URL: http://arxiv.org/abs/2504.02606v1
- Date: Thu, 03 Apr 2025 14:07:30 GMT
- Title: Improving Counterfactual Truthfulness for Molecular Property Prediction through Uncertainty Quantification
- Authors: Jonas Teufel, Annika Leinweber, Pascal Friederich
- Abstract summary: XAI interventions aim to improve interpretability for complex black-box models. In molecular property prediction, counterfactual explanations offer a way to understand predictive behavior. We propose the integration of uncertainty estimation techniques to filter counterfactual candidates with high predicted uncertainty.
- Score: 0.6144680854063939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable AI (xAI) interventions aim to improve interpretability for complex black-box models, not only to improve user trust but also as a means to extract scientific insights from high-performing predictive systems. In molecular property prediction, counterfactual explanations offer a way to understand predictive behavior by highlighting which minimal perturbations in the input molecular structure cause the greatest deviation in the predicted property. However, such explanations only allow for meaningful scientific insights if they reflect the distribution of the true underlying property -- a feature we define as counterfactual truthfulness. To increase this truthfulness, we propose the integration of uncertainty estimation techniques to filter counterfactual candidates with high predicted uncertainty. Through computational experiments with synthetic and real-world datasets, we demonstrate that traditional uncertainty estimation methods, such as ensembles and mean-variance estimation, can already substantially reduce the average prediction error and increase counterfactual truthfulness, especially for out-of-distribution settings. Our results highlight the importance and potential impact of incorporating uncertainty estimation into explainability methods, especially considering the relatively high effectiveness of low-effort interventions like model ensembles.
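The core intervention is simple to sketch: generate counterfactual candidates, score each with an ensemble, and discard candidates on which the ensemble disagrees. Below is a minimal illustration of that filtering step; the MLP ensemble, random stand-in features, and disagreement threshold are illustrative assumptions, not the authors' exact setup.

```python
"""Ensemble-based uncertainty filtering of counterfactual candidates (sketch)."""
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 32))           # stand-in for molecular fingerprints
y_train = X_train[:, 0] - 0.5 * X_train[:, 1]  # stand-in for a molecular property

# Train K models that differ only in their random seed (a low-effort ensemble).
ensemble = [
    MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=k).fit(X_train, y_train)
    for k in range(5)
]

def filter_counterfactuals(candidates, ensemble, max_std=0.5):
    """Keep only candidates whose ensemble disagreement stays below max_std."""
    preds = np.stack([m.predict(candidates) for m in ensemble])  # shape (K, n)
    mean, std = preds.mean(axis=0), preds.std(axis=0)
    keep = std < max_std
    return candidates[keep], mean[keep], std[keep]

candidates = rng.normal(size=(50, 32))         # stand-in for counterfactual candidates
kept, mean, std = filter_counterfactuals(candidates, ensemble)
print(f"kept {len(kept)}/{len(candidates)} candidates")
```

Mean-variance estimation, the other low-effort method named in the abstract, would replace the ensemble standard deviation with a variance predicted by the model itself; the filtering step is unchanged.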
Related papers
- Ensured: Explanations for Decreasing the Epistemic Uncertainty in Predictions [1.2289361708127877]
Epistemic uncertainty adds a crucial dimension to explanation quality.
We introduce new types of explanations that specifically target this uncertainty.
We introduce a new metric, ensured ranking, designed to help users identify the most reliable explanations.
arXiv Detail & Related papers (2024-10-07T20:21:51Z)
- CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces the Cognitive Diffusion Probabilistic Models (CogDPM).
CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models, and weights the guidance with precision weights estimated from the inherent properties of diffusion models.
We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z)
- Investigating the Impact of Model Instability on Explanations and Uncertainty [43.254616360807496]
We simulate uncertainty in text input by introducing noise at inference time.
We find that high uncertainty doesn't necessarily imply low explanation plausibility.
This suggests that noise-augmented models may be better at identifying salient tokens when uncertain.
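A hedged sketch of the probing idea: repeat inference under small Gaussian input noise, read off predictive uncertainty as the spread of outputs, and measure explanation stability as the overlap of top-salient features with those of the clean input. The toy model and noise scale are illustrative assumptions, not the paper's NLP setup.

```python
"""Noise-at-inference probe: relate predictive uncertainty to saliency stability."""
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
x = torch.randn(16)  # stand-in for an input embedding

def saliency(inp):
    """Gradient-magnitude saliency of the scalar output w.r.t. the input."""
    inp = inp.clone().requires_grad_(True)
    model(inp).backward()
    return inp.grad.abs()

preds, ranks = [], []
for _ in range(20):                          # repeat inference under input noise
    noisy = x + 0.1 * torch.randn_like(x)
    with torch.no_grad():
        preds.append(model(noisy).item())
    ranks.append(saliency(noisy).argsort())  # ascending; last entries are most salient

uncertainty = torch.tensor(preds).std().item()          # prediction spread under noise
clean_top = set(saliency(x).topk(3).indices.tolist())   # top-3 features of the clean input
overlap = sum(len(clean_top & set(r[-3:].tolist())) for r in ranks) / (3 * len(ranks))
print(f"predictive std under noise: {uncertainty:.3f}, top-3 saliency overlap: {overlap:.2f}")
```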
arXiv Detail & Related papers (2024-02-20T13:41:21Z)
- Model-agnostic variable importance for predictive uncertainty: an entropy-based approach [1.912429179274357]
We show how existing methods in explainability can be extended to uncertainty-aware models.
We demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance.
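One way to realize this extension, sketched below, is permutation importance scored against predictive entropy instead of accuracy: a feature matters for uncertainty if permuting it changes the mean entropy of the predictive distribution. The classifier and data here are illustrative assumptions, not the paper's exact method.

```python
"""Permutation-style variable importance for uncertainty (entropy-based sketch)."""
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def mean_entropy(model, X):
    """Average Shannon entropy of the model's predictive distribution."""
    p = np.clip(model.predict_proba(X), 1e-12, 1.0)
    return float(np.mean(-np.sum(p * np.log(p), axis=1)))

base = mean_entropy(clf, X)
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature's association with the rest
    importance.append(mean_entropy(clf, Xp) - base)

print("uncertainty importance per feature:", np.round(importance, 3))
```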
arXiv Detail & Related papers (2023-10-19T15:51:23Z)
- Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications [0.32302664881848275]
Evidential deep learning is a technique that extends parametric deep learning to higher-order distributions.
This study compares the uncertainty estimates derived from evidential neural networks to those obtained from ensembles.
We show that evidential deep learning models attain predictive accuracy rivaling standard methods while robustly quantifying both sources of uncertainty.
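For the evidential side of that comparison, a minimal sketch assuming the normal-inverse-gamma parameterization of deep evidential regression (Amini et al., 2020), from which aleatoric and epistemic uncertainty fall out in closed form; the parameter values below are illustrative.

```python
"""Uncertainty decomposition from a normal-inverse-gamma evidential head (sketch)."""
import numpy as np

def nig_uncertainties(gamma, nu, alpha, beta):
    """gamma: predicted mean; nu, alpha, beta: evidential parameters (alpha > 1)."""
    aleatoric = beta / (alpha - 1.0)         # E[sigma^2]: irreducible data noise
    epistemic = beta / (nu * (alpha - 1.0))  # Var[mu]: model (knowledge) uncertainty
    return gamma, aleatoric, epistemic

mean, al, ep = nig_uncertainties(gamma=2.1, nu=1.5, alpha=3.0, beta=0.8)
print(f"prediction={mean}, aleatoric={al:.3f}, epistemic={ep:.3f}")
```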
arXiv Detail & Related papers (2023-09-22T23:04:51Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical claims.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators in Neural Networks [5.439020425819001]
Post-hoc interpretability methods attempt to make the inner workings of deep neural networks more interpretable.
One of the most popular evaluation frameworks is to perturb features deemed important by an interpretability method.
We propose feature perturbation augmentation (FPA), which creates perturbed images and adds them to the data during model training.
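A minimal sketch of FPA in spirit: extend each training batch with copies whose random patches are masked, so the perturbed inputs used by later perturbation-based faithfulness tests are no longer out-of-distribution. Patch size, mask value, and one perturbation per image are illustrative assumptions.

```python
"""Feature perturbation augmentation (sketch): train on originals + masked copies."""
import torch

def fpa_batch(images, patch=4, mask_value=0.0):
    """Return the batch concatenated with randomly patch-masked copies."""
    perturbed = images.clone()
    n, _, h, w = images.shape
    ys = torch.randint(0, h - patch + 1, (n,))
    xs = torch.randint(0, w - patch + 1, (n,))
    for i in range(n):
        perturbed[i, :, ys[i]:ys[i] + patch, xs[i]:xs[i] + patch] = mask_value
    return torch.cat([images, perturbed])  # labels would be duplicated accordingly

batch = torch.rand(8, 3, 32, 32)
print(fpa_batch(batch).shape)               # torch.Size([16, 3, 32, 32])
```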
arXiv Detail & Related papers (2023-03-02T19:05:46Z)
- Prediction-Powered Inference [68.97619568620709]
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning.
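The mean-estimation case makes the framework concrete: score a large unlabeled sample with the model, then debias by the model's average error on a small labeled sample. A sketch with simulated data follows; the predictor and data distributions are illustrative assumptions.

```python
"""Prediction-powered confidence interval for a mean (sketch of the mean case)."""
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
f = lambda x: x + 0.3                          # a deliberately biased "ML predictor"
x_lab = rng.normal(1.0, 1.0, 100)              # small labeled sample
y_lab = x_lab + rng.normal(0.0, 0.2, 100)
x_unlab = rng.normal(1.0, 1.0, 10_000)         # large unlabeled sample

def ppi_mean_ci(yhat_unlab, yhat_lab, y_lab, alpha=0.05):
    rectifier = yhat_lab - y_lab               # model error measured on labeled data
    theta = yhat_unlab.mean() - rectifier.mean()  # debiased point estimate
    se = np.sqrt(yhat_unlab.var(ddof=1) / len(yhat_unlab)
                 + rectifier.var(ddof=1) / len(rectifier))
    z = norm.ppf(1 - alpha / 2)
    return theta - z * se, theta + z * se

print(ppi_mean_ci(f(x_unlab), f(x_lab), y_lab))  # approximate, CLT-based 95% CI for E[Y]
```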
arXiv Detail & Related papers (2023-01-23T18:59:28Z)
- Assigning Confidence to Molecular Property Prediction [1.015785232738621]
Machine learning has emerged as a powerful strategy to learn from existing datasets and perform predictions on unseen molecules.
We discuss popular strategies for predicting molecular properties relevant to drug design, their corresponding uncertainty sources and methods to quantify uncertainty and confidence.
arXiv Detail & Related papers (2021-02-23T01:03:48Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is the part of out-of-sample prediction error that is due to the learner's lack of knowledge.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
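A miniature version of that recipe, under the simplifying assumption of a known, constant aleatoric variance: fit the main model, fit a second model to its out-of-sample squared errors, and subtract the aleatoric term. The models and data are illustrative.

```python
"""DEUP-style sketch: epistemic = predicted generalization error - aleatoric estimate."""
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (400, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 400)           # known noise std = 0.1

main = GradientBoostingRegressor(random_state=0).fit(X[:200], y[:200])
residual2 = (y[200:] - main.predict(X[200:])) ** 2      # out-of-sample squared errors
error_model = GradientBoostingRegressor(random_state=0).fit(X[200:], residual2)

aleatoric = 0.1 ** 2                                    # assumed-known noise variance
X_query = np.linspace(-3, 3, 5).reshape(-1, 1)
epistemic = np.clip(error_model.predict(X_query) - aleatoric, 0, None)
print(np.round(epistemic, 4))
```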
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)