Related papers: Explaining Machine Learning Predictive Models through Conditional Expectation Methods

URL: http://arxiv.org/abs/2601.07313v1
Date: Mon, 12 Jan 2026 08:34:36 GMT
Title: Explaining Machine Learning Predictive Models through Conditional Expectation Methods
Authors: Silvia Ruiz-España, Laura Arnal, François Signol, Juan-Carlos Perez-Cortes, Joaquim Arlandis
Abstract summary: MUCE is a model-agnostic method for local explainability designed to capture prediction changes from feature interactions. Two quantitative indices, stability and uncertainty, summarize local behavior and assess model reliability. Results show that MUCE effectively captures complex local model behavior, while the stability and uncertainty indices provide meaningful insight into prediction confidence.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The rapid adoption of complex Artificial Intelligence (AI) and Machine Learning (ML) models has led to their characterization as black boxes due to the difficulty of explaining their internal decision-making processes. This lack of transparency hinders users' ability to understand, validate and trust model behavior, particularly in high-risk applications. Although explainable AI (XAI) has made significant progress, there remains a need for versatile and effective techniques to address increasingly complex models. This work introduces Multivariate Conditional Expectation (MUCE), a model-agnostic method for local explainability designed to capture prediction changes from feature interactions. MUCE extends Individual Conditional Expectation (ICE) by exploring a multivariate grid of values in the neighborhood of a given observation at inference time, providing graphical explanations that illustrate the local evolution of model predictions. In addition, two quantitative indices, stability and uncertainty, summarize local behavior and assess model reliability. Uncertainty is further decomposed into uncertainty+ and uncertainty- to capture asymmetric effects that global measures may overlook. The proposed method is validated using XGBoost models trained on three datasets: two synthetic (2D and 3D) to evaluate behavior near decision boundaries, and one transformed real-world dataset to test adaptability to heterogeneous feature types. Results show that MUCE effectively captures complex local model behavior, while the stability and uncertainty indices provide meaningful insight into prediction confidence. MUCE, together with the ICE modification and the proposed indices, offers a practical contribution to local explainability, enabling both graphical and quantitative insights that enhance the interpretability of predictive models and support more trustworthy and transparent decision-making.
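The general idea described in the abstract, evaluating a model on a multivariate grid of values around one observation and summarizing how the prediction changes, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the grid construction or the stability and uncertainty indices, so the stand-in model `predict_proba`, the helpers `local_grid` and `local_summary`, the `radius` and `steps` parameters, and the three summary quantities (`agreement`, `mean_increase`, `mean_decrease`) are all hypothetical names chosen for illustration only.

```python
import itertools
import numpy as np


def predict_proba(X):
    """Hypothetical stand-in model: probability of class 1 from a fixed logistic function."""
    X = np.asarray(X, dtype=float)
    z = 2.0 * X[:, 0] - 1.0 * X[:, 1]
    return 1.0 / (1.0 + np.exp(-z))


def local_grid(x, radius, steps):
    """Cartesian grid around x: every feature j varies over [x_j - radius, x_j + radius]."""
    axes = [np.linspace(v - radius, v + radius, steps) for v in x]
    return np.array(list(itertools.product(*axes)))


def local_summary(predict, x, radius=0.5, steps=5, threshold=0.5):
    """Summarize how predictions change on a multivariate grid around x.

    The returned quantities are illustrative assumptions, NOT the paper's
    stability or uncertainty indices:
      agreement     - share of grid points classified like x itself
      mean_increase - average upward shift of the predicted probability
      mean_decrease - average downward shift of the predicted probability
    """
    x = np.asarray(x, dtype=float)
    grid = local_grid(x, radius, steps)
    p0 = predict(x[None, :])[0]   # prediction at the observation itself
    p = predict(grid)             # predictions on the whole neighborhood
    delta = p - p0
    same_class = (p >= threshold) == (p0 >= threshold)
    return {
        "p0": float(p0),
        "n_points": len(grid),
        "agreement": float(same_class.mean()),
        "mean_increase": float(np.clip(delta, 0.0, None).mean()),
        "mean_decrease": float(np.clip(-delta, 0.0, None).mean()),
    }


near_boundary = local_summary(predict_proba, [0.0, 0.0])
far_from_boundary = local_summary(predict_proba, [3.0, 0.0])
print(near_boundary)
print(far_from_boundary)
```

Splitting the shift into separate upward and downward parts loosely mirrors the asymmetry that the abstract attributes to uncertainty+ and uncertainty-. The grid grows as `steps ** n_features`, so a sketch like this is only practical for a handful of features.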

Related papers

D-Models and E-Models: Diversity-Stability Trade-offs in the Sampling Behavior of Large Language Models [91.21455683212224]
In large language models (LLMs), the probability of relevance for the next piece of information is linked to the probability of relevance for the next product. But whether fine-grained sampling probabilities faithfully align with task requirements remains an open question. We identify two model types: D-models, whose P_token exhibits large step-to-step variability and poor alignment with P_task; and E-models, whose P_token is more stable and better aligned with P_task.
arXiv Detail & Related papers (2026-01-25T14:59:09Z)
From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models [77.04403907729738]
This survey charts the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how uncertainty is leveraged as an active control signal across three frontiers. This survey argues that mastering the new trend of uncertainty is essential for building the next generation of scalable, reliable, and trustworthy AI.
arXiv Detail & Related papers (2026-01-22T06:21:31Z)
Understanding GUI Agent Localization Biases through Logit Sharpness [15.986679553468989]
Multimodal large language models (MLLMs) have enabled GUI agents to interact with operating systems by grounding language into spatial actions. Despite their promising performance, these models frequently exhibit hallucinations: systematic localization errors that compromise reliability. We propose a fine-grained evaluation framework that categorizes model predictions into four distinct types, revealing nuanced failure modes beyond traditional accuracy metrics.
arXiv Detail & Related papers (2025-06-18T12:55:35Z)
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors [61.92704516732144]
We show that the most robust features for correctness prediction are those that play a distinctive causal role in the model's behavior. We propose two methods that leverage causal mechanisms to predict the correctness of model outputs.
arXiv Detail & Related papers (2025-05-17T00:31:39Z)
Uncertainty Quantification for Transformer Models for Dark-Pattern Detection [0.21427777919040417]
This study focuses on detecting dark patterns: deceptive design choices that manipulate user decisions, undermining autonomy and consent. We propose a differential fine-tuning approach implemented at the final classification head via uncertainty quantification with transformer-based pre-trained models.
arXiv Detail & Related papers (2024-12-06T18:31:51Z)
UAHOI: Uncertainty-aware Robust Interaction Learning for HOI Detection [18.25576487115016]
This paper focuses on Human-Object Interaction (HOI) detection. It addresses the challenge of identifying and understanding the interactions between humans and objects within a given image or video frame. We propose a novel approach, UAHOI: Uncertainty-aware Robust Human-Object Interaction Learning.
arXiv Detail & Related papers (2024-08-14T10:06:39Z)
The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model. We introduce three robustness indicators and conduct experiments across diverse robust datasets. Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models. The LaPLACE-Explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features. Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z)
Variational Voxel Pseudo Image Tracking [127.46919555100543]
Uncertainty estimation is an important task for critical problems, such as robotics and autonomous driving. We propose a Variational Neural Network-based version of a Voxel Pseudo Image Tracking (VPIT) method for 3D Single Object Tracking.
arXiv Detail & Related papers (2023-02-12T13:34:50Z)
Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances. We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
Transformer Uncertainty Estimation with Hierarchical Stochastic Attention [8.95459272947319]
We propose a novel way to enable transformers to have the capability of uncertainty estimation. This is achieved by learning a hierarchical self-attention that attends to values and a set of learnable centroids. We empirically evaluate our model on two text classification tasks with both in-domain (ID) and out-of-domain (OOD) datasets.
arXiv Detail & Related papers (2021-12-27T16:43:31Z)
Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning. These measures should account for the wide variety of models used in practice. The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method. We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.