Understanding surrogate explanations: the interplay between complexity,
fidelity and coverage
- URL: http://arxiv.org/abs/2107.04309v1
- Date: Fri, 9 Jul 2021 08:43:31 GMT
- Title: Understanding surrogate explanations: the interplay between complexity,
fidelity and coverage
- Authors: Rafael Poyiadzi, Xavier Renard, Thibault Laugel, Raul
Santos-Rodriguez, Marcin Detyniecki
- Abstract summary: We show that transitioning from global to local surrogates - reducing coverage - allows for more favourable conditions on the fidelity-complexity Pareto frontier.
We discuss the interplay between complexity, fidelity and coverage, and consider how different user needs lead to problem formulations where these appear as either constraints or penalties.
- Score: 5.094061357656677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper analyses the fundamental ingredients behind surrogate explanations
to provide a better understanding of their inner workings. We start our
exposition by considering global surrogates, describing the trade-off between
complexity of the surrogate and fidelity to the black-box being modelled. We
show that transitioning from global to local - reducing coverage - allows for
more favourable conditions on the Pareto frontier of fidelity-complexity of a
surrogate. We discuss the interplay between complexity, fidelity and coverage,
and consider how different user needs can lead to problem formulations where
these are either constraints or penalties. We also present experiments that
demonstrate how the local surrogate interpretability procedure can be made
interactive and lead to better explanations.
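
To make the trade-off concrete, here is a minimal sketch of the idea (not the paper's own code; the random-forest black-box, the depth-2 decision-tree surrogate, and the Gaussian sampling of a neighbourhood are all illustrative assumptions). It fits a surrogate of fixed complexity first globally and then locally around a single instance, measuring fidelity as agreement with the black-box's predictions:

```python
# Sketch of the fidelity-coverage trade-off for surrogate explanations.
# All modelling choices below are illustrative assumptions, not the
# paper's exact experimental setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Black-box model whose behaviour we want to explain.
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def fidelity(surrogate, X_eval):
    """Fraction of points where the surrogate agrees with the black-box."""
    return np.mean(surrogate.predict(X_eval) == black_box.predict(X_eval))

# Global surrogate: full coverage, fixed (low) complexity.
global_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
global_tree.fit(X, black_box.predict(X))
print("global fidelity:", fidelity(global_tree, X))

# Local surrogate: same complexity budget, but coverage reduced to a
# sampled neighbourhood around one instance of interest.
x0 = X[0]
neighbourhood = x0 + 0.3 * rng.standard_normal((500, X.shape[1]))
local_tree = DecisionTreeClassifier(max_depth=2, random_state=0)
local_tree.fit(neighbourhood, black_box.predict(neighbourhood))
print("local fidelity:", fidelity(local_tree, neighbourhood))
```

At equal complexity, the local surrogate typically attains higher fidelity than the global one precisely because its coverage is restricted, which is the interplay the abstract describes.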
Related papers
- Explanations Go Linear: Interpretable and Individual Latent Encoding for Post-hoc Explainability [8.96728156164206]
Post-hoc explainability is essential for understanding black-box machine learning models.
We present ILLUME, a flexible and interpretable framework grounded in representation learning.
Our approach combines a globally trained surrogate with instance-specific linear transformations learned with a meta-encoder to generate both local and global explanations.
arXiv Detail & Related papers (2025-04-29T11:46:48Z)
- Layered Chain-of-Thought Prompting for Multi-Agent LLM Systems: A Comprehensive Approach to Explainable Large Language Models [0.0]
We propose Layered Chain-of-Thought (Layered-CoT) Prompting, a novel framework that systematically segments the reasoning process into multiple layers.
We present three scenarios -- medical triage, financial risk assessment, and agile engineering -- and demonstrate how Layered-CoT surpasses vanilla CoT in terms of transparency, correctness, and user engagement.
arXiv Detail & Related papers (2025-01-29T13:21:09Z)
- Towards Understanding Extrapolation: a Causal Lens [53.15488984371969]
We provide a theoretical understanding of when extrapolation is possible and offer principled methods to achieve it.
Under this formulation, we cast the extrapolation problem into a latent-variable identification problem.
Our theory reveals the intricate interplay between the underlying manifold's smoothness and the shift properties.
arXiv Detail & Related papers (2025-01-15T21:29:29Z)
- Augmenting the Veracity and Explanations of Complex Fact Checking via Iterative Self-Revision with LLMs [10.449165630417522]
We construct two complex fact-checking datasets for Chinese scenarios: CHEF-EG and TrendFact.
These datasets involve complex facts in areas such as health, politics, and society.
We propose a unified framework called FactISR to perform mutual feedback between veracity and explanations.
arXiv Detail & Related papers (2024-10-19T15:25:19Z)
- Advancing Interactive Explainable AI via Belief Change Theory [5.842480645870251]
We argue that this type of formalisation provides a framework and a methodology to develop interactive explanations.
We first define a novel, logic-based formalism to represent explanatory information shared between humans and machines.
We then consider real world scenarios for interactive XAI, with different prioritisations of new and existing knowledge, where our formalism may be instantiated.
arXiv Detail & Related papers (2024-08-13T13:11:56Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z)
- Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
- Why Does Little Robustness Help? Understanding and Improving Adversarial Transferability from Surrogate Training [24.376314203167016]
Adversarial examples (AEs) for DNNs have been shown to be transferable.
In this paper, we take a further step towards understanding adversarial transferability.
arXiv Detail & Related papers (2023-07-15T19:20:49Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Global Counterfactual Explanations: Investigations, Implementations and Improvements [12.343333815270402]
Actionable Recourse Summaries (AReS) is the only known global counterfactual explanation framework for recourse.
This paper focuses on implementing and improving AReS.
arXiv Detail & Related papers (2022-04-14T12:21:23Z)
- Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide end users with a set of features that need to be changed in order to achieve a desired outcome.
Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations.
We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations.
arXiv Detail & Related papers (2021-06-14T20:48:48Z)
- Fundamental Limits and Tradeoffs in Invariant Representation Learning [99.2368462915979]
Many machine learning applications involve learning representations that achieve two competing goals.
A minimax game-theoretic formulation represents a fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
arXiv Detail & Related papers (2020-12-19T15:24:04Z)
- A framework for step-wise explaining how to solve constraint satisfaction problems [21.96171133035504]
We study the problem of explaining the inference steps that one can take during propagation, in a way that is easy to interpret for a person.
Thereby, we aim to give the constraint solver explainable agency, which can help in building trust in the solver.
arXiv Detail & Related papers (2020-06-11T11:35:41Z)