Global Counterfactual Explanations: Investigations, Implementations and Improvements
- URL: http://arxiv.org/abs/2204.06917v1
- Date: Thu, 14 Apr 2022 12:21:23 GMT
- Title: Global Counterfactual Explanations: Investigations, Implementations and Improvements
- Authors: Dan Ley, Saumitra Mishra, Daniele Magazzeni
- Abstract summary: Actionable Recourse Summaries (AReS) is the only known global counterfactual explanation framework for recourse. This paper focuses on implementing and improving AReS.
- Score: 12.343333815270402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual explanations have been widely studied in explainability, with
a range of application-dependent methods emerging in fairness, recourse and
model understanding. However, the major shortcoming associated with these
methods is their inability to provide explanations beyond the local or
instance-level. While some works touch upon the notion of a global explanation,
typically suggesting to aggregate masses of local explanations in the hope of
ascertaining global properties, few provide frameworks that are either reliable
or computationally tractable. Meanwhile, practitioners are requesting more
efficient and interactive explainability tools. We take this opportunity to
investigate existing global methods, with a focus on implementing and improving
Actionable Recourse Summaries (AReS), the only known global counterfactual
explanation framework for recourse.
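To make the object of study concrete, the sketch below illustrates what an AReS-style global recourse rule looks like: a subgroup descriptor, a feature condition, and a prescribed change, judged by how often it flips a model's unfavourable predictions. This is a minimal sketch only; the toy data, the logistic-regression model and the single hard-coded rule are hypothetical, whereas AReS itself searches over many candidate rules and optimises coverage, correctness and cost jointly.

```python
# Minimal, hypothetical sketch of evaluating one AReS-style two-level recourse rule.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy tabular data: two numeric features, binary label (1 = favourable outcome).
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# One hypothetical two-level recourse rule:
# "For the subgroup with feature_0 < 0: if feature_1 < 0, then set feature_1 to 1.0."
def rule_applies(x):
    return x[0] < 0 and x[1] < 0          # subgroup descriptor + inner condition

def apply_recourse(x):
    x = x.copy()
    x[1] = 1.0                             # prescribed feature change
    return x

# Coverage: negatively classified instances the rule applies to.
# Correctness: fraction of those whose prediction flips after the prescribed change.
affected = [x for x in X if rule_applies(x) and model.predict([x])[0] == 0]
flipped = sum(model.predict([apply_recourse(x)])[0] == 1 for x in affected)
print(f"coverage: {len(affected)} instances; "
      f"correctness: {flipped / max(len(affected), 1):.2f}")
```

In AReS such rules are additionally scored for cost (how burdensome the prescribed change is), which is why a framework, rather than a single hand-written rule as above, is needed.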
Related papers
- Towards Few-Shot Learning in the Open World: A Review and Beyond [52.41344813375177]
Few-shot learning aims to mimic human intelligence by enabling significant generalizations and transferability.
This paper presents a review of recent advancements designed to adapt FSL for use in open-world settings.
We categorize existing methods into three distinct types of open-world few-shot learning: those involving varying instances, varying classes, and varying distributions.
arXiv Detail & Related papers (2024-08-19T06:23:21Z)
- GLEAMS: Bridging the Gap Between Local and Global Explanations [6.329021279685856]
We propose GLEAMS, a novel method that partitions the input space and learns an interpretable model within each sub-region.
We demonstrate GLEAMS' effectiveness on both synthetic and real-world data, highlighting its desirable properties and human-understandable insights.
arXiv Detail & Related papers (2024-08-09T13:30:37Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- GLOBE-CE: A Translation-Based Approach for Global Counterfactual Explanations [10.276136171459731]
Global & Efficient Counterfactual Explanations (GLOBE-CE) is a flexible framework that tackles the reliability and scalability issues associated with the current state-of-the-art.
We provide a unique mathematical analysis of categorical feature translations, utilising it in our method.
Experimental evaluations with publicly available datasets and user studies demonstrate that GLOBE-CE performs significantly better than the current state-of-the-art (see the translation sketch after this list).
arXiv Detail & Related papers (2023-05-26T15:26:59Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- ExSum: From Local Explanations to Model Understanding [6.23934576145261]
Interpretability methods are developed to understand the working mechanisms of black-box models.
Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them.
We introduce explanation summary (ExSum), a mathematical framework for quantifying model understanding.
arXiv Detail & Related papers (2022-04-30T02:07:20Z)
- Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z)
- Best of both worlds: local and global explanations with human-understandable concepts [10.155485106226754]
Interpretability techniques aim to provide the rationale behind a model's decision, typically by explaining either an individual prediction or a class of predictions.
We show that our method improves global explanations over TCAV when compared to ground truth, and provides useful insights.
arXiv Detail & Related papers (2021-06-16T09:05:25Z)
- On the Objective Evaluation of Post Hoc Explainers [10.981508361941335]
Modern trends in machine learning research have led to algorithms that are increasingly intricate to the degree that they are considered to be black boxes.
In an effort to reduce the opacity of decisions, methods have been proposed to construe the inner workings of such models in a human-comprehensible manner.
We propose a framework for the evaluation of post hoc explainers on ground truth that is directly derived from the additive structure of a model.
arXiv Detail & Related papers (2021-06-15T19:06:51Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
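As referenced in the GLOBE-CE entry above, the following is a minimal sketch of the translation idea that entry summarises: a single global direction, scaled per instance until the model's prediction flips. This is not the paper's actual algorithm; the direction heuristic (difference of class means), the magnitude grid, the toy data and the logistic-regression model are all illustrative assumptions.

```python
# Minimal, hypothetical sketch of a translation-based global counterfactual.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Toy data: three numeric features, binary label (1 = favourable outcome).
X = rng.normal(size=(300, 3))
y = (X @ np.array([1.0, -0.5, 0.25]) > 0).astype(int)
model = LogisticRegression().fit(X, y)

# One global translation direction, pointing from the unfavourable class mean
# towards the favourable class mean (a simple illustrative heuristic).
delta = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

def minimal_scaling(x, magnitudes=np.linspace(0.0, 5.0, 51)):
    """Smallest magnitude k (if any) such that model(x + k * delta) is favourable."""
    for k in magnitudes:
        if model.predict([x + k * delta])[0] == 1:
            return k
    return None

# Apply the single shared direction, with per-instance magnitudes, to all affected inputs.
affected = X[model.predict(X) == 0]
found = [k for k in (minimal_scaling(x) for x in affected) if k is not None]
print(f"recourse found for {len(found)}/{len(affected)} instances; "
      f"mean magnitude: {np.mean(found) if found else float('nan'):.2f}")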
This list is automatically generated from the titles and abstracts of the papers in this site.