Counterfactual Explanations for Machine Learning: Challenges Revisited
- URL: http://arxiv.org/abs/2106.07756v1
- Date: Mon, 14 Jun 2021 20:56:37 GMT
- Title: Counterfactual Explanations for Machine Learning: Challenges Revisited
- Authors: Sahil Verma, John Dickerson, Keegan Hines
- Abstract summary: Counterfactual explanations (CFEs) are an emerging technique under the umbrella of interpretability of machine learning (ML) models.
They provide ``what if'' feedback of the form ``if an input datapoint were $x'$ instead of $x$, then an ML model's output would be $y'$ instead of $y$.''
- Score: 6.939768185086755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual explanations (CFEs) are an emerging technique under the
umbrella of interpretability of machine learning (ML) models. They provide
``what if'' feedback of the form ``if an input datapoint were $x'$ instead of
$x$, then an ML model's output would be $y'$ instead of $y$.'' Counterfactual
explainability for ML models has yet to see widespread adoption in industry. In
this short paper, we posit reasons for this slow uptake. Leveraging recent work
outlining desirable properties of CFEs and our experience running the ML wing
of a model monitoring startup, we identify outstanding obstacles hindering CFE
deployment in industry.
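To make the form of this feedback concrete, below is a minimal sketch of how a CFE might be searched for against a simple differentiable scorer; the toy logistic model, its weights, and the distance/validity trade-off are illustrative assumptions, not a method from the paper.

```python
# Minimal counterfactual-explanation sketch for a toy logistic model.
# All weights, the target probability, and the loss weighting are assumptions.
import numpy as np

w = np.array([1.5, -2.0, 0.5])   # hypothetical learned weights
b = -0.2

def predict_proba(x):
    """Sigmoid score of the toy model f(x) = sigma(w.x + b)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual(x, target=0.7, lam=5.0, lr=0.05, steps=500):
    """Find x' near x whose score moves toward `target`.

    Minimizes ||x' - x||^2 + lam * (f(x') - target)^2 by gradient descent,
    trading off proximity to the original input against changing the outcome.
    """
    x_cf = x.copy()
    for _ in range(steps):
        p = predict_proba(x_cf)
        # Gradient of the objective; the sigmoid derivative is p * (1 - p).
        grad = 2 * (x_cf - x) + 2 * lam * (p - target) * p * (1 - p) * w
        x_cf = x_cf - lr * grad
    return x_cf

x = np.array([0.1, 0.9, 0.3])            # original (low-scoring) datapoint
x_prime = counterfactual(x)
print(predict_proba(x), "->", predict_proba(x_prime), x_prime)
```

The same trade-off (stay close to $x$ while changing the prediction to $y'$) underlies most CFE methods, although deployed systems add further constraints such as actionability and plausibility.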
Related papers
- Monetizing Currency Pair Sentiments through LLM Explainability [2.572906392867547]
Large language models (LLMs) play a vital role in almost every domain in today's organizations.
We contribute a novel technique to leverage LLMs as a post-hoc model-independent tool for the explainability of sentiment analysis.
We apply our technique in the financial domain for currency-pair price predictions using open news feed data merged with market prices.
arXiv Detail & Related papers (2024-07-29T11:58:54Z)
- Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML).
VML constrains the parameter space to be human-interpretable natural language.
We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Explaining black boxes with a SMILE: Statistical Model-agnostic Interpretability with Local Explanations [0.1398098625978622]
One of the major barriers to widespread acceptance of machine learning (ML) is trustworthiness.
Most ML models operate as black boxes, their inner workings opaque and mysterious, and it can be difficult to trust their conclusions without understanding how those conclusions are reached.
We propose SMILE, a new method that builds on previous approaches by making use of statistical distance measures to improve explainability.
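As a rough illustration of the general idea only (not the authors' implementation), a LIME-style local surrogate can weight perturbed samples by a statistical distance to the instance being explained; the Wasserstein distance, kernel width, and ridge surrogate below are assumptions.

```python
# LIME-style local explanation with a statistical-distance weighting kernel.
# The perturbation scheme, Wasserstein choice, and kernel width are assumptions.
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.linear_model import Ridge

def explain_locally(black_box, x, n_samples=500, sigma=1.0, width=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance and query the black-box model on each perturbation.
    Z = x + rng.normal(scale=sigma, size=(n_samples, x.size))
    y = black_box(Z)
    # Weight each sample by a statistical distance to the original instance.
    d = np.array([wasserstein_distance(x, z) for z in Z])
    weights = np.exp(-(d ** 2) / (width ** 2))
    # Fit a weighted linear surrogate; its coefficients serve as the explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return surrogate.coef_

# Toy usage: feature weights of a local surrogate around one point.
f = lambda X: X[:, 0] ** 2 + 3 * X[:, 1]
print(explain_locally(f, np.array([1.0, 2.0, 0.0])))
```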
arXiv Detail & Related papers (2023-11-13T12:28:00Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- Logic-Based Explainability in Machine Learning [0.0]
The operation of the most successful Machine Learning models is incomprehensible to human decision makers.
In recent years, there have been efforts on devising approaches for explaining ML models.
This paper overviews the ongoing research efforts on computing rigorous model-based explanations of ML models.
arXiv Detail & Related papers (2022-10-24T13:43:07Z)
- Reducing Unintended Bias of ML Models on Tabular and Textual Data [5.503546193689538]
We revisit the FixOut framework, which is inspired by the "fairness through unawareness" approach to building fairer models.
We introduce several improvements such as automating the choice of FixOut's parameters.
We present experimental results illustrating that FixOut improves process fairness across different classification settings.
arXiv Detail & Related papers (2021-08-05T14:55:56Z)
- Amortized Generation of Sequential Counterfactual Explanations for Black-box Models [26.91950709495675]
Counterfactual explanations (CFEs) provide ``what if'' feedback of the form ``if an input datapoint were $x'$ instead of $x$, then the model's output would be $y'$ instead of $y$.''
Current CFE approaches are single shot -- that is, they assume $x$ can change to $x'$ in a single time period.
We propose a novel approach that generates sequential CFEs that allow $x$ to move across intermediate states to a final state $x'$.
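A minimal sketch of the sequential idea follows, assuming a precomputed target $x'$ and a user-supplied plausibility check; the paper's actual contribution additionally amortizes this generation (per its title), which the sketch does not attempt.

```python
# Sketch of turning a single-shot counterfactual x -> x' into a sequence of
# intermediate states. The fixed step budget and the plausibility hook are
# assumptions; they stand in for the paper's learned, amortized generator.
import numpy as np

def sequential_path(x, x_prime, n_steps=5, plausible=lambda s: True):
    """Return intermediate states x_1, ..., x_n moving from x toward x_prime."""
    path, current = [], x.copy()
    for t in range(1, n_steps + 1):
        candidate = x + (x_prime - x) * t / n_steps   # one small step along the path
        if plausible(candidate):                      # e.g. stays in a feasible region
            current = candidate
        path.append(current.copy())
    return path

for state in sequential_path(np.array([0.1, 0.9]), np.array([0.6, 0.2])):
    print(state)
```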
arXiv Detail & Related papers (2021-06-07T20:54:48Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
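For intuition only, a diversity-enforcing term might penalize pairs of candidate latent perturbations that end up too similar; the particular penalty below is an assumption, not the paper's loss.

```python
# Toy diversity-enforcing penalty over k candidate latent perturbations.
# Smaller values mean the perturbations (and hence the resulting
# counterfactuals) are more spread out; the exact form is an assumption.
import numpy as np

def diversity_penalty(Z):
    """Z: (k, d) array of latent perturbations."""
    k = Z.shape[0]
    total, pairs = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            total += 1.0 / (1.0 + np.linalg.norm(Z[i] - Z[j]) ** 2)
            pairs += 1
    return total / pairs

print(diversity_penalty(np.random.default_rng(0).normal(size=(4, 8))))
```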
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
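A minimal sketch of the zeroth-order ingredient is shown below: a gradient is estimated purely from loss-value queries, so the model being reprogrammed never has to expose its internals. The two-point estimator, query budget, and toy loss are assumptions rather than BAR's exact procedure.

```python
# Zeroth-order (gradient-free) estimation: update parameters using only
# black-box loss evaluations. Query budget, smoothing radius, and the toy
# stand-in loss are assumptions, not BAR's exact setup.
import numpy as np

def zeroth_order_grad(loss, theta, n_queries=20, mu=0.01, seed=None):
    """Estimate the gradient of loss(theta) from random two-point queries."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(theta)
    for _ in range(n_queries):
        u = rng.standard_normal(theta.shape)
        # Two black-box queries per random direction; only loss values are used.
        grad += (loss(theta + mu * u) - loss(theta - mu * u)) / (2 * mu) * u
    return grad / n_queries

# Toy usage: drive parameters toward the minimizer of a query-only loss.
black_box_loss = lambda th: float(np.sum((th - 1.0) ** 2))
theta = np.zeros(4)
for _ in range(200):
    theta -= 0.1 * zeroth_order_grad(black_box_loss, theta)
print(theta)   # approaches the all-ones minimizer
```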
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
- An Information-Theoretic Approach to Personalized Explainable Machine Learning [92.53970625312665]
We propose a simple probabilistic model for the predictions and user knowledge.
We quantify the effect of an explanation by the conditional mutual information between the explanation and prediction.
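Read as a formula, this scores an explanation $e$ for a prediction $\hat{y}$ by the conditional mutual information
$$I(e;\hat{y}\mid u) \;=\; \mathbb{E}\!\left[\log \frac{p(e,\hat{y}\mid u)}{p(e\mid u)\,p(\hat{y}\mid u)}\right],$$
where conditioning on the user's background knowledge $u$ is our reading of the summary above rather than the paper's exact notation.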
arXiv Detail & Related papers (2020-03-01T13:06:29Z)