Explaining black boxes with a SMILE: Statistical Model-agnostic
Interpretability with Local Explanations
- URL: http://arxiv.org/abs/2311.07286v1
- Date: Mon, 13 Nov 2023 12:28:00 GMT
- Title: Explaining black boxes with a SMILE: Statistical Model-agnostic
Interpretability with Local Explanations
- Authors: Koorosh Aslansefat, Mojgan Hashemian, Martin Walker, Mohammed Naveed
Akram, Ioannis Sorokos, Yiannis Papadopoulos
- Abstract summary: One of the major barriers to widespread acceptance of machine learning (ML) is trustworthiness.
Most ML models operate as black boxes, their inner workings opaque and mysterious, and it can be difficult to trust their conclusions without understanding how those conclusions are reached.
We propose SMILE, a new method that builds on previous approaches by making use of statistical distance measures to improve explainability.
- Score: 0.1398098625978622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning is currently undergoing an explosion in capability,
popularity, and sophistication. However, one of the major barriers to
widespread acceptance of machine learning (ML) is trustworthiness: most ML
models operate as black boxes, their inner workings opaque and mysterious, and
it can be difficult to trust their conclusions without understanding how those
conclusions are reached. Explainability is therefore a key aspect of improving
trustworthiness: the ability to better understand, interpret, and anticipate
the behaviour of ML models. To this end, we propose SMILE, a new method that
builds on previous approaches by making use of statistical distance measures to
improve explainability while remaining applicable to a wide range of input data
domains.
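The abstract gives only a high-level description of the method, so the following is a minimal, hypothetical sketch of the stated idea: a LIME-style local surrogate in which the weight of each perturbed sample comes from a statistical distance (here the 1-D Wasserstein distance from SciPy) to the instance being explained, rather than from an exponential kernel over Euclidean distance. All function names, parameters, and the specific distance-to-weight mapping are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.linear_model import Ridge

def smile_like_explanation(black_box, x, n_samples=1000, sigma=0.5,
                           alpha=1.0, seed=0):
    """Illustrative LIME-style local surrogate whose sample weights are
    derived from a statistical distance (1 / (1 + Wasserstein)) instead of
    an exponential kernel. A sketch only, not the paper's exact method."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]

    # Perturb the instance of interest in feature space.
    Z = x + rng.normal(scale=sigma, size=(n_samples, d))

    # Query the black-box model on the perturbed samples.
    y = black_box(Z)

    # Weight each perturbed sample by a statistical distance to the original
    # instance: smaller distance -> larger weight.
    dists = np.array([wasserstein_distance(z, x) for z in Z])
    weights = 1.0 / (1.0 + dists)

    # Fit a weighted linear surrogate; its coefficients serve as the local
    # feature-importance explanation.
    surrogate = Ridge(alpha=alpha)
    surrogate.fit(Z, y, sample_weight=weights)
    return surrogate.coef_

# Hypothetical usage with a fitted scikit-learn regressor `model`:
# importances = smile_like_explanation(model.predict, X_test[0])
```

In this framing, only the `dists` line changes if a different statistical distance measure is preferred.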
Related papers
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- Accountable and Explainable Methods for Complex Reasoning over Text [5.571369922847262]
Accountability and transparency of Machine Learning models have been posed as critical desiderata by works in policy and law, philosophy, and computer science.
This thesis expands our collective knowledge in the areas of accountability and transparency of ML models developed for complex reasoning tasks over text.
arXiv Detail & Related papers (2022-11-09T15:14:52Z)
- Shapelet-Based Counterfactual Explanations for Multivariate Time Series [0.9990687944474738]
We develop a model agnostic multivariate time series (MTS) counterfactual explanation algorithm.
We test our approach on a real-life solar flare prediction dataset and prove that our approach produces high-quality counterfactuals.
In addition to being visually interpretable, our explanations are superior in terms of proximity, sparsity, and plausibility.
arXiv Detail & Related papers (2022-08-22T17:33:31Z)
- Interpretation of Black Box NLP Models: A Survey [0.0]
arXiv Detail & Related papers (2022-03-31T14:54:35Z)
- Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for interpretability and reliability that is agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an extrapolation score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z)
- S-LIME: Stabilized-LIME for Model Explanation [7.479279851480736]
Post hoc explanations based on perturbations are widely used approaches to interpret a machine learning model after it has been built.
We propose S-LIME, which utilizes a hypothesis testing framework based on the central limit theorem to determine the number of perturbation points needed to guarantee stability of the resulting explanation (a sketch of this kind of stopping rule appears after this list).
arXiv Detail & Related papers (2021-06-15T04:24:59Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
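The S-LIME entry above describes a central-limit-theorem-based test for choosing how many perturbation points are needed before an explanation can be considered stable. The exact hypothesis test from that paper is not reproduced here; the sketch below only illustrates the general idea, under the assumption that stability means the normal-approximation confidence interval of a repeatedly estimated feature importance has shrunk below a tolerance. Function names, defaults, and the stopping criterion are illustrative.

```python
import numpy as np
from scipy import stats

def n_perturbations_for_stability(estimate_importance, batch_size=100,
                                  max_batches=50, tol=0.05, confidence=0.95):
    """Illustrative CLT-based stopping rule (not S-LIME's exact test):
    draw batches of perturbations until the normal-approximation confidence
    interval of the mean importance estimate is narrower than `tol`.
    `estimate_importance(n)` is assumed to return one scalar importance
    estimate computed from n fresh perturbations."""
    z = stats.norm.ppf(0.5 + confidence / 2.0)  # ~1.96 for a 95% interval
    estimates = []
    for b in range(1, max_batches + 1):
        estimates.append(estimate_importance(batch_size))
        if b >= 2:
            half_width = z * np.std(estimates, ddof=1) / np.sqrt(b)
            if half_width < tol:  # interval narrow enough -> call it stable
                return b * batch_size
    return max_batches * batch_size
```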
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.