Explainable Artificial Intelligence: How Subsets of the Training Data
Affect a Prediction
- URL: http://arxiv.org/abs/2012.03625v1
- Date: Mon, 7 Dec 2020 12:15:47 GMT
- Title: Explainable Artificial Intelligence: How Subsets of the Training Data
Affect a Prediction
- Authors: Andreas Brandsæter, Ingrid K. Glad
- Abstract summary: We propose a novel methodology which we call Shapley values for training data subset importance.
We show how the proposed explanations can be used to reveal bias in models and erroneous training data.
We argue that the explanations enable us to perceive more of the inner workings of the algorithms, and illustrate how models producing similar predictions can be based on very different parts of the training data.
- Score: 2.3204178451683264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is an increasing interest in and demand for interpretations and
explanations of machine learning models and predictions in various application
areas. In this paper, we consider data-driven models which are already
developed, implemented and trained. Our goal is to interpret the models and
explain and understand their predictions. Since the predictions made by
data-driven models rely heavily on the data used for training, we believe
explanations should convey information about how the training data affects the
predictions. To do this, we propose a novel methodology which we call Shapley
values for training data subset importance. The Shapley value concept
originates from coalitional game theory, developed to fairly distribute the
payout among a set of cooperating players. We extend this to subset importance,
where a prediction is explained by treating the subsets of the training data as
players in a game where the predictions are the payouts. We describe and
illustrate how the proposed method can be useful and demonstrate its
capabilities on several examples. We show how the proposed explanations can be
used to reveal bias in models and erroneous training data. Furthermore,
we demonstrate that when predictions are accurately explained in a known
situation, then explanations of predictions by simple models correspond to the
intuitive explanations. We argue that the explanations enable us to perceive
more of the inner workings of the algorithms, and illustrate how models
producing similar predictions can be based on very different parts of the
training data. Finally, we show how Shapley values for subset importance can
be used to enhance training data acquisition and thereby reduce prediction
error.
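To make the "subsets as players, predictions as payouts" game concrete: if the training data is partitioned into m subsets D_1, ..., D_m (the players) and v(S) denotes the prediction at a test point x produced by a model retrained on the union of the subsets in coalition S, the generic Shapley value assigns subset i the importance

$$\phi_i(x) = \sum_{S \subseteq \{1,\dots,m\}\setminus\{i\}} \frac{|S|!\,(m-|S|-1)!}{m!}\,\big(v(S\cup\{i\}) - v(S)\big).$$

The sketch below is only an illustration of this construction, not the authors' implementation: the partitioning into subsets, the baseline v(empty set) (taken here as the mean training label), the choice of model, and the function names are all assumptions made for the example.

```python
"""Brute-force sketch of Shapley values for training data subset importance."""
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression  # arbitrary choice of model


def subset_shapley(X_groups, y_groups, x_test, model_factory=LinearRegression):
    """X_groups, y_groups: one (X_j, y_j) pair per training data subset (player).
    Returns one Shapley value per subset for the prediction at x_test."""
    m = len(X_groups)
    y_all = np.concatenate(y_groups)

    def v(S):
        # Payout of coalition S: prediction of a model retrained on its pooled data.
        if not S:
            return float(y_all.mean())  # assumed baseline for the empty coalition
        X = np.vstack([X_groups[j] for j in S])
        y = np.concatenate([y_groups[j] for j in S])
        return float(model_factory().fit(X, y).predict(x_test.reshape(1, -1))[0])

    phi = np.zeros(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for S in combinations(others, size):
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi  # phi.sum() plus the empty-coalition baseline equals the full-data prediction


# Toy usage: explain one prediction in terms of three training data subsets.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(20, 2)) for _ in range(3)]
ys = [X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=20) for X in Xs]
print(subset_shapley(Xs, ys, np.array([0.5, 0.5])))
```

Exact enumeration needs on the order of 2^m retrained models, so any practical use would rely on sampling or other approximations; the point of the sketch is only to make the coalition game explicit.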
Related papers
- Are Data-driven Explanations Robust against Out-of-distribution Data? [18.760475318852375]
We propose an end-to-end, model-agnostic learning framework, Distributionally Robust Explanations (DRE).
The key idea is to fully utilize the inter-distribution information to provide supervisory signals for learning explanations without human annotation.
Our results demonstrate that the proposed method significantly improves the model's performance in terms of explanation and prediction robustness against distributional shifts.
arXiv Detail & Related papers (2023-03-29T02:02:08Z)
- Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Machine Learning in Sports: A Case Study on Using Explainable Models for Predicting Outcomes of Volleyball Matches [0.0]
This paper explores a two-phased Explainable Artificial Intelligence (XAI) approach to predict the outcomes of matches in the Brazilian volleyball league (SuperLiga).
In the first phase, we directly use the interpretable rule-based ML models that provide a global understanding of the model's behaviors.
In the second phase, we construct non-linear models such as Support Vector Machine (SVM) and Deep Neural Network (DNN) to obtain predictive performance on the volleyball matches' outcomes.
arXiv Detail & Related papers (2022-06-18T18:09:15Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models given only a few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that models gain their performance improvement by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Training Deep Models to be Explained with Fewer Examples [40.58343220792933]
We train prediction and explanation models simultaneously with a sparse regularizer for reducing the number of examples.
Experiments on several datasets demonstrate that the proposed method improves faithfulness while maintaining predictive performance.
arXiv Detail & Related papers (2021-12-07T05:39:21Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
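As a purely hypothetical sketch of this perturb-and-observe idea (not the paper's actual quantile-shift procedure; the dataset, model, and size of the shift below are arbitrary choices for illustration):

```python
"""Shift one feature of a real data point by a small quantile step and watch
how a fitted classifier's predicted class probabilities respond."""
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[60]                                     # a real point of interest
p0 = model.predict_proba(x0.reshape(1, -1))[0]
for j in range(X.shape[1]):
    # A small shift expressed on the quantile scale of feature j (45% to 55%, arbitrary).
    step = np.quantile(X[:, j], 0.55) - np.quantile(X[:, j], 0.45)
    for direction in (+1, -1):
        x_shift = x0.copy()
        x_shift[j] += direction * step
        delta = model.predict_proba(x_shift.reshape(1, -1))[0] - p0
        print(f"feature {j}, shift {direction:+d}: change in class probabilities {np.round(delta, 3)}")
```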
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.