Explainable Artificial Intelligence: How Subsets of the Training Data
Affect a Prediction
- URL: http://arxiv.org/abs/2012.03625v1
- Date: Mon, 7 Dec 2020 12:15:47 GMT
- Title: Explainable Artificial Intelligence: How Subsets of the Training Data
Affect a Prediction
- Authors: Andreas Brandsæter, Ingrid K. Glad
- Abstract summary: We propose a novel methodology which we call Shapley values for training data subset importance.
We show how the proposed explanations can be used to reveal bias in models and erroneous training data.
We argue that the explanations enable us to perceive more of the inner workings of the algorithms, and illustrate how models producing similar predictions can be based on very different parts of the training data.
- Score: 2.3204178451683264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is an increasing interest in and demand for interpretations and
explanations of machine learning models and predictions in various application
areas. In this paper, we consider data-driven models which are already
developed, implemented and trained. Our goal is to interpret the models and
explain and understand their predictions. Since the predictions made by
data-driven models rely heavily on the data used for training, we believe
explanations should convey information about how the training data affects the
predictions. To do this, we propose a novel methodology which we call Shapley
values for training data subset importance. The Shapley value concept
originates from coalitional game theory, developed to fairly distribute the
payout among a set of cooperating players. We extend this to subset importance,
where a prediction is explained by treating the subsets of the training data as
players in a game where the predictions are the payouts. We describe and
illustrate how the proposed method can be useful and demonstrate its
capabilities on several examples. We show how the proposed explanations can be
used to reveal bias in models and erroneous training data. Furthermore,
we demonstrate that when predictions are accurately explained in a known
situation, then explanations of predictions by simple models correspond to the
intuitive explanations. We argue that the explanations enable us to perceive
more of the inner workings of the algorithms, and illustrate how models
producing similar predictions can be based on very different parts of the
training data. Finally, we show how Shapley values for subset importance can
be used to enhance training data acquisition and thereby reduce prediction
error.
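To make the "subsets as players, predictions as payouts" game concrete: if the training data is partitioned into m subsets D_1, ..., D_m (the players) and v(S) denotes the prediction at a test point x produced by a model retrained on the union of the subsets in coalition S, the generic Shapley value assigns subset i the importance

$$\phi_i(x) = \sum_{S \subseteq \{1,\dots,m\}\setminus\{i\}} \frac{|S|!\,(m-|S|-1)!}{m!}\,\big(v(S\cup\{i\}) - v(S)\big).$$

The sketch below is only an illustration of this construction, not the authors' implementation: the partitioning into subsets, the baseline v(empty set) (taken here as the mean training label), the choice of model, and the function names are all assumptions made for the example.

```python
"""Brute-force sketch of Shapley values for training data subset importance."""
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression  # arbitrary choice of model


def subset_shapley(X_groups, y_groups, x_test, model_factory=LinearRegression):
    """X_groups, y_groups: one (X_j, y_j) pair per training data subset (player).
    Returns one Shapley value per subset for the prediction at x_test."""
    m = len(X_groups)
    y_all = np.concatenate(y_groups)

    def v(S):
        # Payout of coalition S: prediction of a model retrained on its pooled data.
        if not S:
            return float(y_all.mean())  # assumed baseline for the empty coalition
        X = np.vstack([X_groups[j] for j in S])
        y = np.concatenate([y_groups[j] for j in S])
        return float(model_factory().fit(X, y).predict(x_test.reshape(1, -1))[0])

    phi = np.zeros(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for S in combinations(others, size):
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi  # phi.sum() plus the empty-coalition baseline equals the full-data prediction


# Toy usage: explain one prediction in terms of three training data subsets.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(20, 2)) for _ in range(3)]
ys = [X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=20) for X in Xs]
print(subset_shapley(Xs, ys, np.array([0.5, 0.5])))
```

Exact enumeration needs on the order of 2^m retrained models, so any practical use would rely on sampling or other approximations; the point of the sketch is only to make the coalition game explicit.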
Related papers
- Are Data-driven Explanations Robust against Out-of-distribution Data? [18.760475318852375]
We propose an end-to-end, model-agnostic learning framework, Distributionally Robust Explanations (DRE).
The key idea is to fully utilize the inter-distribution information to provide supervisory signals for learning explanations without human annotation.
Our results demonstrate that the proposed method significantly improves the model's performance in terms of explanation and prediction robustness against distributional shifts.
arXiv Detail & Related papers (2023-03-29T02:02:08Z)
- Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Machine Learning in Sports: A Case Study on Using Explainable Models for Predicting Outcomes of Volleyball Matches [0.0]
This paper explores a two-phased Explainable Artificial Intelligence (XAI) approach to predict the outcomes of matches in the Brazilian volleyball league (SuperLiga).
In the first phase, we directly use the interpretable rule-based ML models that provide a global understanding of the model's behaviors.
In the second phase, we construct non-linear models such as Support Vector Machine (SVM) and Deep Neural Network (DNN) to obtain predictive performance on the volleyball matches' outcomes.
arXiv Detail & Related papers (2022-06-18T18:09:15Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models given only a few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that models gain their performance improvement by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Training Deep Models to be Explained with Fewer Examples [40.58343220792933]
We train prediction and explanation models simultaneously with a sparse regularizer for reducing the number of examples.
Experiments on several datasets demonstrate that the proposed method improves faithfulness while maintaining predictive performance.
arXiv Detail & Related papers (2021-12-07T05:39:21Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Basically, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
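As a purely hypothetical sketch of this perturb-and-observe idea (not the paper's actual quantile-shift procedure; the dataset, model, and size of the shift below are arbitrary choices for illustration):

```python
"""Shift one feature of a real data point by a small quantile step and watch
how a fitted classifier's predicted class probabilities respond."""
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[60]                                     # a real point of interest
p0 = model.predict_proba(x0.reshape(1, -1))[0]
for j in range(X.shape[1]):
    # A small shift expressed on the quantile scale of feature j (45% to 55%, arbitrary).
    step = np.quantile(X[:, j], 0.55) - np.quantile(X[:, j], 0.45)
    for direction in (+1, -1):
        x_shift = x0.copy()
        x_shift[j] += direction * step
        delta = model.predict_proba(x_shift.reshape(1, -1))[0] - p0
        print(f"feature {j}, shift {direction:+d}: change in class probabilities {np.round(delta, 3)}")
```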
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.