True to the Model or True to the Data?
- URL: http://arxiv.org/abs/2006.16234v1
- Date: Mon, 29 Jun 2020 17:54:39 GMT
- Title: True to the Model or True to the Data?
- Authors: Hugh Chen, Joseph D. Janizek, Scott Lundberg, Su-In Lee
- Abstract summary: We argue that the choice comes down to whether it is desirable to be true to the model or true to the data.
We show how a different choice of value function performs better in each scenario, and how possible attributions are impacted by modeling choices.
- Score: 9.462808515258464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A variety of recent papers discuss the application of Shapley values, a
concept for explaining coalitional games, for feature attribution in machine
learning. However, the correct way to connect a machine learning model to a
coalitional game has been a source of controversy. The two main approaches that
have been proposed differ in the way that they condition on known features,
using either (1) an interventional or (2) an observational conditional
expectation. While previous work has argued that one of the two approaches is
preferable in general, we argue that the choice is application dependent.
Furthermore, we argue that the choice comes down to whether it is desirable to
be true to the model or true to the data. We use linear models to investigate
this choice. After deriving an efficient method for calculating observational
conditional expectation Shapley values for linear models, we investigate how
correlation in simulated data impacts the convergence of observational
conditional expectation Shapley values. Finally, we present two real data
examples that we consider to be representative of possible use cases for
feature attribution -- (1) credit risk modeling and (2) biological discovery.
We show how a different choice of value function performs better in each
scenario, and how possible attributions are impacted by modeling choices.
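For linear models the contrast between the two value functions is concrete. Below is a minimal sketch (not the paper's implementation): interventional Shapley values have the closed form beta_i * (x_i - mu_i), while observational conditional expectation Shapley values are computed here by brute-force coalition enumeration under an assumed multivariate Gaussian feature distribution with known mean mu and covariance Sigma. All names and numbers are illustrative.

```python
import itertools
from math import factorial

import numpy as np

def cond_exp_linear(x, S, beta, b, mu, Sigma):
    """E[f(X) | X_S = x_S] for linear f(x) = beta @ x + b under X ~ N(mu, Sigma)."""
    d = len(mu)
    S = list(S)
    Sbar = [j for j in range(d) if j not in S]
    ex = np.array(mu, dtype=float)          # start from the marginal means
    ex[S] = x[S]                            # observed features are fixed
    if S and Sbar:
        # Gaussian conditional mean of the unobserved block given x_S
        cross = Sigma[np.ix_(Sbar, S)]
        inv = np.linalg.inv(Sigma[np.ix_(S, S)])
        ex[Sbar] = mu[Sbar] + cross @ inv @ (x[S] - mu[S])
    return beta @ ex + b

def observational_shapley(x, beta, b, mu, Sigma):
    """Exact observational Shapley values by enumerating all coalitions."""
    d = len(mu)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in itertools.combinations(others, k):
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                phi[i] += w * (cond_exp_linear(x, list(S) + [i], beta, b, mu, Sigma)
                               - cond_exp_linear(x, S, beta, b, mu, Sigma))
    return phi

def interventional_shapley(x, beta, mu):
    """Interventional Shapley values have a closed form for linear models."""
    return beta * (x - mu)

beta = np.array([2.0, 0.0, 1.0])            # feature 1 is unused by the model ...
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.9, 0.0],          # ... but strongly correlated with feature 0
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
x = np.array([1.0, 1.0, 1.0])
print("interventional:", interventional_shapley(x, beta, mu))          # [2. 0. 1.]
print("observational: ", observational_shapley(x, beta, 0.0, mu, Sigma))
```

With these toy inputs, feature 1 has a zero coefficient but is strongly correlated with feature 0: its interventional attribution is exactly zero (true to the model), while its observational attribution is nonzero because it carries information about feature 0 (true to the data).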
Related papers
- Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables.
We learn the causal Bayesian network and its confounding latent variables directly from the observational data.
We show that this "model completion" learning approach can be more effective than estimand approaches (a toy do-query illustration follows this entry).
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
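The entry above learns the network itself (including latent confounders) from observational data. The sketch below shows only the final, easy step under strong simplifying assumptions: answering a do-query on a tiny known discrete network Z -> X, Z -> Y, X -> Y by adjustment over the (here fully observed) confounder Z. All probability tables are made up.

```python
import numpy as np

p_z = np.array([0.6, 0.4])              # P(Z)
p_x_given_z = np.array([[0.7, 0.3],     # P(X | Z=z): rows z, cols x
                        [0.2, 0.8]])
p_y_given_xz = np.array([[[0.9, 0.1],   # P(Y | X=x, Z=z): indexed [z][x][y]
                          [0.5, 0.5]],
                         [[0.6, 0.4],
                          [0.1, 0.9]]])

def p_y_do_x(x):
    """P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) P(Z=z) (truncated factorization)."""
    return sum(p_y_given_xz[z][x][1] * p_z[z] for z in range(2))

def p_y_given_x(x):
    """Observational P(Y=1 | X=x); differs from the do-query under confounding."""
    joint = sum(p_y_given_xz[z][x][1] * p_x_given_z[z][x] * p_z[z] for z in range(2))
    return joint / sum(p_x_given_z[z][x] * p_z[z] for z in range(2))

print("P(Y=1 | do(X=1)) =", p_y_do_x(1))
print("P(Y=1 |    X=1)  =", p_y_given_x(1))
```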
- Secrets of RLHF in Large Language Models Part II: Reward Modeling [134.97964938009588]
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset.
We also introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses.
arXiv Detail & Related papers (2024-01-11T17:56:59Z)
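As generic background for the entry above: reward models are commonly trained with a pairwise Bradley-Terry preference loss, pushing the reward of the chosen response above that of the rejected one. This is a textbook sketch, not the paper's specific denoising or contrastive-learning procedure.

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Batch mean of -log(sigmoid(r_chosen - r_rejected))."""
    diff = np.asarray(r_chosen, float) - np.asarray(r_rejected, float)
    return float(np.mean(np.logaddexp(0.0, -diff)))   # numerically stable form

# Toy scalar rewards from some reward model (illustrative values).
print(preference_loss([2.0, 0.5], [1.0, 1.5]))        # lower when chosen >> rejected
```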
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
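The amortization idea in the entry above can be sketched in a few lines: pay the Shapley-estimation cost once on a training set, then fit a fast model that maps an input straight to its attribution vector. Here a ridge regression and synthetic "teacher" attributions stand in for the paper's neural amortized model over text; everything is illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
teacher_shap = X * rng.normal(size=8)          # placeholder "ground-truth" attributions

amortized = Ridge().fit(X, teacher_shap)       # one model predicts all 8 attributions
x_new = rng.normal(size=(1, 8))
print(amortized.predict(x_new))                # one forward pass, no model re-evaluations
```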
- How to select predictive models for causal inference? [0.0]
We show that classic machine-learning model selection does not select the best outcome models for causal inference.
We outline a good causal model-selection procedure: using the so-called $R\text{-risk}$ (sketched after this entry); using flexible estimators to compute the nuisance models on the train set.
arXiv Detail & Related papers (2023-02-01T10:58:55Z)
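The $R\text{-risk}$ mentioned in the entry above scores a candidate treatment-effect model using nuisance estimates fit on a separate train split: an outcome model m_hat(x) ~ E[Y | X=x] and a propensity model e_hat(x) ~ P(A=1 | X=x). A minimal sketch of the formula follows; names are illustrative and this is not the paper's full procedure.

```python
import numpy as np

def r_risk(y, a, m_hat, e_hat, tau_hat):
    """R-risk of a candidate effect model on held-out data:
    mean of ((y - m_hat(x)) - (a - e_hat(x)) * tau_hat(x))^2.
    All arguments are 1-D arrays already evaluated at the held-out points."""
    y, a = np.asarray(y, float), np.asarray(a, float)
    m_hat, e_hat, tau_hat = map(np.asarray, (m_hat, e_hat, tau_hat))
    return float(np.mean(((y - m_hat) - (a - e_hat) * tau_hat) ** 2))

# Model selection: among candidate tau_hat's, prefer the one with the lowest r_risk.
```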
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
- Sharing pattern submodels for prediction with missing values [12.981974894538668]
Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time.
We propose an alternative approach, called sharing pattern submodels, which i) makes predictions robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels, and iii) has a short description, enabling improved interpretability (a simplified sketch follows this entry).
arXiv Detail & Related papers (2022-06-22T15:09:40Z)
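A deliberately simplified reading of "pattern submodels" for the entry above: group rows by missingness pattern and fit one model per pattern on the observed columns, so no imputation is needed at test time. The paper's actual contribution, sharing parameters across patterns, is omitted here; this unshared version is the baseline it improves on.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_pattern_submodels(X, y):
    """Fit one linear submodel per missingness pattern (NaN = missing)."""
    models = {}
    patterns = np.isnan(X)
    for pat in np.unique(patterns, axis=0):
        rows = (patterns == pat).all(axis=1)
        cols = ~pat                                    # observed columns for this pattern
        if cols.any() and rows.sum() > cols.sum():     # need enough rows to fit
            models[tuple(pat)] = LinearRegression().fit(X[rows][:, cols], y[rows])
    return models

def predict_one(models, x):
    """Route a test point to the submodel matching its missingness pattern."""
    pat = np.isnan(x)
    return models[tuple(pat)].predict(x[~pat].reshape(1, -1))[0]
```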
- Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features.
The performance of the proposed methods is evaluated on simulated data sets and a real data set.
Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
- Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting black-box machine learning models (both their features and samples) via a metamodeling technique.
The metamodel is estimated from data generated by running a computer experiment with the trained complex model on samples from the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
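The metamodeling idea in the entry above can be sketched as: sample the region of interest, label the samples with the trained black-box model, and fit an analyzable surrogate to those predictions. Here a Gaussian process surrogate and a stand-in `black_box` function are used purely for illustration; this is not the paper's specific estimator.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

black_box = lambda X: np.sin(X[:, 0]) + X[:, 1] ** 2   # stand-in trained complex model

rng = np.random.default_rng(0)
X_region = rng.uniform(-1, 1, size=(200, 2))           # samples in the region of interest
y_bb = black_box(X_region)                             # computer experiment on the model

surrogate = GaussianProcessRegressor(kernel=RBF()).fit(X_region, y_bb)
print(surrogate.predict(X_region[:3]))                 # cheap, analyzable metamodel
```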
- A Note on High-Probability versus In-Expectation Guarantees of Generalization Bounds in Machine Learning [95.48744259567837]
Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform one type of statement into the other.
arXiv Detail & Related papers (2020-10-06T09:41:35Z)
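One standard instance of the transformation described in the entry above, included here as a hedged sketch: Markov's inequality turns an in-expectation bound into a high-probability one, assuming the generalization gap G = L(h) - L_hat(h) is nonnegative (e.g. after truncation at zero).

```latex
% In-expectation bound => high-probability bound via Markov's inequality,
% assuming G = L(h) - \hat{L}(h) \ge 0.
\mathbb{E}[G] \le B
\quad\Longrightarrow\quad
\Pr\!\left[ G \ge \frac{B}{\delta} \right]
  \le \frac{\mathbb{E}[G]}{B/\delta}
  \le \delta
\qquad \text{for all } \delta \in (0, 1].
```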
- Characterizing and Avoiding Problematic Global Optima of Variational Autoencoders [28.36260646471421]
Variational Auto-encoders (VAEs) are deep generative latent variable models.
Recent work shows that traditional training methods tend to yield solutions that violate modeling desiderata.
We show that both issues stem from the fact that the global optima of the VAE training objective often correspond to undesirable solutions.
arXiv Detail & Related papers (2020-03-17T15:14:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.