EDDA: Explanation-driven Data Augmentation to Improve Model and
Explanation Alignment
- URL: http://arxiv.org/abs/2105.14162v1
- Date: Sat, 29 May 2021 00:42:42 GMT
- Title: EDDA: Explanation-driven Data Augmentation to Improve Model and
Explanation Alignment
- Authors: Ruiwen Li (co-first author), Zhibo Zhang (co-first author), Jiani Li,
Scott Sanner, Jongseong Jang, Yeonjeong Jeong, Dongsub Shim
- Abstract summary: We seek a methodology that can improve alignment between model predictions and explanation methods.
We achieve this through a novel explanation-driven data augmentation (EDDA) method.
This is based on the simple motivating principle that occluding salient regions for the model prediction should decrease the model confidence in the prediction.
- Score: 12.729179495550557
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent years have seen the introduction of a range of methods for post-hoc
explainability of image classifier predictions. However, these post-hoc
explanations may not always align perfectly with classifier predictions, which
poses a significant challenge when attempting to debug models based on such
explanations. To this end, we seek a methodology that can improve alignment
between model predictions and explanation methods, one that is agnostic to both
the model class and the explanation class and that does not require ground-truth
explanations. We achieve this through a novel explanation-driven data
augmentation (EDDA) method that augments the training data with occlusions of
existing data stemming from model-explanations; this is based on the simple
motivating principle that occluding salient regions for the model prediction
should decrease the model confidence in the prediction, while occluding
non-salient regions should not change the prediction -- if the model and
explainer are aligned. To verify that this augmentation method improves model
and explainer alignment, we evaluate the methodology on a variety of datasets,
image classification models, and explanation methods. We verify in all cases
that our explanation-driven data augmentation method improves alignment of the
model and explanation, in comparison to both no data augmentation and
non-explanation-driven data augmentation methods. In conclusion, this approach provides a novel
model- and explainer-agnostic methodology for improving alignment between model
predictions and explanations, which we see as a critical step forward for
practical deployment and debugging of image classification models.
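
To make the occlusion principle above concrete, the following is a minimal sketch of explanation-driven occlusion augmentation and the corresponding alignment check. It is illustrative only and not the authors' implementation: the `model` and `explain` callables, the saliency quantile, and the zero-fill occlusion are assumptions made for the example.

```python
# Illustrative sketch only. Assumptions: `model(x)` returns class probabilities
# for an image x of shape (H, W, C); `explain(model, x, y)` returns a saliency
# map of shape (H, W) for class y. Neither is the authors' actual code.
import numpy as np

def occlude(x, saliency, quantile, occlude_salient=True, fill=0.0):
    """Copy of image x with either its most-salient or its least-salient
    pixels (chosen by a saliency quantile) replaced by `fill`."""
    threshold = np.quantile(saliency, quantile)
    mask = saliency >= threshold if occlude_salient else saliency < threshold
    x_occ = x.copy()
    x_occ[mask] = fill  # boolean mask over (H, W); fill broadcasts over channels
    return x_occ

def edda_augment(model, explain, x, y, quantile=0.9):
    """Two augmented views of (x, y): one with the most salient regions occluded
    (confidence in y should drop) and one with the least salient regions occluded
    (the prediction should stay the same, if model and explainer are aligned)."""
    saliency = explain(model, x, y)                          # (H, W) saliency map
    x_sal = occlude(x, saliency, quantile, occlude_salient=True)
    x_nonsal = occlude(x, saliency, 1.0 - quantile, occlude_salient=False)
    return x_sal, x_nonsal

def alignment_check(model, x, y, x_sal, x_nonsal):
    """Check the motivating principle on a single example."""
    probs = model(x)
    confidence_drops = model(x_sal)[y] < probs[y]
    prediction_stable = int(np.argmax(model(x_nonsal))) == y
    return confidence_drops and prediction_stable
```

In this sketch the occluded images would simply be added to the training set alongside the originals; the alignment check is the example-level version of the principle that the paper evaluates in aggregate.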
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models given only a few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate this prediction bias, our analysis shows that models gain performance by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Training Deep Models to be Explained with Fewer Examples [40.58343220792933]
We train prediction and explanation models simultaneously with a sparse regularizer for reducing the number of examples.
Experiments using several datasets demonstrate that the proposed method improves faithfulness while maintaining predictive performance.
arXiv Detail & Related papers (2021-12-07T05:39:21Z)
- Shapley variable importance clouds for interpretable machine learning [2.830197032154301]
We propose a Shapley variable importance cloud that pools information across good models to avoid biased assessments in SHAP analyses of final models.
We demonstrate the additional insights gained compared to conventional explanations and to Dong and Rudin's method, using criminal justice and electronic medical records data.
arXiv Detail & Related papers (2021-10-06T03:41:04Z)
- Information-theoretic Evolution of Model Agnostic Global Explanations [10.921146104622972]
We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data.
Our approach has been deployed in a leading digital marketing suite of products.
arXiv Detail & Related papers (2021-05-14T16:52:16Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation [92.96204497841032]
Causal graphs (CGs) are compact representations of the knowledge of the data generating processes behind the data distributions.
We propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the conditional independence (CI) relations.
We experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
arXiv Detail & Related papers (2021-02-27T06:13:59Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
The cross-domain few-shot classification (CD-FSC) task combines few-shot classification with the requirement to generalize across domains represented by different datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
- An interpretable neural network model through piecewise linear approximation [7.196650216279683]
We propose a hybrid interpretable model that combines a piecewise linear component and a nonlinear component.
The first component describes the explicit feature contributions by piecewise linear approximation to increase the expressiveness of the model.
The other component uses a multi-layer perceptron to capture feature interactions and implicit nonlinearity, increasing the prediction performance (a minimal sketch of such a hybrid appears after this list).
arXiv Detail & Related papers (2020-01-20T14:32:11Z)
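
The last entry above describes a hybrid of an explicit piecewise-linear component and an MLP. The sketch below shows one plausible way to structure such a hybrid in PyTorch; the class name, hinge-basis parameterization, knot placement, and layer sizes are assumptions made for illustration, not the paper's architecture.

```python
# Illustrative sketch, not the paper's implementation: per-feature piecewise-
# linear shape functions (interpretable) plus a small MLP for interactions.
import torch
import torch.nn as nn

class PiecewiseLinearHybrid(nn.Module):
    def __init__(self, n_features, n_knots=8, hidden=64):
        super().__init__()
        # Fixed knots in [0, 1]; assumes roughly min-max-scaled inputs.
        self.register_buffer("knots", torch.linspace(0.0, 1.0, n_knots))
        self.hinge_weights = nn.Parameter(torch.zeros(n_features, n_knots))
        self.bias = nn.Parameter(torch.zeros(1))
        # Black-box component for feature interactions and residual nonlinearity.
        self.mlp = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def feature_contributions(self, x):
        # x: (batch, n_features) -> per-feature piecewise-linear contributions.
        hinges = torch.relu(x.unsqueeze(-1) - self.knots)    # (B, F, K)
        return (hinges * self.hinge_weights).sum(dim=-1)     # (B, F)

    def forward(self, x):
        explicit = self.feature_contributions(x).sum(dim=-1, keepdim=True) + self.bias
        return explicit + self.mlp(x)                         # (B, 1) prediction
```

The `feature_contributions` term can be inspected or plotted per feature for interpretation, while the MLP term absorbs whatever the piecewise-linear shape functions cannot express.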