Causality for Inherently Explainable Transformers: CAT-XPLAIN
- URL: http://arxiv.org/abs/2206.14841v1
- Date: Wed, 29 Jun 2022 18:11:01 GMT
- Title: Causality for Inherently Explainable Transformers: CAT-XPLAIN
- Authors: Subash Khanal, Benjamin Brodie, Xin Xing, Ai-Ling Lin, Nathan Jacobs
- Abstract summary: We utilize a recently proposed instance-wise post-hoc causal explanation method to make an existing transformer architecture inherently explainable.
Our model provides an explanation in the form of top-$k$ regions in the input space of the given instance contributing to its decision.
- Score: 16.85887568521622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There have been several post-hoc explanation approaches developed to explain
pre-trained black-box neural networks. However, there is still a gap in
research efforts toward designing neural networks that are inherently
explainable. In this paper, we utilize a recently proposed instance-wise
post-hoc causal explanation method to make an existing transformer architecture
inherently explainable. Once trained, our model provides an explanation in the
form of top-$k$ regions in the input space of the given instance contributing
to its decision. We evaluate our method on binary classification tasks using
three image datasets: MNIST, FMNIST, and CIFAR. Our results demonstrate that
compared to the causality-based post-hoc explainer model, our inherently
explainable model achieves better explainability results while eliminating the
need to train a separate explainer model. Our code is available at
https://github.com/mvrl/CAT-XPLAIN.
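As a rough illustration of the kind of explanation described above (top-$k$ input regions contributing to a prediction), the sketch below scores image patches from a ViT-style attention row and returns the $k$ highest-scoring patches. This is not the CAT-XPLAIN implementation; the function name, the CLS-attention scoring rule, and the 14x14 patch grid are illustrative assumptions.

```python
# A minimal sketch (not the CAT-XPLAIN implementation): score image patches and
# return the top-k patches as the "explanation" for a prediction. The scoring rule
# (head-averaged CLS->patch attention from the final block) is an assumption.
import torch

def topk_patch_explanation(cls_attn: torch.Tensor, k: int = 8):
    """cls_attn: (num_heads, 1 + num_patches) attention row of the [CLS] token
    from the final transformer block. Returns indices and scores of the k patches
    with the highest average attention, read as the regions driving the decision."""
    patch_scores = cls_attn.mean(dim=0)[1:]   # average heads, drop the CLS->CLS entry
    topk = torch.topk(patch_scores, k)
    return topk.indices, topk.values

# Example: a 14x14 patch grid (196 patches), as in a 224x224 image with 16x16 patches.
attn = torch.rand(12, 197)                    # fake attention, for illustration only
idx, scores = topk_patch_explanation(attn, k=8)
rows, cols = idx // 14, idx % 14              # patch coordinates in the image grid
print(list(zip(rows.tolist(), cols.tolist())))
```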
Related papers
- COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks [3.475906200620518]
COCKATIEL is a novel, post-hoc, concept-based, model-agnostic XAI technique.
It generates meaningful explanations from the last layer of a neural net model trained on an NLP classification task.
It does so without compromising the accuracy of the underlying model or requiring a new one to be trained.
arXiv Detail & Related papers (2023-05-11T12:22:20Z)
- ExaRanker: Explanation-Augmented Neural Ranker [67.4894325619275]
In this work, we show that neural rankers also benefit from explanations.
We use LLMs such as GPT-3.5 to augment retrieval datasets with explanations.
Our model, dubbed ExaRanker, finetuned on a few thousand examples with synthetic explanations performs on par with models finetuned on 3x more examples without explanations.
arXiv Detail & Related papers (2023-01-25T11:03:04Z)
- MEGAN: Multi-Explanation Graph Attention Network [1.1470070927586016]
We propose a multi-explanation graph attention network (MEGAN).
Unlike existing graph explainability methods, our network can produce node and edge attributional explanations along multiple channels.
Our attention-based network is fully differentiable and explanations can actively be trained in an explanation-supervised manner.
arXiv Detail & Related papers (2022-11-23T16:10:13Z)
- Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z)
- Task-Agnostic Graph Explanations [50.17442349253348]
Graph Neural Networks (GNNs) have emerged as powerful tools to encode graph structured data.
Existing learning-based GNN explanation approaches are task-specific in training.
We propose a Task-Agnostic GNN Explainer (TAGE) trained under self-supervision with no knowledge of downstream tasks.
arXiv Detail & Related papers (2022-02-16T21:11:47Z)
- A Meta-Learning Approach for Training Explainable Graph Neural Networks [10.11960004698409]
We propose a meta-learning framework for improving the level of explainability of a GNN directly at training time.
Our framework jointly trains a model to solve the original task, e.g., node classification, and to provide easily processable outputs for downstream algorithms.
Our model-agnostic approach can improve the explanations produced for different GNN architectures and use any instance-based explainer to drive this process.
arXiv Detail & Related papers (2021-09-20T11:09:10Z)
- Reliable Graph Neural Network Explanations Through Adversarial Training [10.323055385277877]
Graph neural network (GNN) explanations have largely been facilitated through post-hoc introspection.
We propose a similar training paradigm for GNNs and analyze the respective impact on a model's explanations.
arXiv Detail & Related papers (2021-06-25T04:49:42Z)
- Correcting Classification: A Bayesian Framework Using Explanation Feedback to Improve Classification Abilities [2.0931163605360115]
Explanations are social, meaning they are a transfer of knowledge through interactions.
We overcome these difficulties by training a Bayesian convolutional neural network (CNN) that uses explanation feedback.
Our proposed method utilizes this feedback for fine-tuning to correct the model such that the explanations and classifications improve.
arXiv Detail & Related papers (2021-04-29T13:59:21Z)
- Parameterized Explainer for Graph Neural Network [49.79917262156429]
We propose PGExplainer, a parameterized explainer for Graph Neural Networks (GNNs).
Compared to the existing work, PGExplainer has better generalization ability and can be utilized in an inductive setting easily.
Experiments on both synthetic and real-life datasets show highly competitive performance with up to 24.7% relative improvement in AUC on explaining graph classification.
arXiv Detail & Related papers (2020-11-09T17:15:03Z)
- Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? [86.60613602337246]
We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL explanations.
LAS measures how well explanations help an observer predict a model's output, while controlling for how explanations can directly leak the output.
We frame explanation generation as a multi-agent game and optimize explanations for simulatability while penalizing label leakage; a minimal sketch of such a metric appears after this list.
arXiv Detail & Related papers (2020-10-08T16:59:07Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
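Referring back to the leakage-adjusted simulatability entry above: the sketch below illustrates one plausible reading of an LAS-style score, i.e., the gain in a simulator's ability to predict the model's output when given explanations, averaged over leaking and non-leaking examples. The grouping and averaging details are assumptions and may not match the paper exactly.

```python
# A rough, assumption-laden sketch of a leakage-adjusted simulatability (LAS) style
# score: how much explanations help a simulator predict the model's output, while
# controlling for explanations that directly leak that output. Illustration only.
import numpy as np

def las_score(sim_with_expl, sim_without_expl, expl_only_correct):
    """All arguments are boolean arrays over the same evaluation examples:
    sim_with_expl[i]     - simulator matched the model's output given input + explanation
    sim_without_expl[i]  - simulator matched the model's output given input alone
    expl_only_correct[i] - simulator matched the model's output from the explanation
                           alone (a proxy for label leakage).
    The score is the explanation's effect on simulatability, averaged over the
    leaking and non-leaking subsets so leaked labels cannot dominate."""
    sim_with_expl = np.asarray(sim_with_expl, dtype=float)
    sim_without_expl = np.asarray(sim_without_expl, dtype=float)
    leak = np.asarray(expl_only_correct, dtype=bool)

    effects = []
    for group in (leak, ~leak):
        if group.any():
            effects.append(sim_with_expl[group].mean() - sim_without_expl[group].mean())
    return float(np.mean(effects))

# Toy usage: explanations help on non-leaking examples, so the score is positive but modest.
print(las_score([1, 1, 1, 0], [1, 0, 1, 0], [1, 0, 0, 0]))
```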