Reliable Graph Neural Network Explanations Through Adversarial Training
- URL: http://arxiv.org/abs/2106.13427v1
- Date: Fri, 25 Jun 2021 04:49:42 GMT
- Title: Reliable Graph Neural Network Explanations Through Adversarial Training
- Authors: Donald Loveland, Shusen Liu, Bhavya Kailkhura, Anna Hiszpanski, Yong Han
- Abstract summary: Graph neural network (GNN) explanations have largely been facilitated through post-hoc introspection.
We propose an adversarial training paradigm for GNNs and analyze its impact on a model's explanations.
- Score: 10.323055385277877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural network (GNN) explanations have largely been facilitated through
post-hoc introspection. While this has been deemed successful, many post-hoc
explanation methods have been shown to fail in capturing a model's learned
representation. Due to this problem, it is worthwhile to consider how one might
train a model so that it is more amenable to post-hoc analysis. Given the
success of adversarial training in the computer vision domain to train models
with more reliable representations, we propose a similar training paradigm for
GNNs and analyze the respective impact on a model's explanations. In instances
without ground truth labels, we also determine how well an explanation method
is utilizing a model's learned representation through a new metric and
demonstrate adversarial training can help better extract domain-relevant
insights in chemistry.
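The abstract's core proposal, adversarially training a GNN so that post-hoc explanation methods recover more faithful structure, can be illustrated with a short sketch. The code below is a minimal, assumption-laden illustration rather than the authors' implementation: the `TinyGCN` model, the dense normalized adjacency, the single-step FGSM-style feature perturbation, and the `eps` value are hypothetical choices standing in for whichever adversarial training recipe the paper actually uses.

```python
# Minimal sketch (not the authors' code): adversarial training of a small GNN
# by perturbing node features with a single FGSM-style step before each update.
# TinyGCN, the dense normalized adjacency, and eps=0.01 are illustrative choices.
import torch
import torch.nn.functional as F


class TinyGCN(torch.nn.Module):
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.lin1 = torch.nn.Linear(in_dim, hid_dim)
        self.lin2 = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, x, adj_norm):
        # adj_norm: dense, symmetrically normalized adjacency with self-loops
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)


def adversarial_step(model, opt, x, adj_norm, y, eps=0.01):
    """One step: craft a worst-case feature perturbation, then train on it."""
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv, adj_norm), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x + eps * grad.sign()).detach()   # FGSM-style feature perturbation

    opt.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv, adj_norm), y)
    adv_loss.backward()
    opt.step()
    return adv_loss.item()


# Toy usage on a random graph: 6 nodes, 8 features, 2 node classes.
n, d = 6, 8
x = torch.randn(n, d)
a = (torch.rand(n, n) > 0.5).float()
a = ((a + a.t() + torch.eye(n)) > 0).float()          # symmetric, self-loops
d_inv_sqrt = a.sum(dim=1).pow(-0.5)
adj_norm = d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]
y = torch.randint(0, 2, (n,))

model = TinyGCN(d, 16, 2)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(50):
    adversarial_step(model, opt, x, adj_norm, y)
```

In practice one would tune `eps` and possibly use a multi-step (PGD-style) inner loop; the single-step version above is kept only to show where the perturbation enters the training loop.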
Related papers
- Globally Interpretable Graph Learning via Distribution Matching [12.885580925389352]
We aim to answer an important question that is not yet well studied: how to provide a global interpretation for the graph learning procedure?
We formulate this problem as globally interpretable graph learning, which targets distilling high-level and human-intelligible patterns that dominate the learning procedure.
We propose a novel model fidelity metric, tailored for evaluating the fidelity of the resulting model trained on interpretations.
arXiv Detail & Related papers (2023-06-18T00:50:36Z) - Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z) - MEGAN: Multi-Explanation Graph Attention Network [1.1470070927586016]
We propose a multi-explanation graph attention network (MEGAN).
Unlike existing graph explainability methods, our network can produce node and edge attributional explanations along multiple channels.
Our attention-based network is fully differentiable and explanations can actively be trained in an explanation-supervised manner.
arXiv Detail & Related papers (2022-11-23T16:10:13Z) - Causality for Inherently Explainable Transformers: CAT-XPLAIN [16.85887568521622]
We utilize a recently proposed instance-wise post-hoc causal explanation method to make an existing transformer architecture inherently explainable.
Our model provides an explanation in the form of top-$k$ regions in the input space of the given instance contributing to its decision.
arXiv Detail & Related papers (2022-06-29T18:11:01Z) - Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than those trained with explanations produced by previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z) - Task-Agnostic Graph Explanations [50.17442349253348]
Graph Neural Networks (GNNs) have emerged as powerful tools to encode graph structured data.
Existing learning-based GNN explanation approaches are task-specific in training.
We propose a Task-Agnostic GNN Explainer (TAGE) trained under self-supervision with no knowledge of downstream tasks.
arXiv Detail & Related papers (2022-02-16T21:11:47Z) - Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z) - A Meta-Learning Approach for Training Explainable Graph Neural Networks [10.11960004698409]
We propose a meta-learning framework for improving the level of explainability of a GNN directly at training time.
Our framework jointly trains a model to solve the original task, e.g., node classification, and to provide easily processable outputs for downstream algorithms.
Our model-agnostic approach can improve the explanations produced for different GNN architectures and use any instance-based explainer to drive this process.
arXiv Detail & Related papers (2021-09-20T11:09:10Z) - Unsupervised Detection of Adversarial Examples with Model Explanations [0.6091702876917279]
We propose a simple yet effective method to detect adversarial examples using methods developed to explain the model's behavior.
Our evaluations on the MNIST handwritten digit dataset show that our method is capable of detecting adversarial examples with high confidence.
arXiv Detail & Related papers (2021-07-22T06:54:18Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
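The differentiable edge-masking entry above describes learning which edges a trained GNN actually relies on, so that unnecessary edges can be dropped without hurting performance. Below is a minimal sketch of that general idea, assuming a dense-adjacency model interface like the `model(x, adj)` signature used in the earlier sketch; the hypothetical `learn_edge_mask` helper, the sparsity weight, and the optimization schedule are illustrative and are not the method of the cited paper.

```python
# Minimal sketch of post-hoc differentiable edge masking: learn a soft mask over
# the edges of a dense adjacency so the masked graph preserves the model's
# prediction, while a sparsity penalty pushes unneeded edges toward zero.
# The model interface and all hyperparameters are assumptions for illustration.
import torch
import torch.nn.functional as F


def learn_edge_mask(model, x, adj, target, steps=200, lr=0.1, sparsity=0.05):
    """Return a soft edge mask in [0, 1] over the dense adjacency `adj`.
    Assumes `model(x, adj)` returns per-node class logits."""
    edge_logits = torch.zeros_like(adj, requires_grad=True)
    opt = torch.optim.Adam([edge_logits], lr=lr)
    real_edges = (adj > 0).float()
    for _ in range(steps):
        mask = torch.sigmoid(edge_logits) * real_edges    # mask only real edges
        pred = model(x, adj * mask)
        loss = F.cross_entropy(pred, target) + sparsity * mask.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(edge_logits.detach()) * real_edges

# Example usage with the TinyGCN and adj_norm from the earlier sketch:
#   mask = learn_edge_mask(model, x, adj_norm, y)
```

Edges whose mask values stay near zero after optimization are candidates for removal; the cited paper reports that a large proportion of edges can be dropped this way without deteriorating the model's performance.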