GNN Explanations that do not Explain and How to find Them
- URL: http://arxiv.org/abs/2601.20815v2
- Date: Fri, 30 Jan 2026 11:29:50 GMT
- Title: GNN Explanations that do not Explain and How to find Them
- Authors: Steve Azzolin, Stefano Teso, Bruno Lepri, Andrea Passerini, Sagar Malhotra
- Abstract summary: We identify a critical failure of SE-GNN explanations: explanations can be unambiguously unrelated to how the SE-GNNs infer labels. Our empirical analysis reveals that degenerate explanations can be maliciously planted (allowing an attacker to hide the use of sensitive attributes) and can also emerge naturally. To address this, we introduce a novel faithfulness metric that reliably marks degenerate explanations as unfaithful.
- Score: 20.68967246188274
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Explanations provided by Self-explainable Graph Neural Networks (SE-GNNs) are fundamental for understanding the model's inner workings and for identifying potential misuse of sensitive attributes. Although recent works have highlighted that these explanations can be suboptimal and potentially misleading, a characterization of their failure cases is unavailable. In this work, we identify a critical failure of SE-GNN explanations: explanations can be unambiguously unrelated to how the SE-GNNs infer labels. We show that, on the one hand, many SE-GNNs can achieve optimal true risk while producing these degenerate explanations, and on the other, most faithfulness metrics can fail to identify these failure modes. Our empirical analysis reveals that degenerate explanations can be maliciously planted (allowing an attacker to hide the use of sensitive attributes) and can also emerge naturally, highlighting the need for reliable auditing. To address this, we introduce a novel faithfulness metric that reliably marks degenerate explanations as unfaithful, in both malicious and natural settings. Our code is available in the supplemental.
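The abstract's core idea, that an explanation is unfaithful when it is unrelated to how the model infers labels, is commonly operationalized via sufficiency-style checks. The following is a minimal illustrative sketch, not the paper's actual metric: a toy graph "classifier" and a check of whether predicting on the explanation subgraph alone preserves the full-graph prediction. All names (`toy_gnn`, `faithfulness`) are hypothetical.

```python
def toy_gnn(edges):
    """Toy stand-in for a graph classifier: label 1 iff the graph
    contains a triangle, else 0."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for u, v in edges:
        # A shared neighbor of u and v closes a triangle.
        if adj.get(u, set()) & adj.get(v, set()):
            return 1
    return 0

def faithfulness(model, edges, explanation):
    """Sufficiency-style check: 1.0 if the prediction on the
    explanation subgraph alone matches the prediction on the full
    graph, else 0.0."""
    return float(model(explanation) == model(edges))

graph = [(0, 1), (1, 2), (2, 0), (2, 3)]  # triangle 0-1-2 plus a tail edge
good = [(0, 1), (1, 2), (2, 0)]           # the triangle: truly relevant
degenerate = [(2, 3)]                     # an edge unrelated to the label

print(faithfulness(toy_gnn, graph, good))        # 1.0
print(faithfulness(toy_gnn, graph, degenerate))  # 0.0
```

The degenerate explanation fails the check precisely because removing everything outside it destroys the prediction, which is the kind of failure mode a reliable auditing metric should flag.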
Related papers
- Beyond Topological Self-Explainable GNNs: A Formal Explainability Perspective [19.270404394350944]
Self-Explainable Graph Neural Networks (SE-GNNs) are popular explainable-by-design GNNs. Our first contribution fills this gap by formalizing the explanations extracted by some popular SE-GNNs. We propose Dual-Channel GNNs that integrate a white-box rule extractor and a standard SE-GNN.
arXiv Detail & Related papers (2025-02-04T21:08:23Z)
- Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs [18.33293911039292]
We show that different faithfulness metrics exist, raising the question of what faithfulness is exactly and how to achieve it. We prove that for injective regular GNN architectures, perfectly faithful explanations are completely uninformative. Finally, we show that faithfulness is tightly linked to out-of-distribution generalization.
arXiv Detail & Related papers (2024-06-21T14:01:23Z)
- Explainable Graph Neural Networks Under Fire [69.15708723429307]
Graph neural networks (GNNs) usually lack interpretability due to their complex computational behavior and the abstract nature of graphs.
Most GNN explanation methods work in a post-hoc manner and provide explanations in the form of a small subset of important edges and/or nodes.
In this paper we demonstrate that these explanations unfortunately cannot be trusted, as common GNN explanation methods turn out to be highly susceptible to adversarial perturbations.
arXiv Detail & Related papers (2024-06-10T16:09:16Z)
- Link Stealing Attacks Against Inductive Graph Neural Networks [60.931106032824275]
A graph neural network (GNN) is a type of neural network that is specifically designed to process graph-structured data.
Previous work has shown that transductive GNNs are vulnerable to a series of privacy attacks.
This paper conducts a comprehensive privacy analysis of inductive GNNs through the lens of link stealing attacks.
arXiv Detail & Related papers (2024-05-09T14:03:52Z)
- Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation [41.831831628421675]
Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection.
We propose CFExplainer, a novel counterfactual explainer for GNN-based vulnerability detection.
arXiv Detail & Related papers (2024-04-24T06:52:53Z)
- Certified Defense on the Fairness of Graph Neural Networks [86.14235652889242]
Graph Neural Networks (GNNs) have emerged as a prominent graph learning model in various graph-based tasks. Malicious attackers could easily corrupt the fairness level of their predictions by adding perturbations to the input graph data. We propose a principled framework named ELEGANT to study a novel problem of certifiable defense on the fairness level of GNNs.
arXiv Detail & Related papers (2023-11-05T20:29:40Z)
- Faithful and Consistent Graph Neural Network Explanations with Rationale Alignment [38.66324833510402]
Instance-level GNN explanation aims to discover critical input elements, like nodes or edges, that the target GNN relies upon for making predictions.
Various algorithms have been proposed; most formalize this task as searching for the minimal subgraph that preserves the original predictions.
Several different subgraphs can result in the same or similar outputs as the original graph, so the selected explanation is not unique. Applying these methods to explain weakly performing GNNs would further amplify these issues.
arXiv Detail & Related papers (2023-01-07T06:33:35Z)
- On Consistency in Graph Neural Network Interpretation [34.25952902469481]
Instance-level GNN explanation aims to discover critical input elements, like nodes or edges, that the target GNN relies upon for making predictions.
Various algorithms have been proposed, but most formalize this task as searching for the minimal subgraph.
We propose a simple yet effective countermeasure by aligning embeddings.
arXiv Detail & Related papers (2022-05-27T02:58:07Z)
- Task-Agnostic Graph Explanations [50.17442349253348]
Graph Neural Networks (GNNs) have emerged as powerful tools to encode graph structured data.
Existing learning-based GNN explanation approaches are task-specific in training.
We propose a Task-Agnostic GNN Explainer (TAGE) trained under self-supervision with no knowledge of downstream tasks.
arXiv Detail & Related papers (2022-02-16T21:11:47Z)
- Jointly Attacking Graph Neural Network and its Explanations [50.231829335996814]
Graph Neural Networks (GNNs) have boosted the performance for many graph-related tasks.
Recent studies have shown that GNNs are highly vulnerable to adversarial attacks, where adversaries can mislead a GNN's predictions by modifying graphs.
We propose a novel attack framework (GEAttack) which can attack both a GNN model and its explanations by simultaneously exploiting their vulnerabilities.
arXiv Detail & Related papers (2021-08-07T07:44:33Z)
- Parameterized Explainer for Graph Neural Network [49.79917262156429]
We propose PGExplainer, a parameterized explainer for Graph Neural Networks (GNNs).
Compared to existing work, PGExplainer has better generalization ability and can easily be used in an inductive setting.
Experiments on both synthetic and real-life datasets show highly competitive performance with up to 24.7% relative improvement in AUC on explaining graph classification.
arXiv Detail & Related papers (2020-11-09T17:15:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.