Related papers: ATEX-CF: Attack-Informed Counterfactual Explanations for Graph Neural Networks

URL: http://arxiv.org/abs/2602.06240v1
Date: Thu, 05 Feb 2026 22:36:30 GMT
Title: ATEX-CF: Attack-Informed Counterfactual Explanations for Graph Neural Networks
Authors: Yu Zhang, Sean Bin Yang, Arijit Khan, Cuneyt Gurcan Akcora
Abstract summary: ATEX-CF is a framework that unifies adversarial attack techniques with counterfactual explanation generation. Our method efficiently integrates both edge additions and deletions, grounded in theory, to explore impactful counterfactuals. Experiments on synthetic and real-world node classification benchmarks demonstrate that ATEX-CF generates faithful, concise, and plausible explanations.
Score: 11.482900418658078
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Counterfactual explanations offer an intuitive way to interpret graph neural networks (GNNs) by identifying minimal changes that alter a model's prediction, thereby answering "what must differ for a different outcome?". In this work, we propose a novel framework, ATEX-CF, that unifies adversarial attack techniques with counterfactual explanation generation, a connection made feasible by their shared goal of flipping a node's prediction. The two differ in perturbation strategy: adversarial attacks often rely on edge additions, while counterfactual methods typically use deletions. Unlike traditional approaches that treat explanation and attack separately, our method efficiently integrates both edge additions and deletions, grounded in theory, leveraging adversarial insights to explore impactful counterfactuals. In addition, by jointly optimizing fidelity, sparsity, and plausibility under a constrained perturbation budget, our method produces instance-level explanations that are both informative and realistic. Experiments on synthetic and real-world node classification benchmarks demonstrate that ATEX-CF generates faithful, concise, and plausible explanations, highlighting the effectiveness of integrating adversarial insights into counterfactual reasoning for GNNs.
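Illustration: the idea of flipping a node's prediction with a small budget of edge additions and deletions can be sketched on a toy problem. The code below is not the ATEX-CF algorithm; it is a plain greedy search written for illustration, assuming a hypothetical one-step mean-aggregation classifier in place of a trained GNN. All names (`predict_logits`, `margin`, `greedy_counterfactual`) and the toy graph are assumptions, and the sketch optimizes only the prediction margin, not plausibility.

```python
import numpy as np


def predict_logits(adj, feats, weights, node):
    """Toy stand-in for a GNN: average features over the node and its
    neighbours (self-loop included), then apply a linear layer."""
    mask = adj[node].astype(bool).copy()
    mask[node] = True
    return feats[mask].mean(axis=0) @ weights


def margin(logits, cls):
    """Logit of class `cls` minus the best competing logit (negative = flipped)."""
    return logits[cls] - np.delete(logits, cls).max()


def greedy_counterfactual(adj, feats, weights, node, budget):
    """Greedily toggle edges incident to `node` (add or delete) until the
    predicted class changes or the edit budget runs out."""
    adj = adj.copy()
    orig = int(np.argmax(predict_logits(adj, feats, weights, node)))
    edits = []
    for _ in range(budget):
        best_u, best_m = None, None
        for u in range(adj.shape[0]):
            # Skip the node itself and edges already edited (no undoing).
            if u == node or any(u == e[1] for e in edits):
                continue
            trial = adj.copy()
            trial[node, u] = trial[u, node] = 1 - trial[node, u]
            m = margin(predict_logits(trial, feats, weights, node), orig)
            if best_m is None or m < best_m:
                best_u, best_m = u, m
        if best_u is None:
            break
        op = "delete" if adj[node, best_u] else "add"
        adj[node, best_u] = adj[best_u, node] = 1 - adj[node, best_u]
        edits.append((op, best_u))
        if best_m < 0:  # the original class no longer wins
            return edits, adj
    return None, adj  # no counterfactual found within the budget


# Hypothetical toy graph: nodes 0-1 carry class-0 features, nodes 2-3 class-1.
feats = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
weights = np.eye(2)
adj = np.zeros((4, 4), dtype=int)
adj[0, 1] = adj[1, 0] = 1
adj[2, 3] = adj[3, 2] = 1

before = int(np.argmax(predict_logits(adj, feats, weights, 0)))
edits, cf_adj = greedy_counterfactual(adj, feats, weights, node=0, budget=3)
after = int(np.argmax(predict_logits(cf_adj, feats, weights, 0)))
print(before, after, edits)
```

On this toy graph, neither additions alone nor deletions alone can flip node 0 within three edits, which is the kind of situation where mixing both perturbation types matters.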

Related papers

Counterfactual Explanations for Hypergraph Neural Networks [2.342443373878122]
Hypergraph neural networks (HGNNs) effectively model higher-order interactions in many real-world systems. We introduce CF-HyperGNNExplainer, a counterfactual explanation method for HGNNs.
arXiv Detail & Related papers (2026-02-04T09:34:03Z)
A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs [54.62268052283014]
We present a unified theoretical perspective based on the framework of signed graphs. We show that many existing strategies implicitly introduce negative edges that alter message-passing to resist oversmoothing. We propose Structural Balanced Propagation (SBP), a plug-and-play method that assigns signed edges based on either labels or feature similarity.
arXiv Detail & Related papers (2025-02-17T03:25:36Z)
HGAttack: Transferable Heterogeneous Graph Adversarial Attack [63.35560741500611]
Heterogeneous Graph Neural Networks (HGNNs) are increasingly recognized for their performance in areas like the web and e-commerce. This paper introduces HGAttack, the first dedicated gray box evasion attack method for heterogeneous graphs.
arXiv Detail & Related papers (2024-01-18T12:47:13Z)
Factorized Explainer for Graph Neural Networks [7.382632811417645]
Graph Neural Networks (GNNs) have received increasing attention due to their ability to learn from graph-structured data. Post-hoc instance-level explanation methods have been proposed to understand GNN predictions. We introduce a novel factorized explanation model with theoretical performance guarantees.
arXiv Detail & Related papers (2023-12-09T15:29:45Z)
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation [6.307753856507624]
Graph Neural Networks (GNNs) have received increasing attention due to their ability to learn from graph-structured data. Post-hoc instance-level explanation methods have been proposed to understand GNN predictions. We shed light on the existence of the distribution shifting issue in existing methods, which affects explanation quality.
arXiv Detail & Related papers (2023-07-15T15:46:38Z)
Resisting Graph Adversarial Attack via Cooperative Homophilous Augmentation [60.50994154879244]
Recent studies show that Graph Neural Networks are vulnerable and easily fooled by small perturbations. In this work, we focus on the emerging but critical attack, namely, Graph Injection Attack. We propose a general defense framework CHAGNN against GIA through cooperative homophilous augmentation of graph data and model.
arXiv Detail & Related papers (2022-11-15T11:44:31Z)
What Does the Gradient Tell When Attacking the Graph Structure [44.44204591087092]
We present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs. By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous. We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility.
arXiv Detail & Related papers (2022-08-26T15:45:20Z)
Improved and Interpretable Defense to Transferred Adversarial Examples by Jacobian Norm with Selective Input Gradient Regularization [31.516568778193157]
Adversarial training (AT) is often adopted to improve the robustness of deep neural networks (DNNs). In this work, we propose an approach based on Jacobian norm and Selective Input Gradient Regularization (J-SIGR). Experiments demonstrate that the proposed J-SIGR confers improved robustness against transferred adversarial attacks, and we also show that the predictions from the neural network are easy to interpret.
arXiv Detail & Related papers (2022-07-09T01:06:41Z)
Learning from Attacks: Attacking Variational Autoencoder for Improving Image Classification [17.881134865491063]
Adversarial attacks are often considered threats to the robustness of Deep Neural Networks (DNNs). This work analyzes adversarial attacks from a different perspective: adversarial examples contain implicit information that is useful to the predictions. We propose an algorithmic framework that leverages the advantages of the DNNs for data self-expression and task-specific predictions.
arXiv Detail & Related papers (2022-03-11T08:48:26Z)
Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP). MAP causes natural images to be misclassified with high probability after being updated through only a one-step gradient ascent update. We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
Jointly Attacking Graph Neural Network and its Explanations [50.231829335996814]
Graph Neural Networks (GNNs) have boosted the performance for many graph-related tasks. Recent studies have shown that GNNs are highly vulnerable to adversarial attacks, where adversaries can mislead the GNNs' prediction by modifying graphs. We propose a novel attack framework (GEAttack) which can attack both a GNN model and its explanations by simultaneously exploiting their vulnerabilities.
arXiv Detail & Related papers (2021-08-07T07:44:33Z)
Transferable Perturbations of Deep Feature Distributions [102.94094966908916]
This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
arXiv Detail & Related papers (2020-04-27T00:32:25Z)
