Deconfounded Training for Graph Neural Networks
- URL: http://arxiv.org/abs/2112.15089v1
- Date: Thu, 30 Dec 2021 15:22:35 GMT
- Title: Deconfounded Training for Graph Neural Networks
- Authors: Yongduo Sui, Xiang Wang, Jiancan Wu, Xiangnan He, Tat-Seng Chua
- Abstract summary: We present a new paradigm of deconfounded training (DTP) that better mitigates the confounding effect and latches on the critical information.
Specifically, we adopt the attention modules to disentangle the critical subgraph and trivial subgraph.
It allows GNNs to capture a more reliable subgraph whose relation with the label is robust across different distributions.
- Score: 98.06386851685645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning powerful representations is one central theme of graph neural
networks (GNNs). It requires refining the critical information from the input
graph, instead of the trivial patterns, to enrich the representations. Towards
this end, graph attention and pooling methods prevail. They mostly follow the
paradigm of "learning to attend", which maximizes the mutual information between
the attended subgraph and the ground-truth label. However, this training
paradigm is prone to capture the spurious correlations between the trivial
subgraph and the label. Such spurious correlations are beneficial to
in-distribution (ID) test evaluations, but cause poor generalization in the
out-of-distribution (OOD) test data. In this work, we revisit the GNN modeling
from the causal perspective. Under our causal assumption, the trivial
information serves as a confounder between the critical information and the
label, which opens a backdoor path between them and makes them spuriously
correlated. Hence, we present a new paradigm of deconfounded training (DTP)
that better mitigates the confounding effect and latches on the critical
information, to enhance the representation and generalization ability.
Specifically, we adopt the attention modules to disentangle the critical
subgraph and trivial subgraph. Then we make each critical subgraph fairly
interact with diverse trivial subgraphs to achieve a stable prediction. It
allows GNNs to capture a more reliable subgraph whose relation with the label
is robust across different distributions. We conduct extensive experiments on
synthetic and real-world datasets to demonstrate its effectiveness.
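
To make the mechanism above concrete, here is a minimal, hypothetical PyTorch-style sketch (not the authors' released code; the module name DeconfoundedReadout, the node-level attention, and the batch-wise mixing of trivial representations are illustrative assumptions). It splits each graph into attention-weighted critical and trivial readouts, then pairs every critical representation with trivial representations drawn from other graphs, approximating the "fair interaction" with diverse trivial subgraphs described in the abstract.

```python
import torch
import torch.nn as nn

class DeconfoundedReadout(nn.Module):
    """Illustrative attention-based readout: split each graph into a critical
    and a trivial representation, then mix trivial parts across graphs to
    approximate a backdoor-style intervention (sketch, not the paper's code)."""

    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        # Node-level attention that softly assigns each node to the critical
        # subgraph (score near 1) or the trivial subgraph (score near 0).
        self.att = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, node_emb: torch.Tensor, graph_idx: torch.Tensor, num_graphs: int):
        # node_emb: [num_nodes, hidden_dim] from any GNN backbone
        # graph_idx: [num_nodes] long tensor with the graph id of each node
        a = self.att(node_emb)                                   # [num_nodes, 1]
        crit = node_emb.new_zeros(num_graphs, node_emb.size(1))
        triv = node_emb.new_zeros(num_graphs, node_emb.size(1))
        crit.index_add_(0, graph_idx, a * node_emb)              # critical readout
        triv.index_add_(0, graph_idx, (1.0 - a) * node_emb)      # trivial readout
        return crit, triv

    def intervene(self, crit: torch.Tensor, triv: torch.Tensor):
        # Pair every critical representation with trivial representations from
        # *other* graphs in the batch, so the prediction cannot lean on any
        # particular trivial pattern (the confounder).
        perm = torch.randperm(triv.size(0), device=triv.device)
        logits_obs = self.classifier(torch.cat([crit, triv], dim=-1))
        logits_int = self.classifier(torch.cat([crit, triv[perm]], dim=-1))
        return logits_obs, logits_int

# Sketched training step (hypothetical): both the observed and the intervened
# logits should match the label, pushing the model to rely on the critical
# subgraph alone.
#   crit, triv = readout(node_emb, graph_idx, num_graphs)
#   obs, intv = readout.intervene(crit, triv)
#   loss = F.cross_entropy(obs, y) + F.cross_entropy(intv, y)
```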
Related papers
- Graph Partial Label Learning with Potential Cause Discovering [24.659793052786814]
Graph Neural Networks (GNNs) have garnered widespread attention for their potential to address the challenges posed by graph representation learning.
Due to the inherent complexity and interconnectedness of graphs, accurately annotating graph data for training GNNs is extremely challenging.
arXiv Detail & Related papers (2024-03-18T03:56:34Z) - Graph Out-of-Distribution Generalization via Causal Intervention [69.70137479660113]
We introduce a conceptually simple yet principled approach for training robust graph neural networks (GNNs) under node-level distribution shifts.
Our method resorts to a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-expert GNN predictor.
Our model can effectively enhance generalization with various types of distribution shifts and yield up to 27.4% accuracy improvement over state-of-the-arts on graph OOD generalization benchmarks.
arXiv Detail & Related papers (2024-02-18T07:49:22Z) - Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph
Matching [68.35685422301613]
We propose a novel non-parametric subgraph matching framework, dubbed MatchExplainer, to explore explanatory subgraphs.
It couples the target graph with other counterpart instances and identifies the most crucial joint substructure by minimizing a node-correspondence-based distance.
Experiments on synthetic and real-world datasets show the effectiveness of our MatchExplainer, which outperforms all state-of-the-art parametric baselines by significant margins.
arXiv Detail & Related papers (2023-01-07T05:14:45Z) - Debiasing Graph Neural Networks via Learning Disentangled Causal
Substructure [46.86463923605841]
We investigate graph classification on training graphs with severe bias.
We discover that GNNs tend to exploit spurious correlations to make decisions.
We propose a general disentangled GNN framework to learn the causal substructure and bias substructure.
arXiv Detail & Related papers (2022-09-28T13:55:52Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - OOD-GNN: Out-of-Distribution Generalized Graph Neural Network [73.67049248445277]
Graph neural networks (GNNs) have achieved impressive performance when testing and training graph data come from the same distribution.
Existing GNNs lack out-of-distribution generalization abilities, so their performance substantially degrades when there are distribution shifts between testing and training graph data.
We propose an out-of-distribution generalized graph neural network (OOD-GNN) for achieving satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs.
arXiv Detail & Related papers (2021-12-07T16:29:10Z) - Generalizing Graph Neural Networks on Out-Of-Distribution Graphs [51.33152272781324]
Most Graph Neural Networks (GNNs) are proposed without considering the distribution shifts between training and testing graphs.
In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for predictions, even when those correlations are spurious.
We propose a general causal representation framework, called StableGNN, to eliminate the impact of spurious correlations.
arXiv Detail & Related papers (2021-11-20T18:57:18Z) - Sub-graph Contrast for Scalable Self-Supervised Graph Representation
Learning [21.0019144298605]
Existing graph neural networks fed with the complete graph data are not scalable due to their computation and memory costs.
Subg-Con is proposed by utilizing the strong correlation between central nodes and their sampled subgraphs to capture regional structure information.
Compared with existing graph representation learning approaches, Subg-Con has prominent performance advantages in weaker supervision requirements, model learning scalability, and parallelization.
arXiv Detail & Related papers (2020-09-22T01:58:19Z)