Graph Out-of-Distribution Generalization with Controllable Data
Augmentation
- URL: http://arxiv.org/abs/2308.08344v1
- Date: Wed, 16 Aug 2023 13:10:27 GMT
- Title: Graph Out-of-Distribution Generalization with Controllable Data
Augmentation
- Authors: Bin Lu, Xiaoying Gan, Ze Zhao, Shiyu Liang, Luoyi Fu, Xinbing Wang,
Chenghu Zhou
- Abstract summary: Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
- Score: 51.17476258673232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) have demonstrated extraordinary
performance in classifying graph properties. However, due to the selection
bias of training and testing data (e.g., training on small graphs and testing
on large graphs, or training on dense graphs and testing on sparse graphs),
distribution deviation is widespread. More importantly, we often observe a
\emph{hybrid structure distribution shift} in both scale and density, despite
the one-sided bias of the data partition. The spurious correlations under
hybrid distribution deviation degrade the performance of previous GNN methods
and exhibit large instability across different datasets. To alleviate this
problem, we propose \texttt{OOD-GMixup}, which jointly manipulates the
training distribution with \emph{controllable data augmentation} in metric
space. Specifically, we first extract graph rationales to eliminate the
spurious correlations caused by irrelevant information. Secondly, we generate
virtual samples by perturbing the graph rationale representations to obtain
potential OOD training samples. Finally, we propose OOD calibration, which
measures the distribution deviation of virtual samples by leveraging Extreme
Value Theory, and further actively controls the training distribution by
emphasizing the impact of virtual OOD samples. Extensive experiments on
several real-world graph classification datasets demonstrate the superiority
of our proposed method over state-of-the-art baselines.
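The abstract outlines a three-step pipeline: rationale extraction, virtual-sample generation, and EVT-based calibration. Below is a minimal, hedged sketch of that pipeline on plain NumPy arrays; the function names and the magnitude-based rationale proxy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_rationale(h: np.ndarray, keep: float = 0.7) -> np.ndarray:
    """Crude rationale proxy: keep only the highest-magnitude embedding dims."""
    k = max(1, int(keep * h.size))
    mask = np.zeros_like(h)
    mask[np.argsort(np.abs(h))[-k:]] = 1.0
    return h * mask

def virtual_samples(rationale: np.ndarray, n: int = 8, sigma: float = 0.5) -> np.ndarray:
    """Perturb the rationale representation to obtain potential OOD samples."""
    return rationale + sigma * rng.standard_normal((n, rationale.size))

def ood_weights(samples: np.ndarray, train_mean: np.ndarray, q: float = 0.9) -> np.ndarray:
    """EVT-flavoured calibration (peaks-over-threshold): up-weight samples whose
    distance to the training mean exceeds a high empirical quantile."""
    dist = np.linalg.norm(samples - train_mean, axis=1)
    u = np.quantile(dist, q)                     # POT threshold
    excess = np.clip(dist - u, 0.0, None)        # exceedances beyond the threshold
    return 1.0 + excess / (excess.max() + 1e-8)  # emphasize the most OOD samples

h = rng.standard_normal(16)                      # stand-in graph-level embedding
w = ood_weights(virtual_samples(extract_rationale(h)), train_mean=np.zeros(16))
print(w)                                         # per-sample training weights
```

In a training loop, such weights would multiply the loss terms of the virtual samples, which is one plausible way to "emphasize the impact of virtual OOD samples" as the abstract describes.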
Related papers
- Addressing the Impact of Localized Training Data in Graph Neural Networks [0.0]
Graph Neural Networks (GNNs) have achieved notable success in learning from graph-structured data.
This article aims to assess the impact of training GNNs on localized subsets of the graph.
We propose a regularization method that minimizes the distributional discrepancy between the localized training data and the graph encountered at inference.
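The summary does not spell out the regularizer, so the following is only a plausible stand-in: an RBF-kernel maximum mean discrepancy (MMD) between embeddings of the localized training region and embeddings sampled from the whole graph, which could be added to the GNN loss as a penalty.

```python
import numpy as np

def rbf_mmd2(x: np.ndarray, y: np.ndarray, gamma: float = 1.0) -> float:
    """Squared MMD between two embedding sets under an RBF kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())

rng = np.random.default_rng(1)
local_emb = rng.normal(0.0, 1.0, (32, 8))  # embeddings of localized training nodes
graph_emb = rng.normal(0.5, 1.2, (64, 8))  # embeddings sampled across the full graph
print(rbf_mmd2(local_emb, graph_emb))      # candidate discrepancy penalty
```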
arXiv Detail & Related papers (2023-07-24T11:04:22Z)
- Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning [30.23894624193583]
Training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs.
We develop a unified data-model dynamic sparsity framework named Graph Decantation (GraphDec) to address the challenges of training on massive, class-imbalanced graph data.
arXiv Detail & Related papers (2022-10-01T01:47:00Z)
- Graph Condensation via Receptive Field Distribution Matching [61.71711656856704]
This paper focuses on creating a small graph to represent the original graph, so that GNNs trained on the size-reduced graph can make accurate predictions.
We view the original graph as a distribution of receptive fields and aim to synthesize a small graph whose receptive fields share a similar distribution.
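As a hedged illustration of distribution matching (the paper's exact objective is not given in this summary), the toy below nudges a small set of synthetic receptive-field embeddings so that their first two moments match those of the original graph's embeddings.

```python
import numpy as np

rng = np.random.default_rng(2)
orig = rng.normal(1.0, 2.0, (500, 4))  # receptive-field embeddings of the big graph
syn = rng.standard_normal((20, 4))     # embeddings of the small synthetic graph

def moment_gap(a: np.ndarray, b: np.ndarray) -> float:
    """Distance between the means and standard deviations of two embedding sets."""
    return float(np.linalg.norm(a.mean(0) - b.mean(0)) +
                 np.linalg.norm(a.std(0) - b.std(0)))

for _ in range(500):                   # crude gradient-free matching loop
    trial = syn + 0.05 * rng.standard_normal(syn.shape)
    if moment_gap(orig, trial) < moment_gap(orig, syn):
        syn = trial
print(round(moment_gap(orig, syn), 3))  # gap shrinks as the distributions align
```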
arXiv Detail & Related papers (2022-06-28T02:10:05Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
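A hedged toy of the bi-level pattern (not the paper's algorithm): the inner problem fits model weights for a fixed structure, and the outer loop updates the structure parameters via a finite-difference gradient of the outer objective; a rank-1 feature modulation stands in for the low-rank approximation.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 4))
y = rng.standard_normal(20)
s = 0.1 * rng.standard_normal(4)  # rank-1 "structure" parameters (low-rank proxy)

def inner_fit(s: np.ndarray) -> np.ndarray:
    """Inner problem: closed-form ridge regression on structure-modulated features."""
    Xs = X * s
    return np.linalg.solve(Xs.T @ Xs + 0.1 * np.eye(4), Xs.T @ y)

def outer_loss(s: np.ndarray) -> float:
    """Outer problem: evaluate the inner solution on the outer objective."""
    w = inner_fit(s)
    return float((((X * s) @ w - y) ** 2).mean())

for _ in range(50):               # outer loop: finite-difference gradient on s
    g = np.zeros_like(s)
    for i in range(s.size):
        e = np.zeros_like(s); e[i] = 1e-4
        g[i] = (outer_loss(s + e) - outer_loss(s - e)) / 2e-4
    s -= 0.1 * g
print(outer_loss(s))
```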
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- OOD-GNN: Out-of-Distribution Generalized Graph Neural Network [73.67049248445277]
Graph neural networks (GNNs) have achieved impressive performance when testing and training graph data come from an identical distribution.
However, existing GNNs lack out-of-distribution generalization ability, so their performance degrades substantially under distribution shifts between testing and training graph data.
We propose an out-of-distribution generalized graph neural network (OOD-GNN) to achieve satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs.
arXiv Detail & Related papers (2021-12-07T16:29:10Z)
- Graph Classification by Mixture of Diverse Experts [67.33716357951235]
We present GraphDIVE, a framework leveraging a mixture of diverse experts for imbalanced graph classification.
Following a divide-and-conquer principle, GraphDIVE employs a gating network to partition an imbalanced graph dataset into several subsets.
Experiments on real-world imbalanced graph datasets demonstrate the effectiveness of GraphDIVE.
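The gating mechanism summarized above admits a compact sketch; the layer sizes, the soft (rather than hard) partition, and the two-class setup below are illustrative assumptions, not GraphDIVE's exact design.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_experts, n_classes = 8, 3, 2
Wg = rng.standard_normal((d, n_experts))             # gating network (linear, for brevity)
We = rng.standard_normal((n_experts, d, n_classes))  # one linear classifier per expert

def softmax(z: np.ndarray, axis: int = -1) -> np.ndarray:
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def predict(h: np.ndarray) -> np.ndarray:
    gate = softmax(h @ Wg)                    # (n, experts): soft dataset partition
    logits = np.einsum("nd,edc->nec", h, We)  # each expert's class logits
    return (gate[..., None] * softmax(logits)).sum(axis=1)  # gate-weighted mixture

h = rng.standard_normal((5, d))               # stand-in graph-level embeddings
print(predict(h).round(3))                    # each row sums to 1
```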
arXiv Detail & Related papers (2021-03-29T14:03:03Z)
- Size-Invariant Graph Representations for Graph Classification Extrapolations [6.143735952091508]
In general, graph representation learning methods assume that the test and train data come from the same distribution.
Our work shows it is possible to use a causal model to learn approximately invariant representations that better extrapolate between train and test data.
arXiv Detail & Related papers (2021-03-08T20:01:59Z)
- Hyperbolic Graph Embedding with Enhanced Semi-Implicit Variational Inference [48.63194907060615]
We build on semi-implicit graph variational auto-encoders to capture higher-order statistics in a low-dimensional graph latent representation.
We incorporate hyperbolic geometry in the latent space through a Poincaré embedding to efficiently represent graphs exhibiting hierarchical structure.
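One standard way to place Euclidean latents on the Poincaré ball is the exponential map at the origin, sketched below; this is a common construction assumed here for illustration, since the summary does not specify the paper's exact mapping.

```python
import numpy as np

def expmap0(v: np.ndarray, c: float = 1.0) -> np.ndarray:
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = np.clip(np.linalg.norm(v, axis=-1, keepdims=True), 1e-9, None)
    return np.tanh(np.sqrt(c) * norm) * v / (np.sqrt(c) * norm)

z = np.random.default_rng(5).standard_normal((3, 2))  # Euclidean latents
x = expmap0(z)
print(np.linalg.norm(x, axis=-1))  # all norms < 1: points lie inside the ball
```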
arXiv Detail & Related papers (2020-10-31T05:48:34Z)
- Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximation framework for such non-trivial ERGs that yields dyadic-independence (i.e., edge-independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
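Dyadic independence is what makes such models cheap to sample: each dyad gets its own edge probability and one independent coin flip, as in this hedged toy (the per-dyad parameters are random stand-ins, not a fitted ERG).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 6
theta = rng.standard_normal((n, n))               # stand-in per-dyad parameters
P = 1.0 / (1.0 + np.exp(-(theta + theta.T) / 2))  # symmetric edge probabilities
A = np.triu((rng.random((n, n)) < P).astype(int), 1)
A = A + A.T                                       # undirected adjacency, no self-loops
print(A)
```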
arXiv Detail & Related papers (2020-02-14T11:42:16Z)