Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph
Representation Learning
- URL: http://arxiv.org/abs/2210.00162v1
- Date: Sat, 1 Oct 2022 01:47:00 GMT
- Title: Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph
Representation Learning
- Authors: Chunhui Zhang, Chao Huang, Yijun Tian, Qianlong Wen, Zhongyu Ouyang,
Youhuan Li, Yanfang Ye, Chuxu Zhang
- Abstract summary: Training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs.
We develop a unified data-model dynamic sparsity framework named Graph Decantation (GraphDec) to address the challenges of training on massive, class-imbalanced graph data.
- Score: 30.23894624193583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Even when pruned by state-of-the-art network compression methods,
training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs
relatively high time costs, owing to the irregular structure and density of
graphs compared with data in regular Euclidean space. Another property that
naturally accompanies graphs is class imbalance, which is not alleviated by the
sheer volume of graph data and hinders GNNs' generalization. To tackle these
two properties together, (i) theoretically, we introduce a hypothesis on the
extent to which a subset of the training data can approximate the full
dataset's learning effectiveness; this effectiveness is further guaranteed and
proved via the distance between the gradients computed on the subset and on the
full set; (ii) empirically, we find that during GNN training, only some samples
in the training dataset are informative, i.e., provide useful gradients for
updating model parameters. Moreover, this informative subset is not fixed
during training: samples that are informative in the current training epoch may
not be so in the next one. We also observe that sparse subnets pruned from a
well-trained GNN sometimes forget the information carried by the informative
subset, reflected in their poor performance on that subset. Based on these
findings, we develop a unified data-model dynamic sparsity framework named
Graph Decantation (GraphDec) to address the challenges of training on massive,
class-imbalanced graph data. The key idea of GraphDec is to identify the
informative subset dynamically during training via sparse graph contrastive
learning. Extensive experiments on benchmark datasets demonstrate that GraphDec
outperforms baselines on graph-level and node-level classification tasks in
terms of both classification accuracy and data usage efficiency.
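The theoretical point (i) in the abstract can be stated in a compact generic form. The statement below is a hedged, coreset-style reading of the abstract rather than the paper's exact theorem; the symbols (training set D, subset S, per-sample loss ell, tolerance epsilon) are illustrative.

```latex
% Informal restatement of the subset-approximation hypothesis (a reading of the
% abstract, not the paper's exact theorem): a subset S of the training set D
% preserves the full dataset's learning effectiveness when the gradient it
% induces stays close to the full-data gradient at the current parameters theta:
\[
  \left\| \frac{1}{|S|}\sum_{i \in S} \nabla_\theta \,\ell(x_i;\theta)
        - \frac{1}{|D|}\sum_{j \in D} \nabla_\theta \,\ell(x_j;\theta) \right\|
  \;\le\; \epsilon .
\]
% If this bound holds along the optimization path, gradient steps computed on S
% track the steps computed on D up to an error controlled by epsilon, so
% training on S approximates training on the full set.
```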
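The abstract describes GraphDec only at a high level (a sparse contrastive model re-selects the informative subset every epoch), so the sketch below shows one plausible shape of such a selection loop in PyTorch. The scorer (a magnitude-pruned copy of the encoder), the toy two-view invariance loss, and names such as `sparsify`, `select_informative`, and `keep_ratio` are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of dynamic informative-subset selection (an assumed reading of
# the GraphDec idea, not the authors' code). A sparse copy of the encoder ranks
# training samples each epoch; only the top fraction feeds the optimizer step.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


def sparsify(model: nn.Module, amount: float = 0.8) -> nn.Module:
    """Return a magnitude-pruned copy of the model (a stand-in for the sparse subnet)."""
    scorer = copy.deepcopy(model)
    for module in scorer.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
    return scorer


def select_informative(scorer: nn.Module, x: torch.Tensor, keep_ratio: float = 0.3) -> torch.Tensor:
    """Score each sample under the sparse scorer; keep the highest-scoring fraction."""
    with torch.no_grad():
        # Toy contrastive-style score: disagreement between two noisy views of a sample.
        v1 = scorer(x + 0.1 * torch.randn_like(x))
        v2 = scorer(x + 0.1 * torch.randn_like(x))
        scores = (v1 - v2).pow(2).sum(dim=1)  # larger = harder / more informative
    k = max(1, int(keep_ratio * x.size(0)))
    return scores.topk(k).indices  # indices of this epoch's dynamic subset


# Toy usage: the MLP encoder and random features stand in for a GNN and graph data.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
features = torch.randn(256, 16)

for epoch in range(5):
    subset = select_informative(sparsify(encoder), features)  # re-selected every epoch
    batch = features[subset]
    z1 = encoder(batch + 0.1 * torch.randn_like(batch))
    z2 = encoder(batch + 0.1 * torch.randn_like(batch))
    loss = (z1 - z2).pow(2).sum(dim=1).mean()  # simple two-view invariance loss as a placeholder
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the sketch is the structure: the subset is re-ranked from scratch by a sparse model each epoch, so samples can enter or leave the informative set over time, matching the abstract's observation that informativeness is not fixed across epochs.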
Related papers
- TCGU: Data-centric Graph Unlearning based on Transferable Condensation [36.670771080732486]
Transferable Condensation Graph Unlearning (TCGU) is a data-centric solution to zero-glance graph unlearning.
We show that TCGU achieves superior performance to existing GU methods in terms of model utility, unlearning efficiency, and unlearning efficacy.
arXiv Detail & Related papers (2024-10-09T02:14:40Z) - Loss-aware Curriculum Learning for Heterogeneous Graph Neural Networks [30.333265803394998]
This paper investigates the application of curriculum learning techniques to improve the performance of Heterogeneous Graph Neural Networks (HGNNs).
To better assess the quality of the data, we design a loss-aware training schedule, named LTS, that measures the quality of every node in the data.
Our findings demonstrate the efficacy of curriculum learning in enhancing HGNNs' capabilities for analyzing complex graph-structured data.
arXiv Detail & Related papers (2024-02-29T05:44:41Z) - GraphGuard: Detecting and Counteracting Training Data Misuse in Graph
Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z) - Graph Out-of-Distribution Generalization with Controllable Data
Augmentation [51.17476258673232]
Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z) - Addressing the Impact of Localized Training Data in Graph Neural
Networks [0.0]
Graph Neural Networks (GNNs) have achieved notable success in learning from graph-structured data.
This article aims to assess the impact of training GNNs on localized subsets of the graph.
We propose a regularization method to minimize distributional discrepancies between localized training data and graph inference.
arXiv Detail & Related papers (2023-07-24T11:04:22Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - OOD-GNN: Out-of-Distribution Generalized Graph Neural Network [73.67049248445277]
Graph neural networks (GNNs) have achieved impressive performance when testing and training graph data come from the same distribution.
Existing GNNs lack out-of-distribution generalization abilities, so their performance degrades substantially when there are distribution shifts between testing and training graph data.
We propose an out-of-distribution generalized graph neural network (OOD-GNN) for achieving satisfactory performance on unseen testing graphs whose distributions differ from the training graphs.
arXiv Detail & Related papers (2021-12-07T16:29:10Z) - Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z) - Sub-graph Contrast for Scalable Self-Supervised Graph Representation
Learning [21.0019144298605]
Existing graph neural networks fed with the complete graph data are not scalable due to limited computation and memory resources.
Subg-Con is proposed by utilizing the strong correlation between central nodes and their sampled subgraphs to capture regional structure information.
Compared with existing graph representation learning approaches, Subg-Con has prominent performance advantages in weaker supervision requirements, model learning scalability, and parallelization.
arXiv Detail & Related papers (2020-09-22T01:58:19Z)