GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural
Networks
- URL: http://arxiv.org/abs/2103.08826v1
- Date: Tue, 16 Mar 2021 03:23:55 GMT
- Title: GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural
Networks
- Authors: Tianxiang Zhao, Xiang Zhang, Suhang Wang
- Abstract summary: Graph neural networks (GNNs) have achieved state-of-the-art performance on node classification.
We propose a novel framework, GraphSMOTE, in which an embedding space is constructed to encode the similarity among the nodes.
New samples are synthesized in this space to ensure genuineness.
- Score: 28.92347073786722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Node classification is an important research topic in graph learning. Graph
neural networks (GNNs) have achieved state-of-the-art performance on node
classification. However, existing GNNs address the problem where node samples
for different classes are balanced; while for many real-world scenarios, some
classes may have much fewer instances than others. Directly training a GNN
classifier in this case would under-represent samples from those minority
classes and result in sub-optimal performance. Therefore, it is very important
to develop GNNs for imbalanced node classification. However, the work on this
is rather limited. Hence, we seek to extend previous imbalanced learning
techniques for i.i.d. data to the imbalanced node classification task to
facilitate GNN classifiers. In particular, we choose to adopt synthetic
minority over-sampling algorithms, as they are found to be the most effective
and stable. This task is non-trivial, as previous synthetic minority
over-sampling algorithms fail to provide relation information for newly
synthesized samples, which is vital for learning on graphs. Moreover, node
attributes are high-dimensional. Directly over-sampling in the original input
domain could generate out-of-domain samples, which may impair the accuracy of
the classifier. We propose a novel framework, GraphSMOTE, in which an embedding
space is constructed to encode the similarity among the nodes. New samples are
synthesized in this space to ensure genuineness. In addition, an edge generator
is trained simultaneously to model the relation information, and provide it for
those new samples. This framework is general and can be easily extended into
different variations. The proposed framework is evaluated using three different
datasets, and it outperforms all baselines by a large margin.
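The two ingredients the abstract describes, SMOTE-style interpolation in an embedding space and an edge generator for the synthetic nodes, can be illustrated with a minimal NumPy sketch. The helper names are hypothetical, and the inner-product edge scorer stands in for the trained edge generator; this is an assumption for illustration, not the authors' implementation:

```python
import numpy as np

def smote_in_embedding_space(Z, labels, minority_class, n_new, seed=0):
    """Interpolate SMOTE-style between a random minority-class embedding and
    its nearest minority-class neighbor (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    minority = Z[labels == minority_class]            # (m, d) minority embeddings
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))               # random anchor embedding
        z = minority[i]
        dists = np.linalg.norm(minority - z, axis=1)  # distances to other minority nodes
        dists[i] = np.inf                             # exclude the anchor itself
        nn = minority[np.argmin(dists)]               # nearest minority neighbor
        delta = rng.random()                          # interpolation weight in [0, 1)
        synthetic.append(z + delta * (nn - z))        # convex combination of z and nn
    return np.stack(synthetic)

def connect_synthetic_nodes(Z_new, Z, threshold=0.5):
    """Stand-in for the trained edge generator: sigmoid of inner products,
    thresholded to decide which existing nodes each synthetic node links to."""
    scores = 1.0 / (1.0 + np.exp(-(Z_new @ Z.T)))     # (n_new, n) edge probabilities
    return scores > threshold                         # boolean adjacency rows
```

Because each synthetic embedding is a convex combination of two minority-class embeddings, it stays inside the region the minority class occupies, which is the "genuineness" argument for over-sampling in the embedding space rather than the raw input space.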
Related papers
- GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks [2.4175455407547015]
Graph neural networks learn to represent nodes by aggregating information from their neighbors.
Several existing methods address this by sampling a small subset of nodes, scaling GNNs to much larger graphs.
We introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN.
arXiv Detail & Related papers (2023-10-05T09:08:47Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
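For reference, the standard Gumbel-Softmax relaxation that NodeFormer's kernelized operator builds on can be sketched as follows (this is the textbook form, not NodeFormer's variant):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, seed=0):
    """Textbook Gumbel-Softmax: perturb logits with Gumbel(0, 1) noise,
    then apply a temperature-scaled softmax to get a soft sample."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(1e-10, 1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))                 # Gumbel(0, 1) samples
    y = (np.asarray(logits) + g) / tau      # tau -> 0 approaches a one-hot sample
    y = y - y.max()                         # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()
```

The output is a probability vector that differentiably approximates a categorical sample, which is what makes discrete structure learning trainable end to end.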
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Seq-HGNN: Learning Sequential Node Representation on Heterogeneous Graph [57.2953563124339]
We propose a novel heterogeneous graph neural network with sequential node representation, namely Seq-HGNN.
We conduct extensive experiments on four widely used datasets from Heterogeneous Graph Benchmark (HGB) and Open Graph Benchmark (OGB)
arXiv Detail & Related papers (2023-05-18T07:27:18Z)
- GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification [10.03027886793368]
Graph neural networks (GNNs) have achieved great success in node classification tasks.
Existing GNNs naturally bias towards the majority classes with more labelled data and ignore those minority classes with relatively few labelled ones.
We propose GraphSR, a novel self-training strategy that augments the minority classes with a significantly diverse set of unlabelled nodes.
arXiv Detail & Related papers (2023-02-24T18:49:10Z)
- Synthetic Over-sampling for Imbalanced Node Classification with Graph Neural Networks [34.81248024048974]
Graph neural networks (GNNs) have achieved state-of-the-art performance for node classification.
In many real-world scenarios, node classes are imbalanced, with some majority classes making up most parts of the graph.
In this work, we seek to address this problem by generating pseudo instances of minority classes to balance the training data.
arXiv Detail & Related papers (2022-06-10T19:47:05Z)
- Exploiting Neighbor Effect: Conv-Agnostic GNNs Framework for Graphs with Heterophily [58.76759997223951]
We propose a new metric based on von Neumann entropy to re-examine the heterophily problem of GNNs.
We also propose a Conv-Agnostic GNN framework (CAGNNs) to enhance the performance of most GNNs on heterophily datasets.
arXiv Detail & Related papers (2022-03-19T14:26:43Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- GraphMixup: Improving Class-Imbalanced Node Classification on Graphs by Self-supervised Context Prediction [25.679620842010422]
This paper presents GraphMixup, a novel mixup-based framework for improving class-imbalanced node classification on graphs.
We develop a Reinforcement Mixup mechanism to adaptively determine how many samples mixup should generate for those minority classes.
Experiments on three real-world datasets show that GraphMixup yields truly encouraging results for class-imbalanced node classification tasks.
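The vanilla mixup operation that GraphMixup builds on is a simple convex combination of two samples and their labels; a minimal sketch (generic mixup, not GraphMixup's semantic-level mechanism) looks like this:

```python
import numpy as np

def mixup(x_i, x_j, y_i, y_j, lam):
    """Vanilla feature-label mixup: blend two samples with weight lam.
    GraphMixup's Reinforcement Mixup decides how many such blended
    samples to generate for each minority class."""
    x = lam * x_i + (1.0 - lam) * x_j   # interpolated features
    y = lam * y_i + (1.0 - lam) * y_j   # interpolated (soft) labels
    return x, y
```

For example, `mixup` with `lam=0.5` returns the midpoint of the two feature vectors and a soft label halfway between the two class labels.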
arXiv Detail & Related papers (2021-06-21T14:12:16Z)
- Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data.
Recent work has argued that GNNs primarily use the graph for feature smoothing, and have shown competitive results on benchmark tasks.
In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
- Sequential Graph Convolutional Network for Active Learning [53.99104862192055]
We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN)
With a small number of randomly sampled images as seed labelled examples, we learn the parameters of the graph to distinguish labelled vs unlabelled nodes.
We exploit these characteristics of GCN to select the unlabelled examples which are sufficiently different from labelled ones.
arXiv Detail & Related papers (2020-06-18T00:55:10Z)
- Towards Deeper Graph Neural Networks with Differentiable Group Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN)
arXiv Detail & Related papers (2020-06-12T07:18:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all summaries) and is not responsible for any consequences of its use.