GraphSR: A Data Augmentation Algorithm for Imbalanced Node
Classification
- URL: http://arxiv.org/abs/2302.12814v2
- Date: Tue, 27 Jun 2023 14:01:09 GMT
- Title: GraphSR: A Data Augmentation Algorithm for Imbalanced Node
Classification
- Authors: Mengting Zhou and Zhiguo Gong
- Abstract summary: Graph neural networks (GNNs) have achieved great success in node classification tasks.
Existing GNNs naturally bias towards the majority classes with more labelled data and ignore those minority classes with relatively few labelled ones.
We propose GraphSR, a novel self-training strategy to augment the minority classes with significant diversity of unlabelled nodes.
- Score: 10.03027886793368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) have achieved great success in node
classification tasks. However, existing GNNs naturally bias towards the
majority classes with more labelled data and ignore those minority classes with
relatively few labelled ones. Traditional techniques often resort to
over-sampling methods, but these may cause overfitting. More recently,
some works propose to synthesize additional nodes for minority classes from the
labelled nodes; however, there is no guarantee that those generated nodes
really stand for the corresponding minority classes. In fact, improperly
synthesized nodes may result in insufficient generalization of the algorithm.
To resolve the problem, in this paper we seek to automatically augment the
minority classes from the massive unlabelled nodes of the graph. Specifically,
we propose GraphSR, a novel self-training strategy to augment the
minority classes with significant diversity of unlabelled nodes, which is based
on a Similarity-based selection module and a Reinforcement Learning (RL)
selection module. The first module finds a subset of unlabelled nodes which are
most similar to those labelled minority nodes, and the second one further
determines the representative and reliable nodes from the subset via an RL
technique. Furthermore, the RL-based module can adaptively determine the
sampling scale according to current training data. This strategy is general and
can be easily combined with different GNN models. Our experiments demonstrate
the proposed approach outperforms the state-of-the-art baselines on various
class-imbalanced datasets.
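The similarity-based selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how similarity is measured, so the code below assumes cosine similarity between each unlabelled node's embedding and the centroid of the labelled minority nodes' embeddings. The function names (`cosine`, `centroid`, `select_candidates`), the parameter `k`, and the toy embeddings are all hypothetical and are not taken from the paper. The RL-based second stage is not sketched here.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_candidates(
    embeddings: Dict[int, List[float]],
    minority_labelled: List[int],
    unlabelled: List[int],
    k: int,
) -> List[int]:
    """Return the k unlabelled node ids most similar to the minority centroid.

    Assumption: similarity is cosine similarity to the class centroid.
    The paper may define it differently.
    """
    center = centroid([embeddings[i] for i in minority_labelled])
    ranked = sorted(
        unlabelled,
        key=lambda i: cosine(embeddings[i], center),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D embeddings (hypothetical): nodes 0 and 1 are labelled minority nodes.
embeddings = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.8, 0.2],
    4: [0.1, 0.9],
    5: [0.7, 0.7],
}
candidates = select_candidates(
    embeddings, minority_labelled=[0, 1], unlabelled=[2, 3, 4, 5], k=2
)
print(candidates)  # [3, 5]
```

In this toy setup, nodes 3 and 5 point most nearly in the same direction as the minority centroid, so they form the candidate subset. In the approach the abstract describes, such a subset would then be filtered further by the RL-based module.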
Related papers
- Reinforcement Learning for Node Selection in Branch-and-Bound [52.2648997215667]
Current state-of-the-art selectors utilize either hand-crafted ensembles that automatically switch between naive sub-node selectors, or learned node selectors that rely on individual node data.
We propose a novel simulation technique that uses reinforcement learning (RL) while considering the entire tree state, rather than just isolated nodes.
arXiv Detail & Related papers (2023-09-29T19:55:56Z) - Contrastive Meta-Learning for Few-shot Node Classification [54.36506013228169]
Few-shot node classification aims to predict labels for nodes on graphs with only limited labeled nodes as references.
We create a novel contrastive meta-learning framework on graphs, named COSMIC, with two key designs.
arXiv Detail & Related papers (2023-06-27T02:22:45Z) - GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node
Classification [64.85392028383164]
Class imbalance is the phenomenon that some classes have much fewer instances than others.
Recent studies find that off-the-shelf Graph Neural Networks (GNNs) would under-represent minor class samples.
We propose a general framework GraphSHA by Synthesizing HArder minor samples.
arXiv Detail & Related papers (2023-06-16T04:05:58Z) - NodeFormer: A Scalable Graph Structure Learning Transformer for Node
Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - UNREAL:Unlabeled Nodes Retrieval and Labeling for Heavily-imbalanced
Node Classification [17.23736166919287]
Skewed label distributions are common in real-world node classification tasks.
In this paper, we propose UNREAL, an iterative over-sampling method.
arXiv Detail & Related papers (2023-03-18T09:23:13Z) - Synthetic Over-sampling for Imbalanced Node Classification with Graph
Neural Networks [34.81248024048974]
Graph neural networks (GNNs) have achieved state-of-the-art performance for node classification.
In many real-world scenarios, node classes are imbalanced, with some majority classes making up most parts of the graph.
In this work, we seek to address this problem by generating pseudo instances of minority classes to balance the training data.
arXiv Detail & Related papers (2022-06-10T19:47:05Z) - GraFN: Semi-Supervised Node Classification on Graph with Few Labels via
Non-Parametric Distribution Assignment [5.879936787990759]
We propose a novel semi-supervised method for graphs, GraFN, to ensure nodes that belong to the same class to be grouped together.
GraFN randomly samples support nodes from labeled nodes and anchor nodes from the entire graph.
We experimentally show that GraFN surpasses both the semi-supervised and self-supervised methods in terms of node classification on real-world graphs.
arXiv Detail & Related papers (2022-04-04T08:22:30Z) - Graph Neural Network with Curriculum Learning for Imbalanced Node
Classification [21.085314408929058]
Graph Neural Network (GNN) is an emerging technique for graph-based learning tasks such as node classification.
In this work, we reveal the vulnerability of GNN to the imbalance of node labels.
We propose a novel graph neural network framework with curriculum learning (GNN-CL) consisting of two modules.
arXiv Detail & Related papers (2022-02-05T10:46:11Z) - GraphSMOTE: Imbalanced Node Classification on Graphs with Graph Neural
Networks [28.92347073786722]
Graph neural networks (GNNs) have achieved state-of-the-art performance of node classification.
We propose a novel framework, GraphSMOTE, in which an embedding space is constructed to encode the similarity among the nodes.
New samples are synthesized in this space to assure genuineness.
arXiv Detail & Related papers (2021-03-16T03:23:55Z) - Sequential Graph Convolutional Network for Active Learning [53.99104862192055]
We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN).
With a small number of randomly sampled images as seed labelled examples, we learn the parameters of the graph to distinguish labelled vs unlabelled nodes.
We exploit these characteristics of GCN to select the unlabelled examples which are sufficiently different from labelled ones.
arXiv Detail & Related papers (2020-06-18T00:55:10Z) - Towards Deeper Graph Neural Networks with Differentiable Group
Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN).
arXiv Detail & Related papers (2020-06-12T07:18:02Z)