Virtual Node Generation for Node Classification in Sparsely-Labeled Graphs
- URL: http://arxiv.org/abs/2409.07712v1
- Date: Thu, 12 Sep 2024 02:36:44 GMT
- Title: Virtual Node Generation for Node Classification in Sparsely-Labeled Graphs
- Authors: Hang Cui, Tarek Abdelzaher
- Abstract summary: This paper presents a novel node generation method that infuses a small set of high-quality synthesized nodes into the graph as additional labeled nodes.
It is compatible with most popular graph pre-training (self-supervised learning), semi-supervised learning, and meta-learning methods.
Our experiments demonstrate statistically significant performance improvements over 14 baselines on 10 publicly available datasets.
- Score: 2.0060301665996016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the broader machine learning literature, data-generation methods demonstrate promising results by generating additional informative training examples that augment sparse labels. Such methods are less studied on graphs because of the intricate dependencies among nodes in complex topologies. This paper presents a novel node generation method that infuses a small set of high-quality synthesized nodes into the graph as additional labeled nodes to optimally expand the propagation of labeled information. Because it simply infuses additional nodes, the framework is orthogonal to the graph learning and downstream classification techniques, and thus is compatible with most popular graph pre-training (self-supervised learning), semi-supervised learning, and meta-learning methods. The contribution lies in designing the generated node set by solving a novel optimization problem. The optimization places the generated nodes so that they: (1) minimize the classification loss to guarantee training accuracy and (2) maximize label propagation to low-confidence nodes in the downstream task to ensure high-quality propagation. Theoretically, we show that this dual optimization maximizes the global confidence of node classification. Our experiments demonstrate statistically significant performance improvements over 14 baselines on 10 publicly available datasets.
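As a rough, illustrative sketch (not the authors' implementation), the snippet below runs plain label propagation on a small made-up graph and shows how attaching a single synthesized, labeled "virtual" node to a low-confidence region raises the confidence of propagated labels there. The toy graph, labels, and virtual-node placement are hand-picked for demonstration; in the paper, the placement is instead chosen by the dual optimization described above.

```python
import numpy as np

def propagate_labels(adj, labels, n_classes, n_iters=30, alpha=0.9):
    """Plain label propagation: Y <- alpha * (D^-1 A) Y + (1 - alpha) * Y0."""
    n = adj.shape[0]
    y0 = np.zeros((n, n_classes))
    for node, cls in labels.items():          # labels: {node_index: class_id}
        y0[node, cls] = 1.0
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                       # avoid division by zero for isolated nodes
    p = adj / deg                             # row-normalized adjacency
    y = y0.copy()
    for _ in range(n_iters):
        y = alpha * (p @ y) + (1 - alpha) * y0
    return y

# Toy chain graph 0-1-2-3-4-5; only node 0 is labeled, so propagated
# confidence decays with distance from it.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[u, v] = adj[v, u] = 1.0
scores = propagate_labels(adj, {0: 0}, n_classes=2)
print("max propagated score per node, before:", scores.max(axis=1).round(3))

# Infuse one virtual labeled node (index 6, class 1) wired into the
# low-confidence end of the chain; here the placement is hand-picked purely
# for illustration rather than optimized.
adj_aug = np.zeros((7, 7))
adj_aug[:6, :6] = adj
for v in (4, 5):
    adj_aug[6, v] = adj_aug[v, 6] = 1.0
scores_aug = propagate_labels(adj_aug, {0: 0, 6: 1}, n_classes=2)
print("max propagated score per node, after: ", scores_aug[:6].max(axis=1).round(3))
```

The example only illustrates why a well-placed labeled node improves propagation to low-confidence regions; it does not reproduce the paper's optimization problem or its classification-loss term.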
Related papers
- GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy [21.553180564868306]
GraphRARE is a framework built upon node relative entropy and deep reinforcement learning.
An innovative node relative entropy is used to measure mutual information between node pairs.
A deep reinforcement learning-based algorithm is developed to optimize the graph topology.
arXiv Detail & Related papers (2023-12-15T11:30:18Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator (sketched generically below).
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
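For context on the Gumbel-Softmax building block named above, here is a minimal, generic sketch of Gumbel-Softmax sampling (Jang et al., 2017). It is not NodeFormer's kernelized operator, and the logits are made-up values.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=0.5):
    """Differentiable approximation of sampling a discrete choice from `logits`."""
    # Sample Gumbel(0, 1) noise; the epsilons guard against log(0).
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / tau, dim=-1)

# Toy example: a soft, differentiable "choice" among three candidate neighbors.
logits = torch.tensor([[2.0, 0.5, -1.0]])
print(gumbel_softmax_sample(logits, tau=0.5))
```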
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features [46.052312251801]
We propose a framework for iterating boosting with graph propagation steps.
Our approach is anchored in a principled meta loss function.
Across a variety of non-iid graph datasets, our method achieves comparable or superior performance.
arXiv Detail & Related papers (2021-10-26T04:53:12Z)
- Uniting Heterogeneity, Inductiveness, and Efficiency for Graph Representation Learning [68.97378785686723]
Graph neural networks (GNNs) have greatly advanced the performance of node representation learning on graphs.
Most GNNs are designed only for homogeneous graphs, leading to inferior adaptivity to the more informative heterogeneous graphs.
We propose a novel inductive, meta path-free message passing scheme that packs up heterogeneous node features with their associated edges from both low- and high-order neighbor nodes.
arXiv Detail & Related papers (2021-04-04T23:31:39Z)
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all tested setups, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state of the art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
- Graph Inference Learning for Semi-supervised Classification [50.55765399527556]
We propose a Graph Inference Learning framework to boost the performance of semi-supervised node classification.
For learning the inference process, we introduce meta-optimization on structure relations from training nodes to validation nodes.
Comprehensive evaluations on four benchmark datasets demonstrate the superiority of our proposed GIL when compared against state-of-the-art methods.
arXiv Detail & Related papers (2020-01-17T02:52:30Z)