Open-World Semi-Supervised Learning for Node Classification
- URL: http://arxiv.org/abs/2403.11483v1
- Date: Mon, 18 Mar 2024 05:12:54 GMT
- Title: Open-World Semi-Supervised Learning for Node Classification
- Authors: Yanling Wang, Jing Zhang, Lingxi Zhang, Lixin Liu, Yuxiao Dong, Cuiping Li, Hong Chen, Hongzhi Yin
- Abstract summary: Open-world semi-supervised learning (Open-world SSL) for node classification is a practical but under-explored problem in the graph community.
We propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification.
- Score: 53.07866559269709
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-world semi-supervised learning (Open-world SSL) for node classification, which classifies unlabeled nodes into seen classes or multiple novel classes, is a practical but under-explored problem in the graph community. As only seen classes have human labels, they are usually better learned than novel classes, and thus exhibit smaller intra-class variances within the embedding space (referred to as the imbalance of intra-class variances between seen and novel classes). Based on empirical and theoretical analysis, we find that the variance imbalance can negatively impact model performance. Pre-trained feature encoders can alleviate this issue by producing compact representations for novel classes. However, creating general pre-trained encoders for various types of graph data has proven to be challenging. As such, there is a demand for an effective method that does not rely on pre-trained graph encoders. In this paper, we propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification, which trains the node classification model from scratch via contrastive learning with bias-reduced pseudo labels. Extensive experiments on seven popular graph benchmarks demonstrate the effectiveness of OpenIMA, and the source code is available on GitHub.
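The variance imbalance described in the abstract can be illustrated with a small sketch. This is not the OpenIMA code or the paper's formal definition: the function names `intra_class_variance` and `variance_imbalance_ratio`, the toy embeddings, and the choice of mean squared distance to the class centroid as the variance measure are all assumptions made for illustration.

```python
from collections import defaultdict


def intra_class_variance(embeddings, labels):
    """Per class: mean squared Euclidean distance of embeddings to the class centroid.

    Illustrative measure only (an assumption), not the paper's definition.
    """
    groups = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        groups[lab].append(vec)

    variances = {}
    for lab, vecs in groups.items():
        dim = len(vecs[0])
        centroid = [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
        variances[lab] = sum(
            sum((v[d] - centroid[d]) ** 2 for d in range(dim)) for v in vecs
        ) / len(vecs)
    return variances


def variance_imbalance_ratio(variances, seen_classes):
    """Ratio of the average novel-class variance to the average seen-class variance.

    Assumes at least one seen and one novel class, and a nonzero seen-class average.
    """
    seen = [v for c, v in variances.items() if c in seen_classes]
    novel = [v for c, v in variances.items() if c not in seen_classes]
    return (sum(novel) / len(novel)) / (sum(seen) / len(seen))


# Hypothetical 2-D embeddings: the "seen" class is tighter than the "novel" class.
embeddings = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels = ["seen", "seen", "novel", "novel"]

variances = intra_class_variance(embeddings, labels)
ratio = variance_imbalance_ratio(variances, seen_classes={"seen"})
print(variances, ratio)
```

A ratio well above 1 on such a measure would correspond to the situation the abstract describes, where novel classes are spread more widely in the embedding space than seen classes.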
Related papers
- POWN: Prototypical Open-World Node Classification [6.704529554100875]
We consider the problem of true open-world semi-supervised node classification.
Existing methods detect and reject new classes but fail to distinguish between different new classes.
We introduce a novel end-to-end approach for classification into known classes and new classes based on class prototypes.
arXiv Detail & Related papers (2024-06-14T11:14:01Z)
- $\mathcal{G}^2Pxy$: Generative Open-Set Node Classification on Graphs with Proxy Unknowns [35.976426549671075]
We propose a novel generative open-set node classification method, i.e. $\mathcal{G}^2Pxy$.
It follows a stricter inductive learning setting where no information about unknown classes is available during training and validation.
$\mathcal{G}^2Pxy$ achieves superior effectiveness for unknown class detection and known class classification.
arXiv Detail & Related papers (2023-08-10T09:42:20Z)
- GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification [64.85392028383164]
Class imbalance is the phenomenon that some classes have much fewer instances than others.
Recent studies find that off-the-shelf Graph Neural Networks (GNNs) would under-represent minor class samples.
We propose a general framework GraphSHA by Synthesizing HArder minor samples.
arXiv Detail & Related papers (2023-06-16T04:05:58Z)
- Towards Semi-supervised Universal Graph Classification [6.339931887475018]
We study the problem of semi-supervised universal graph classification.
This problem is challenging due to a severe lack of labels and potential class shifts.
We propose a novel graph neural network framework named UGNN, which makes the best of unlabeled data from the subgraph perspective.
arXiv Detail & Related papers (2023-05-31T06:58:34Z)
- Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification [56.17097897754628]
We show that transductive linear probing with self-supervised graph contrastive pretraining can outperform the state-of-the-art fully supervised meta-learning based methods under the same protocol.
We hope this work can shed new light on few-shot node classification problems and foster future research on learning from scarcely labeled instances on graphs.
arXiv Detail & Related papers (2022-12-11T21:10:34Z)
- Geometer: Graph Few-Shot Class-Incremental Learning via Prototype Representation [50.772432242082914]
Existing graph neural network based methods mainly focus on classifying unlabeled nodes within fixed classes with abundant labeling.
In this paper, we focus on this challenging but practical graph few-shot class-incremental learning (GFSCIL) problem and propose a novel method called Geometer.
Instead of replacing and retraining the fully connected neural network classifier, Geometer predicts the label of a node by finding the nearest class prototype.
arXiv Detail & Related papers (2022-05-27T13:02:07Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
- GraphMixup: Improving Class-Imbalanced Node Classification on Graphs by Self-supervised Context Prediction [25.679620842010422]
This paper presents GraphMixup, a novel mixup-based framework for improving class-imbalanced node classification on graphs.
We develop a Reinforcement Mixup mechanism to adaptively determine how many samples are to be generated by mixup for those minority classes.
Experiments on three real-world datasets show that GraphMixup yields truly encouraging results for class-imbalanced node classification tasks.
arXiv Detail & Related papers (2021-06-21T14:12:16Z)
- Graph Classification by Mixture of Diverse Experts [67.33716357951235]
We present GraphDIVE, a framework leveraging mixture of diverse experts for imbalanced graph classification.
With a divide-and-conquer principle, GraphDIVE employs a gating network to partition an imbalanced graph dataset into several subsets.
Experiments on real-world imbalanced graph datasets demonstrate the effectiveness of GraphDIVE.
arXiv Detail & Related papers (2021-03-29T14:03:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.