Graph-Based Uncertainty-Aware Self-Training with Stochastic Node Labeling
- URL: http://arxiv.org/abs/2503.22745v1
- Date: Wed, 26 Mar 2025 21:54:19 GMT
- Title: Graph-Based Uncertainty-Aware Self-Training with Stochastic Node Labeling
- Authors: Tom Liu, Anna Wu, Chao Li
- Abstract summary: We propose a novel \emph{graph-based uncertainty-aware self-training} (GUST) framework to combat over-confidence in node classification. Our method largely diverges from previous self-training approaches by focusing on \emph{stochastic node labeling} grounded in the graph topology. Experimental results on several benchmark graph datasets demonstrate that our GUST framework achieves state-of-the-art performance.
- Score: 2.600103729157093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-training has become a popular semi-supervised learning technique for leveraging unlabeled data. However, the over-confidence of pseudo-labels remains a key challenge. In this paper, we propose a novel \emph{graph-based uncertainty-aware self-training} (GUST) framework to combat over-confidence in node classification. Drawing inspiration from the uncertainty integration idea introduced by Wang \emph{et al.}~\cite{wang2024uncertainty}, our method largely diverges from previous self-training approaches by focusing on \emph{stochastic node labeling} grounded in the graph topology. Specifically, we deploy a Bayesian-inspired module to estimate node-level uncertainty, incorporate these estimates into the pseudo-label generation process via an expectation-maximization (EM)-like step, and iteratively update both node embeddings and adjacency-based transformations. Experimental results on several benchmark graph datasets demonstrate that our GUST framework achieves state-of-the-art performance, especially in settings where labeled data is extremely sparse.
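The abstract describes a loop of Bayesian-inspired uncertainty estimation, stochastic pseudo-label generation over the graph topology, and iterative retraining. The following is a minimal NumPy sketch of the two core steps only; the noise-based uncertainty stand-in, the smoothing weights, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_uncertainty(logits_fn, X, n_samples=20):
    """Bayesian-inspired node-level uncertainty: average class probabilities
    over stochastic forward passes (input noise stands in for dropout)."""
    probs = []
    for _ in range(n_samples):
        z = logits_fn(X + rng.normal(scale=0.1, size=X.shape))
        p = np.exp(z - z.max(axis=1, keepdims=True))       # stable softmax
        probs.append(p / p.sum(axis=1, keepdims=True))
    mean_p = np.stack(probs).mean(axis=0)
    # predictive entropy as the per-node uncertainty score
    uncertainty = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)
    return mean_p, uncertainty

def stochastic_pseudo_labels(mean_p, uncertainty, adj, keep_frac=0.7):
    """EM-like E-step: smooth class beliefs over the graph topology, then
    *sample* labels instead of taking a hard argmax, and retain only the
    lower-uncertainty nodes for the next retraining round."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    smoothed = 0.5 * mean_p + 0.5 * (adj @ mean_p) / deg   # topology-grounded
    labels = np.array([rng.choice(smoothed.shape[1], p=row / row.sum())
                       for row in smoothed])
    keep = uncertainty <= np.quantile(uncertainty, keep_frac)
    return labels, keep
```

Sampling labels from the smoothed distribution, rather than committing to the most confident class, is what keeps over-confident predictions from dominating the pseudo-label set.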
Related papers
- Uncertainty-Aware Graph Self-Training with Expectation-Maximization Regularization [2.743479615751918]
We propose a novel \emph{uncertainty-aware} graph self-training approach for semi-supervised node classification. Our method incorporates an uncertainty mechanism during pseudo-label generation and model retraining. Our framework is designed to handle noisy graph structures and feature spaces more effectively.
arXiv Detail & Related papers (2025-03-26T21:52:21Z)
- Uncertainty-aware self-training with expectation maximization basis transformation [9.7527450662978]
We propose a new self-training framework to combine uncertainty information of both model and dataset.
Specifically, we propose to use Expectation-Maximization (EM) to smooth the labels and comprehensively estimate the uncertainty information.
arXiv Detail & Related papers (2024-05-02T11:01:31Z)
- Distribution Consistency based Self-Training for Graph Neural Networks with Sparse Labels [33.89511660654271]
Few-shot node classification poses a significant challenge for Graph Neural Networks (GNNs).
Self-training has emerged as a widely popular framework to leverage the abundance of unlabeled data.
We propose a novel Distribution-Consistent Graph Self-Training framework to identify pseudo-labeled nodes that are both informative and capable of redeeming the distribution discrepancy.
arXiv Detail & Related papers (2024-01-18T22:07:48Z)
- CONVERT: Contrastive Graph Clustering with Reliable Augmentation [110.46658439733106]
We propose a novel CONtrastiVe Graph ClustEring network with Reliable AugmenTation (CONVERT).
In our method, the data augmentations are processed by the proposed reversible perturb-recover network.
To further guarantee the reliability of semantics, a novel semantic loss is presented to constrain the network.
arXiv Detail & Related papers (2023-08-17T13:07:09Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification [56.17097897754628]
We show that transductive linear probing with self-supervised graph contrastive pretraining can outperform the state-of-the-art fully supervised meta-learning based methods under the same protocol.
We hope this work can shed new light on few-shot node classification problems and foster future research on learning from scarcely labeled instances on graphs.
arXiv Detail & Related papers (2022-12-11T21:10:34Z)
- Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z)
- Confidence May Cheat: Self-Training on Graph Neural Networks under Distribution Shift [39.73304203101909]
Self-training methods have been widely adopted on graphs by labeling high-confidence unlabeled nodes and then adding them to the training step.
We propose a novel Distribution Recovered Graph Self-Training framework (DR-GST), which can recover the distribution of the original labeled dataset.
Both our theoretical analysis and extensive experiments on five benchmark datasets demonstrate the effectiveness of the proposed DR-GST.
arXiv Detail & Related papers (2022-01-27T07:12:27Z)
- Informative Pseudo-Labeling for Graph Neural Networks with Few Labels [12.83841767562179]
Graph Neural Networks (GNNs) have achieved state-of-the-art results for semi-supervised node classification on graphs.
How to effectively learn GNNs with very few labels remains under-explored.
We propose a novel informative pseudo-labeling framework, called InfoGNN, to facilitate learning of GNNs with extremely few labels.
arXiv Detail & Related papers (2022-01-20T01:49:30Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Two-phase Pseudo Label Densification for Self-training based Domain Adaptation [93.03265290594278]
We propose a novel Two-phase Pseudo Label Densification framework, referred to as TPLD.
In the first phase, we use sliding window voting to propagate the confident predictions, utilizing intrinsic spatial-correlations in the images.
In the second phase, we perform a confidence-based easy-hard classification.
To ease the training process and avoid noisy predictions, we introduce the bootstrapping mechanism to the original self-training loss.
arXiv Detail & Related papers (2020-12-09T02:35:25Z)
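The first TPLD phase summarized above, propagating confident predictions via sliding-window voting, can be illustrated with a small sketch; the window size, confidence threshold, and function name are illustrative assumptions rather than the paper's actual implementation.

```python
import numpy as np

def sliding_window_vote(pred, conf, window=3, conf_thresh=0.9):
    """Densify sparse confident predictions: each low-confidence pixel takes
    the majority vote of confident neighbours inside a sliding window,
    exploiting intrinsic spatial correlations in the prediction map."""
    h, w = pred.shape
    r = window // 2
    dense = pred.copy()
    for i in range(h):
        for j in range(w):
            if conf[i, j] >= conf_thresh:
                continue  # already confident; keep its own prediction
            yi = slice(max(i - r, 0), i + r + 1)
            yj = slice(max(j - r, 0), j + r + 1)
            votes = pred[yi, yj][conf[yi, yj] >= conf_thresh]
            if votes.size:  # majority class among confident neighbours
                vals, counts = np.unique(votes, return_counts=True)
                dense[i, j] = vals[counts.argmax()]
    return dense
```

The second phase (confidence-based easy-hard classification) would then operate on the densified map produced here.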
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.