Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark
- URL: http://arxiv.org/abs/2506.12468v2
- Date: Tue, 17 Jun 2025 03:17:11 GMT
- Title: Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark
- Authors: Suyeon Kim, SeongKu Kang, Dongwoo Kim, Jungseul Ok, Hwanjo Yu
- Abstract summary: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification tasks but struggle with label noise in real-world data. We introduce BeGIN, a new benchmark that provides realistic graph datasets with various noise types. By comprehensively evaluating noise-handling strategies, BeGIN provides insights into their effectiveness, efficiency, and key performance factors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification tasks but struggle with label noise in real-world data. Existing studies on graph learning with label noise commonly rely on class-dependent label noise, overlooking the complexities of instance-dependent noise and falling short of capturing real-world corruption patterns. We introduce BeGIN (Benchmarking for Graphs with Instance-dependent Noise), a new benchmark that provides realistic graph datasets with various noise types and comprehensively evaluates noise-handling strategies across GNN architectures, noisy label detection, and noise-robust learning. To simulate instance-dependent corruptions, BeGIN introduces algorithmic methods and LLM-based simulations. Our experiments reveal the challenges of instance-dependent noise, particularly LLM-based corruption, and underscore the importance of node-specific parameterization to enhance GNN robustness. By comprehensively evaluating noise-handling strategies, BeGIN provides insights into their effectiveness, efficiency, and key performance factors. We expect that BeGIN will serve as a valuable resource for advancing research on label noise in graphs and fostering the development of robust GNN training methods. The code is available at https://github.com/kimsu55/BeGIN.
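To make "instance-dependent corruption" concrete, here is a minimal, hypothetical sketch of algorithmic instance-dependent label flipping, where each node's flip probability depends on its own features rather than only on its class. This is not BeGIN's actual procedure; the function name and the feature-norm heuristic are illustrative assumptions.

```python
import numpy as np

def instance_dependent_flip(features, labels, num_classes, noise_rate=0.2, seed=0):
    """Illustrative sketch: flip each node's label with a probability that
    depends on its feature vector (here, its feature norm), so corruption
    varies per instance instead of being uniform within a class."""
    rng = np.random.default_rng(seed)
    # Per-node flip probability, scaled so the average rate equals noise_rate.
    scores = np.linalg.norm(features, axis=1)
    probs = np.clip(noise_rate * scores / scores.mean(), 0.0, 1.0)
    noisy = labels.copy()
    for i, (y, p) in enumerate(zip(labels, probs)):
        if rng.random() < p:
            # Replace with a different class chosen uniformly at random.
            choices = [c for c in range(num_classes) if c != y]
            noisy[i] = rng.choice(choices)
    return noisy
```

Class-dependent noise, by contrast, would use a single confusion matrix shared by all nodes of a class; the per-node `probs` vector is what makes this corruption instance-dependent. BeGIN additionally uses LLM-based simulations, which this stand-in does not attempt to model.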
Related papers
- Learn Beneficial Noise as Graph Augmentation [54.44813218411879]
We propose PiNGDA, where positive-incentive noise (pi-noise) analyzes the beneficial effect of noise from an information-theoretic perspective. We prove that standard GCL with pre-defined augmentations is equivalent to estimating the beneficial noise via point estimation. Since the generator learns how to produce beneficial perturbations on graph topology and node attributes, PiNGDA is more reliable than existing methods.
arXiv Detail & Related papers (2025-05-25T08:20:34Z)
- Training Robust Graph Neural Networks by Modeling Noise Dependencies [28.1151026795484]
In real-world applications, node features in graphs often contain noise from various sources, leading to significant performance degradation. We introduce a more realistic noise scenario, dependency-aware noise on graphs (DANG), where noise in node features creates a chain of noise dependencies that propagates to the graph structure and node labels. We propose a novel robust GNN, DA-GNN, which captures the causal relationships among variables in the data generating process (DGP) of DANG using variational inference.
arXiv Detail & Related papers (2025-02-27T01:30:13Z)
- NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise [21.65452861777135]
Graph Neural Networks (GNNs) exhibit strong potential in node classification tasks through a message-passing mechanism.
Label noise is common in real-world graph data, negatively impacting GNNs by propagating incorrect information during training.
We introduce NoisyGL, the first comprehensive benchmark for graph neural networks under label noise.
arXiv Detail & Related papers (2024-06-06T17:45:00Z)
- ROG$_{PL}$: Robust Open-Set Graph Learning via Region-Based Prototype Learning [52.60434474638983]
We propose a unified framework named ROG$_PL$ to achieve robust open-set learning on complex noisy graph data.
The framework consists of two modules, i.e., denoising via label propagation and open-set prototype learning via regions.
To the best of our knowledge, the proposed ROG$_PL$ is the first robust open-set node classification method for graph data with complex noise.
arXiv Detail & Related papers (2024-02-28T17:25:06Z)
- ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method can outperform multiple baselines with clear margins in broad noise levels and enjoy great scalability.
arXiv Detail & Related papers (2023-12-13T17:59:07Z)
- Combating Bilateral Edge Noise for Robust Link Prediction [56.43882298843564]
We propose an information-theory-guided principle, Robust Graph Information Bottleneck (RGIB), to extract reliable supervision signals and avoid representation collapse.
Two instantiations, RGIB-SSL and RGIB-REP, are explored to leverage the merits of different methodologies.
Experiments on six datasets and three GNNs with diverse noisy scenarios verify the effectiveness of our RGIB instantiations.
arXiv Detail & Related papers (2023-11-02T12:47:49Z)
- Feature Noise Boosts DNN Generalization under Label Noise [65.36889005555669]
The presence of label noise in the training data has a profound impact on the generalization of deep neural networks (DNNs).
In this study, we introduce and theoretically demonstrate a simple feature noise method, which directly adds noise to the features of training data.
arXiv Detail & Related papers (2023-08-03T08:31:31Z)
- Robust Training of Graph Neural Networks via Noise Governance [27.767913371777247]
Graph Neural Networks (GNNs) have become widely-used models for semi-supervised learning.
In this paper, we consider an important yet challenging scenario where labels on nodes of graphs are not only noisy but also scarce.
We propose a novel RTGNN framework that achieves better robustness by learning to explicitly govern label noise.
arXiv Detail & Related papers (2022-11-12T09:25:32Z)
- NRGNN: Learning a Label Noise-Resistant Graph Neural Network on Sparsely and Noisily Labeled Graphs [20.470934944907608]
Graph Neural Networks (GNNs) have achieved promising results for semi-supervised learning tasks on graphs such as node classification.
Many real-world graphs are often sparsely and noisily labeled, which could significantly degrade the performance of GNNs.
We propose to develop a label noise-resistant GNN for semi-supervised node classification.
arXiv Detail & Related papers (2021-06-08T22:12:44Z)
- Unified Robust Training for Graph Neural Networks against Label Noise [12.014301020294154]
We propose a new framework, UnionNET, for learning with noisy labels on graphs under a semi-supervised setting.
Our approach provides a unified solution for robustly training GNNs and performing label correction simultaneously.
arXiv Detail & Related papers (2021-03-05T01:17:04Z) - Tackling Instance-Dependent Label Noise via a Universal Probabilistic
Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.