Rethinking Semi-Supervised Imbalanced Node Classification from
Bias-Variance Decomposition
- URL: http://arxiv.org/abs/2310.18765v3
- Date: Mon, 5 Feb 2024 16:37:42 GMT
- Title: Rethinking Semi-Supervised Imbalanced Node Classification from
Bias-Variance Decomposition
- Authors: Divin Yan, Gengchen Wei, Chen Yang, Shengzhong Zhang, Zengfeng Huang
- Abstract summary: This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data.
Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance.
- Score: 18.3055496602884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a new approach to address the issue of class imbalance
in graph neural networks (GNNs) for learning on graph-structured data. Our
approach integrates imbalanced node classification and Bias-Variance
Decomposition, establishing a theoretical framework that closely relates data
imbalance to model variance. We also leverage a graph augmentation technique to
estimate the variance and design a regularization term to alleviate the impact
of imbalance. Extensive experiments are conducted on multiple benchmarks, including
naturally imbalanced datasets and public-split class-imbalanced datasets,
demonstrating that our approach outperforms state-of-the-art methods in various
imbalanced scenarios. This work provides a novel theoretical perspective for
addressing the problem of imbalanced node classification in GNNs.
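The abstract's core idea (estimate model variance from augmented views of the graph, then penalize it) can be sketched in a few lines. This is a minimal illustration, not the paper's exact loss: the function name, the tensor shapes, and the choice of per-node softmax variance across K augmented views are all assumptions made for the example.

```python
import numpy as np

def variance_regularizer(view_logits):
    """Hypothetical variance-based regularizer.

    view_logits: array of shape (K, N, C) -- class logits for N nodes
    and C classes, produced by running the GNN on K augmented views
    of the same graph. Returns the per-node prediction variance across
    views, averaged over nodes and classes; a training loop would add
    this (times a weight) to the classification loss.
    """
    # Numerically stable softmax over the class dimension of each view.
    z = view_logits - view_logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # Variance across the K views, averaged over nodes and classes.
    return probs.var(axis=0).mean()

# Identical views give a zero penalty; disagreeing views are penalized.
same = np.stack([np.ones((4, 3)), np.ones((4, 3))])
print(variance_regularizer(same))  # 0.0
```

Minority-class nodes, having little label support, tend to produce the most unstable predictions under augmentation, which is why penalizing this variance targets the imbalance.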
Related papers
- Skew Probabilistic Neural Networks for Learning from Imbalanced Data [3.7892198600060945]
This paper introduces an imbalanced data-oriented approach using probabilistic neural networks (PNNs) with a skew normal probability kernel.
We show that SkewPNNs substantially outperform state-of-the-art machine learning methods for both balanced and imbalanced datasets in most experimental settings.
arXiv Detail & Related papers (2023-12-10T13:12:55Z)
- Heterophily-Based Graph Neural Network for Imbalanced Classification [19.51668009720269]
We introduce a unique approach that tackles imbalanced classification on graphs by considering graph heterophily.
We propose Fast Im-GBK, which integrates an imbalance classification strategy with heterophily-aware GNNs.
Our experiments on real-world graphs demonstrate our model's superiority in classification performance and efficiency for node classification tasks.
arXiv Detail & Related papers (2023-10-12T21:19:47Z)
- Class-Imbalanced Graph Learning without Class Rebalancing [62.1368829847041]
Class imbalance is prevalent in real-world node classification tasks and poses great challenges for graph learning models.
In this work, we approach the root cause of class-imbalance bias from a topological paradigm.
We devise a lightweight topological augmentation framework BAT to mitigate the class-imbalance bias without class rebalancing.
arXiv Detail & Related papers (2023-08-27T19:01:29Z)
- Position-aware Structure Learning for Graph Topology-imbalance by Relieving Under-reaching and Over-squashing [67.83086131278904]
Topology-imbalance is a graph-specific imbalance problem caused by the uneven topology positions of labeled nodes.
We propose a novel position-aware graph structure learning framework named PASTEL.
Our key insight is to enhance the connectivity of nodes within the same class for more supervision information.
arXiv Detail & Related papers (2022-08-17T14:04:21Z)
- Phased Progressive Learning with Coupling-Regulation-Imbalance Loss for Imbalanced Classification [11.673344551762822]
Deep neural networks generally perform poorly with datasets that suffer from quantity imbalance and classification difficulty imbalance between different classes.
We propose a phased progressive learning schedule for smoothly transferring the training emphasis from representation learning to upper classifier training.
Our code will be open source soon.
arXiv Detail & Related papers (2022-05-24T14:46:39Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Topology-Imbalance Learning for Semi-Supervised Node Classification [34.964665078512596]
We argue that graph data expose a unique source of imbalance from the asymmetric topological properties of the labeled nodes.
We devise an influence conflict detection-based metric, Totoro, to measure the degree of graph topology imbalance.
We propose a model-agnostic method ReNode to address the topology-imbalance issue.
arXiv Detail & Related papers (2021-10-08T12:57:38Z)
- GCN-Based Linkage Prediction for Face Clustering on Imbalanced Datasets: An Empirical Study [5.416933126354173]
We present a new method to alleviate the imbalanced labels and also augment graph representations using a Reverse-Imbalance Weighted Sampling strategy.
The code and a series of imbalanced benchmark datasets are available at https://github.com/espectre/GCNs_on_imbalanced_datasets.
arXiv Detail & Related papers (2021-07-06T08:45:26Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers.
Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.