Topology-Imbalance Learning for Semi-Supervised Node Classification
- URL: http://arxiv.org/abs/2110.04099v1
- Date: Fri, 8 Oct 2021 12:57:38 GMT
- Title: Topology-Imbalance Learning for Semi-Supervised Node Classification
- Authors: Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun
- Abstract summary: We argue that graph data expose a unique source of imbalance from the asymmetric topological properties of the labeled nodes.
We devise Totoro, a metric based on influence conflict detection, to measure the degree of graph topology imbalance.
We propose a model-agnostic method ReNode to address the topology-imbalance issue.
- Score: 34.964665078512596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The class imbalance problem, as an important issue in learning node
representations, has drawn increasing attention from the community. Although
the imbalance considered by existing studies stems from the unequal quantity of
labeled examples in different classes (quantity imbalance), we argue that graph
data expose a unique source of imbalance from the asymmetric topological
properties of the labeled nodes, i.e., labeled nodes are not equal in terms of
their structural role in the graph (topology imbalance). In this work, we first
probe the previously unknown topology-imbalance issue, including its
characteristics, causes, and threats to semi-supervised node classification
learning. We then provide a unified view for jointly analyzing the quantity- and
topology-imbalance issues by considering the node influence shift phenomenon
with the Label Propagation algorithm. In light of our analysis, we devise an
influence-conflict-detection-based metric, Totoro, to measure the degree of
graph topology imbalance and propose a model-agnostic method ReNode to address
the topology-imbalance issue by re-weighting the influence of labeled nodes
adaptively based on their relative positions to class boundaries. Systematic
experiments demonstrate the effectiveness and generalizability of our method in
relieving the topology-imbalance issue and improving semi-supervised node
classification. Further analysis reveals the varied sensitivity of different
graph neural networks (GNNs) to topology imbalance, which may serve as a new
perspective in evaluating GNN architectures.
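
The abstract describes re-weighting labeled nodes according to how strongly their influence conflicts with that of other classes. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that influence is approximated by a personalized PageRank matrix, that a labeled node's conflict is the overlap between its own influence and the mean influence of the other classes' labeled nodes, and that weights follow a cosine schedule over the conflict ranking. The function names, `alpha`, `w_min`, and `w_max` are hypothetical choices made for illustration, and every class is assumed to have at least one labeled node.

```python
import numpy as np

def ppr_matrix(adj, alpha=0.15):
    """Influence proxy (assumption): P = alpha * (I - (1 - alpha) * A_hat)^-1."""
    n = adj.shape[0]
    a = adj + np.eye(n)                                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    a_hat = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * a_hat)

def conflict_scores(ppr, labeled, labels, num_classes):
    """Overlap between each labeled node's influence and other classes' influence."""
    # class_infl[c, x]: mean influence of class-c labeled nodes on node x
    class_infl = np.stack(
        [ppr[labeled[labels == c]].mean(axis=0) for c in range(num_classes)]
    )
    total = class_infl.sum(axis=0)
    scores = np.empty(len(labeled))
    for k, (v, y) in enumerate(zip(labeled, labels)):
        other = total - class_infl[y]     # influence coming from the other classes
        scores[k] = ppr[v] @ other        # weighted by where node v itself has influence
    return scores

def cosine_rank_weights(scores, w_min=0.5, w_max=1.5):
    """Lowest-conflict node gets w_max, highest-conflict node gets w_min."""
    n = len(scores)
    if n == 1:
        return np.array([w_max])
    ranks = scores.argsort().argsort()    # rank 0 = least conflict
    return w_min + 0.5 * (w_max - w_min) * (1 + np.cos(ranks * np.pi / (n - 1)))

# Toy example: a 6-node path graph 0-1-2-3-4-5.
adj = np.zeros((6, 6))
for i in range(5):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

labeled = np.array([0, 2, 5])   # indices of labeled nodes
labels = np.array([0, 0, 1])    # their classes, aligned with `labeled`

ppr = ppr_matrix(adj)
scores = conflict_scores(ppr, labeled, labels, num_classes=2)
weights = cosine_rank_weights(scores)
```

In this toy graph, node 2 sits closer to the labeled node of the other class than node 0 does, so it receives a higher conflict score and a smaller weight. In a training loop, such weights would typically multiply the per-node loss of the labeled nodes, which keeps the scheme independent of the GNN architecture.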
Related papers
- Rethinking Semi-Supervised Imbalanced Node Classification from
Bias-Variance Decomposition [18.3055496602884]
This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data.
Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance.
arXiv Detail & Related papers (2023-10-28T17:28:07Z)
- Heterophily-Based Graph Neural Network for Imbalanced Classification [19.51668009720269]
We introduce a unique approach that tackles imbalanced classification on graphs by considering graph heterophily.
We propose Fast Im-GBK, which integrates an imbalance classification strategy with heterophily-aware GNNs.
Our experiments on real-world graphs demonstrate our model's superiority in classification performance and efficiency for node classification tasks.
arXiv Detail & Related papers (2023-10-12T21:19:47Z)
- Hyperbolic Geometric Graph Representation Learning for
Hierarchy-imbalance Node Classification [30.56321501873245]
We show that labeled training nodes with different hierarchical properties have a significant impact on node classification tasks.
We propose a novel hyperbolic geometric hierarchy-imbalance learning framework, named HyperIMBA, to alleviate the hierarchy-imbalance issue.
Extensive experimental results demonstrate the superior effectiveness of HyperIMBA for hierarchy-imbalance node classification tasks.
arXiv Detail & Related papers (2023-04-11T08:38:05Z)
- Semantic-aware Node Synthesis for Imbalanced Heterogeneous Information
Networks [51.55932524129814]
We present Semantic-aware Node Synthesis (SNS), the first method for the semantic imbalance problem in imbalanced HINs.
SNS adaptively selects the heterogeneous neighbor nodes and augments the network with synthetic nodes while preserving the minority semantics.
We also introduce two regularization approaches for HGNNs that constrain the representation of synthetic nodes from both semantic and class perspectives.
arXiv Detail & Related papers (2023-02-27T00:21:43Z)
- Position-aware Structure Learning for Graph Topology-imbalance by
Relieving Under-reaching and Over-squashing [67.83086131278904]
Topology-imbalance is a graph-specific imbalance problem caused by the uneven topology positions of labeled nodes.
We propose a novel position-aware graph structure learning framework named PASTEL.
Our key insight is to enhance the connectivity of nodes within the same class for more supervision information.
arXiv Detail & Related papers (2022-08-17T14:04:21Z)
- Generalization Guarantee of Training Graph Convolutional Networks with
Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.
To address their scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs.
This paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features
from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Explicit Pairwise Factorized Graph Neural Network for Semi-Supervised
Node Classification [59.06717774425588]
We propose the Explicit Pairwise Factorized Graph Neural Network (EPFGNN), which models the whole graph as a partially observed Markov Random Field.
It contains explicit pairwise factors to model output-output relations and uses a GNN backbone to model input-output relations.
We conduct experiments on various datasets, which show that our model can effectively improve performance on semi-supervised node classification on graphs.
arXiv Detail & Related papers (2021-07-27T19:47:53Z)
- GCN-Based Linkage Prediction for Face Clustering on Imbalanced Datasets:
An Empirical Study [5.416933126354173]
We present a new method that alleviates label imbalance and augments graph representations using a Reverse-Imbalance Weighted Sampling strategy.
The code and a series of imbalanced benchmark datasets are available at https://github.com/espectre/GCNs_on_imbalanced_datasets.
arXiv Detail & Related papers (2021-07-06T08:45:26Z)
- Unifying Graph Convolutional Neural Networks and Label Propagation [73.82013612939507]
We study the relationship between LPA and GCN in terms of two aspects: feature/label smoothing and feature/label influence.
Based on our theoretical analysis, we propose an end-to-end model that unifies GCN and LPA for node classification.
Our model can also be seen as learning attention weights based on node labels, which is more task-oriented than existing feature-based attention models.
arXiv Detail & Related papers (2020-02-17T03:23:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.