When Do Graph Neural Networks Help with Node Classification?
Investigating the Impact of Homophily Principle on Node Distinguishability
- URL: http://arxiv.org/abs/2304.14274v4
- Date: Mon, 1 Jan 2024 22:49:19 GMT
- Title: When Do Graph Neural Networks Help with Node Classification?
Investigating the Impact of Homophily Principle on Node Distinguishability
- Authors: Sitao Luan, Chenqing Hua, Minkai Xu, Qincheng Lu, Jiaqi Zhu, Xiao-Wen
Chang, Jie Fu, Jure Leskovec, Doina Precup
- Abstract summary: The homophily principle has been believed to be the main reason for the performance superiority of Graph Neural Networks (GNNs) over Neural Networks on node classification tasks.
Recent research suggests that, even in the absence of homophily, the advantage of GNNs still exists as long as nodes from the same class share similar neighborhood patterns.
- Score: 92.8279562472538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The homophily principle, i.e., that nodes with the same labels are more likely to be connected, has been believed to be the main reason for the performance
superiority of Graph Neural Networks (GNNs) over Neural Networks on node
classification tasks. Recent research suggests that, even in the absence of
homophily, the advantage of GNNs still exists as long as nodes from the same
class share similar neighborhood patterns. However, this argument only
considers intra-class Node Distinguishability (ND) but neglects inter-class ND,
which yields an incomplete understanding of the effect of homophily on GNNs. In this paper, we
first demonstrate such deficiency with examples and argue that an ideal
situation for ND is to have smaller intra-class ND than inter-class ND. To
formulate this idea and study ND deeply, we propose Contextual Stochastic Block
Model for Homophily (CSBM-H) and define two metrics, Probabilistic Bayes Error
(PBE) and negative generalized Jeffreys divergence, to quantify ND. With the
metrics, we visualize and analyze how graph filters, node degree distributions
and class variances influence ND, and investigate the combined effect of intra-
and inter-class ND. In addition, we discover the mid-homophily pitfall, which occurs widely in graph datasets. Furthermore, we verify that, in real-world
tasks, the superiority of GNNs is indeed closely related to both intra- and
inter-class ND regardless of homophily levels. Grounded in this observation, we
propose a new hypothesis-testing based performance metric beyond homophily,
which is non-linear, feature-based, and can provide a statistical threshold for the superiority of GNNs. Experiments indicate that it is significantly more effective than the existing homophily metrics at revealing the advantages and disadvantages of graph-aware models on both synthetic and real-world benchmark
datasets.
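The two ND metrics lend themselves to a small numerical illustration. Below is a minimal sketch, assuming a one-dimensional two-class Gaussian model in the spirit of CSBM-H; the Monte Carlo estimator of PBE and the symmetric-KL stand-in for the negative generalized Jeffreys divergence are illustrative simplifications, not the paper's exact definitions.

```python
import numpy as np
from scipy.stats import norm

def probabilistic_bayes_error(mu0, s0, mu1, s1, p0=0.5, n=200_000, seed=0):
    # Monte Carlo estimate of the Bayes error for a two-class 1-D
    # Gaussian mixture: E_x[min(P(y=0|x), P(y=1|x))]. Lower PBE means
    # the two classes (here: filtered node features) are easier to tell apart.
    rng = np.random.default_rng(seed)
    is0 = rng.random(n) < p0
    x = np.where(is0, rng.normal(mu0, s0, n), rng.normal(mu1, s1, n))
    f0 = p0 * norm.pdf(x, mu0, s0)
    f1 = (1 - p0) * norm.pdf(x, mu1, s1)
    return float((np.minimum(f0, f1) / (f0 + f1)).mean())

def neg_jeffreys(mu0, s0, mu1, s1):
    # Negative symmetric KL divergence between the class-conditional
    # Gaussians; values closer to 0 mean more class overlap (worse ND).
    kl01 = np.log(s1 / s0) + (s0**2 + (mu0 - mu1)**2) / (2 * s1**2) - 0.5
    kl10 = np.log(s0 / s1) + (s1**2 + (mu1 - mu0)**2) / (2 * s0**2) - 0.5
    return float(-(kl01 + kl10) / 2)

# Well-separated classes give low PBE; overlapping classes approach 0.5.
print(probabilistic_bayes_error(0.0, 1.0, 3.0, 1.0))   # ~0.067
print(probabilistic_bayes_error(0.0, 1.0, 0.5, 1.0))   # ~0.40
```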
Related papers
- Mitigating Degree Bias in Signed Graph Neural Networks [5.042342963087923]
Signed Graph Neural Networks (SGNNs) face fairness issues arising both from the source data and from the typical aggregation method.
In this paper, we pioneer the extension of fairness investigations from GNNs to SGNNs.
We identify the issue of degree bias within signed graphs, offering a new perspective on the fairness issues related to SGNNs.
arXiv Detail & Related papers (2024-08-16T03:22:18Z)
- The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs [59.03660013787925]
We introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs.
Our observations show that our framework acts as a versatile operator for diverse tasks.
It can be integrated into various GNN frameworks, boosting performance in depth and offering an explainable approach to choosing the optimal network depth.
arXiv Detail & Related papers (2024-06-18T12:16:00Z)
- Breaking the Entanglement of Homophily and Heterophily in Semi-supervised Node Classification [25.831508778029097]
We introduce AMUD, which quantifies the relationship between node profiles and topology from a statistical perspective.
We also propose ADPA as a new directed graph learning paradigm for AMUD.
arXiv Detail & Related papers (2023-12-07T07:54:11Z)
- Revisiting Heterophily For Graph Neural Networks [42.41238892727136]
Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using graph structures based on the relational inductive bias (homophily assumption).
Recent work has identified a non-trivial set of datasets where their performance compared to NNs is not satisfactory.
arXiv Detail & Related papers (2022-10-14T08:00:26Z)
- Exploiting Neighbor Effect: Conv-Agnostic GNNs Framework for Graphs with Heterophily [58.76759997223951]
We propose a new metric based on von Neumann entropy to re-examine the heterophily problem of GNNs.
We also propose a Conv-Agnostic GNN framework (CAGNNs) to enhance the performance of most GNNs on heterophily datasets.
arXiv Detail & Related papers (2022-03-19T14:26:43Z)
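The von Neumann entropy referenced in the CAGNN entry above has a standard graph-theoretic form: treat the trace-normalized Laplacian as a density matrix and take its spectral entropy. The sketch below implements that textbook quantity; the paper's exact heterophily metric may be defined differently.

```python
import numpy as np

def von_neumann_entropy(adj: np.ndarray) -> float:
    # Treat the trace-normalized graph Laplacian as a density matrix
    # rho (PSD, trace 1) and return -sum_i lambda_i * log(lambda_i).
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    rho = lap / np.trace(lap)
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]      # convention: 0 * log 0 = 0
    return float(-(evals * np.log(evals)).sum())

# A 4-cycle as a toy example (symmetric 0/1 adjacency matrix).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(von_neumann_entropy(A))
```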
- Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification? [44.71818395535755]
Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using graph structures based on the inductive bias (homophily assumption).
The performance advantages of GNNs over graph-agnostic NNs do not seem generally satisfactory.
Heterophily has been considered as a main cause and numerous works have been put forward to address it.
arXiv Detail & Related papers (2021-09-12T23:57:05Z)
- Is Homophily a Necessity for Graph Neural Networks? [50.959340355849896]
Graph neural networks (GNNs) have shown great prowess in learning representations suitable for numerous graph-based machine learning tasks.
GNNs are widely believed to work well due to the homophily assumption ("like attracts like"), and fail to generalize to heterophilous graphs where dissimilar nodes connect.
Recent works design new architectures to overcome such heterophily-related limitations, citing poor baseline performance and new architecture improvements on a few heterophilous graph benchmark datasets as evidence for this notion.
In our experiments, we empirically find that standard graph convolutional networks (GCNs) can actually achieve better performance than such carefully designed methods on some heterophilous graphs.
arXiv Detail & Related papers (2021-06-11T02:44:00Z)
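Homophily levels come up throughout this list, so as a point of reference, here is a minimal sketch of the standard edge-homophily ratio. It is the most common such measure; the papers above may each use their own variants.

```python
import numpy as np

def edge_homophily(edge_index: np.ndarray, labels: np.ndarray) -> float:
    # Fraction of edges whose endpoints share a label: h near 1 means
    # a homophilic graph, h near 0 a heterophilic one.
    u, v = edge_index[:, 0], edge_index[:, 1]
    return float((labels[u] == labels[v]).mean())

# Toy example: 4 nodes, labels [0, 0, 1, 1], three edges.
edges = np.array([[0, 1], [1, 2], [2, 3]])
labels = np.array([0, 0, 1, 1])
print(edge_homophily(edges, labels))   # 2 of 3 edges match -> 0.666...
```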
- NCGNN: Node-level Capsule Graph Neural Network [45.23653314235767]
Node-level Capsule Graph Neural Network (NCGNN) represents nodes as groups of capsules.
A novel dynamic routing procedure is developed to adaptively select appropriate capsules for aggregation.
NCGNN can well address the over-smoothing issue and outperforms the state of the art by producing better node embeddings for classification.
arXiv Detail & Related papers (2020-12-07T06:46:17Z)
- Towards Deeper Graph Neural Networks with Differentiable Group Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN).
arXiv Detail & Related papers (2020-06-12T07:18:02Z)
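The group normalization mentioned above can be read as: softly assign nodes to groups, standardize each group independently, and mix the result back residually so embeddings from different clusters stay apart as depth grows. Below is a simplified sketch of that reading, assuming a linear soft-assignment layer and per-group BatchNorm; the class name and skip weight are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupNormalize(nn.Module):
    # Softly cluster node embeddings into groups and normalize each
    # group separately; a small residual weight mixes the normalized
    # signal back into the input (simplified sketch of DGN's idea).
    def __init__(self, dim: int, num_groups: int, skip_weight: float = 0.005):
        super().__init__()
        self.assign = nn.Linear(dim, num_groups)   # soft group assignment
        self.norms = nn.ModuleList(nn.BatchNorm1d(dim) for _ in range(num_groups))
        self.skip_weight = skip_weight

    def forward(self, h: torch.Tensor) -> torch.Tensor:   # h: [num_nodes, dim]
        s = F.softmax(self.assign(h), dim=1)               # [num_nodes, groups]
        out = h
        for g, norm in enumerate(self.norms):
            out = out + self.skip_weight * norm(s[:, g:g + 1] * h)
        return out

# Usage: normalize 100 nodes with 16-dim embeddings across 4 groups.
layer = GroupNormalize(dim=16, num_groups=4)
print(layer(torch.randn(100, 16)).shape)   # torch.Size([100, 16])
```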
- Bilinear Graph Neural Network with Neighbor Interactions [106.80781016591577]
Graph Neural Networks (GNNs) are powerful models for learning representations and making predictions on graph data.
We propose a new graph convolution operator, which augments the weighted sum with pairwise interactions of the representations of neighbor nodes.
We term this framework as Bilinear Graph Neural Network (BGNN), which improves GNN representation ability with bilinear interactions between neighbor nodes.
arXiv Detail & Related papers (2020-02-10T06:43:38Z)
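The bilinear operator described in the BGNN entry above admits a compact sketch. For each node it averages the elementwise products over all pairs of neighbor representations, using the identity sum_{i<j} x_i*x_j = ((sum_i x_i)^2 - sum_i x_i^2) / 2 so the cost stays linear in the degree; a full BGNN layer would combine this term with the usual weighted-sum aggregation. The function below is an illustrative simplification, not the paper's exact layer.

```python
import torch

def bilinear_aggregate(h: torch.Tensor, neighbors: list[list[int]]) -> torch.Tensor:
    # h: [num_nodes, dim] node representations; neighbors[v] lists the
    # neighbor indices of node v. Returns the mean elementwise product
    # over all unordered neighbor pairs of each node.
    out = torch.zeros_like(h)
    for v, nbrs in enumerate(neighbors):
        if len(nbrs) < 2:
            continue                               # no pairs to interact
        x = h[nbrs]                                # [deg, dim]
        s, sq = x.sum(dim=0), (x * x).sum(dim=0)
        num_pairs = len(nbrs) * (len(nbrs) - 1) / 2
        out[v] = (s * s - sq) / (2 * num_pairs)    # pairwise-product mean
    return out

# Usage: 3 nodes where node 0 has neighbors 1 and 2.
h = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
print(bilinear_aggregate(h, [[1, 2], [0], [0]]))   # row 0: [15., 24.]
```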