Limits of message passing for node classification: How class-bottlenecks restrict signal-to-noise ratio
- URL: http://arxiv.org/abs/2508.17822v1
- Date: Mon, 25 Aug 2025 09:25:14 GMT
- Title: Limits of message passing for node classification: How class-bottlenecks restrict signal-to-noise ratio
- Authors: Jonathan Rubin, Sahil Loomba, Nick S. Jones,
- Abstract summary: Message passing neural networks (MPNNs) are powerful models for node classification but suffer from performance limitations under heterophily and structural bottlenecks in the graph.<n>We provide a unifying statistical framework exposing the relationship between heterophily and bottlenecks through the signal-to-noise ratio (SNR) of MPNN representations.<n>We prove that optimal graph structures for maximising higher-order homophily are disjoint unions of single-class and two-class-bipartite clusters.<n>This yields BRIDGE, a graph ensemble-based rewiring algorithm that achieves near-perfect classification accuracy across all homophily
- Score: 0.6117371161379209
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Message passing neural networks (MPNNs) are powerful models for node classification but suffer from performance limitations under heterophily (low same-class connectivity) and structural bottlenecks in the graph. We provide a unifying statistical framework exposing the relationship between heterophily and bottlenecks through the signal-to-noise ratio (SNR) of MPNN representations. The SNR decomposes model performance into feature-dependent parameters and feature-independent sensitivities. We prove that the sensitivity to class-wise signals is bounded by higher-order homophily -- a generalisation of classical homophily to multi-hop neighbourhoods -- and show that low higher-order homophily manifests locally as the interaction between structural bottlenecks and class labels (class-bottlenecks). Through analysis of graph ensembles, we provide a further quantitative decomposition of bottlenecking into underreaching (lack of depth implying signals cannot arrive) and oversquashing (lack of breadth implying signals arriving on fewer paths) with closed-form expressions. We prove that optimal graph structures for maximising higher-order homophily are disjoint unions of single-class and two-class-bipartite clusters. This yields BRIDGE, a graph ensemble-based rewiring algorithm that achieves near-perfect classification accuracy across all homophily regimes on synthetic benchmarks and significant improvements on real-world benchmarks, by eliminating the ``mid-homophily pitfall'' where MPNNs typically struggle, surpassing current standard rewiring techniques from the literature. Our framework, whose code we make available for public use, provides both diagnostic tools for assessing MPNN performance, and simple yet effective methods for enhancing performance through principled graph modification.
Related papers
- Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression [55.51959317490934]
Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding.<n>We argue that graphs inherently contain rich structural and semantic information, and that their effective exploitation can unlock potential gains in LLMs reasoning performance.<n>We propose Homophily-aware Structural and Semantic Compression for LLMs (HS2C), a framework centered on exploiting graph homophily.
arXiv Detail & Related papers (2026-01-13T03:35:18Z) - Beyond Homophily: Community Search on Heterophilic Graphs [9.691955358608537]
Community search aims to identify a refined set of nodes that are most relevant to a given query.<n>Unlike homophilic graphs, many real-world networks are heterophilic, where edges predominantly connect dissimilar nodes.<n>We propose a unified framework featuring three key designs: (i) an AdaptCS that disentangles multi-hop and multi-frequency signals, enabling the model to capture both smooth (homophilic) and contrastive (heterophilic) relations; (ii) a memory-efficient low-rank optimization that removes the main computational bottleneck; and (iii) an Adaptive Community Score (ACS) that guides online search
arXiv Detail & Related papers (2026-01-05T00:44:17Z) - Interpretable and Adaptive Node Classification on Heterophilic Graphs via Combinatorial Scoring and Hybrid Learning [1.2691047660244335]
Graph neural networks (GNNs) achieve strong performance on homophilic graphs but often struggle underphily, where adjacent nodes frequently belong to different classes.<n>We propose an interpretable and adaptive framework for semi-supervised node classification based on explicit inference rather than deep message passing.<n> Experiments on heterophilic and transitional benchmarks demonstrate competitive performance with modern GNNs while offering advantages in interpretability, tuna, and computational efficiency.
arXiv Detail & Related papers (2025-12-22T20:50:44Z) - ReDiSC: A Reparameterized Masked Diffusion Model for Scalable Node Classification with Structured Predictions [64.17845687013434]
We propose ReDiSC, a structured diffusion model for structured node classification.<n>We show that ReDiSC achieves superior or highly competitive performance compared to state-of-the-art GNN, label propagation, and diffusion-based baselines.<n> Notably, ReDiSC scales effectively to large-scale datasets on which previous structured diffusion methods fail due to computational constraints.
arXiv Detail & Related papers (2025-07-19T04:46:53Z) - Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method based on a joint of generative training and discriminative training to learn graph structure and representation.<n>We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR.<n>ECL-GSR achieves faster training with fewer samples and memories against the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z) - Optimality of Message-Passing Architectures for Sparse Graphs [12.42591017155152]
We study the node classification problem on feature-decorated graphs in the sparse setting, i.e., when the expected degree of a node is $O(1)$ in the number of nodes.<n>We introduce a notion of Bayes optimality for node classification tasks, called local Bayes optimality.<n>We show that the optimal message-passing architecture interpolates between a standard in the regime of low graph signal and a typical in the regime of high graph signal.
arXiv Detail & Related papers (2023-05-17T17:31:20Z) - Neighborhood Homophily-based Graph Convolutional Network [4.511171093050241]
Graph neural networks (GNNs) have been proved powerful in graph-oriented tasks.
Many real-world graphs are heterophilous, challenging the homophily assumption of classical GNNs.
Recent studies propose new metrics to characterize the homophily, but rarely consider the correlation of the proposed metrics and models.
In this paper, we first design a new metric, Neighborhood Homophily (textitNH), to measure the label complexity or purity in node neighborhoods.
arXiv Detail & Related papers (2023-01-24T07:56:44Z) - Mixed Graph Contrastive Network for Semi-Supervised Node Classification [63.924129159538076]
We propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN)<n>In our method, we improve the discriminative capability of the latent embeddings by an unperturbed augmentation strategy and a correlation reduction mechanism.<n>By combining the two settings, we extract rich supervision information from both the abundant nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Simplifying Node Classification on Heterophilous Graphs with Compatible
Label Propagation [6.071760028190454]
We show that a well-known graph algorithm, Label Propagation, combined with a shallow neural network can achieve comparable performance to GNNs in semi-supervised node classification on graphs with high homophily.
In this paper, we show that this approach falls short on graphs with low homophily, where nodes often connect to the nodes of the opposite classes.
Our algorithm first learns the class compatibility matrix and then aggregates label predictions using LP algorithm weighted by class compatibilities.
arXiv Detail & Related papers (2022-05-19T08:34:34Z) - Exploiting Neighbor Effect: Conv-Agnostic GNNs Framework for Graphs with
Heterophily [58.76759997223951]
We propose a new metric based on von Neumann entropy to re-examine the heterophily problem of GNNs.
We also propose a Conv-Agnostic GNN framework (CAGNNs) to enhance the performance of most GNNs on heterophily datasets.
arXiv Detail & Related papers (2022-03-19T14:26:43Z) - Uniting Heterogeneity, Inductiveness, and Efficiency for Graph
Representation Learning [68.97378785686723]
graph neural networks (GNNs) have greatly advanced the performance of node representation learning on graphs.
A majority class of GNNs are only designed for homogeneous graphs, leading to inferior adaptivity to the more informative heterogeneous graphs.
We propose a novel inductive, meta path-free message passing scheme that packs up heterogeneous node features with their associated edges from both low- and high-order neighbor nodes.
arXiv Detail & Related papers (2021-04-04T23:31:39Z) - Beyond Homophily in Graph Neural Networks: Current Limitations and
Effective Designs [28.77753005139331]
We investigate the representation power of graph neural networks in a semi-supervised node classification task under heterophily or low homophily.
Many popular GNNs fail to generalize to this setting, and are even outperformed by models that ignore the graph structure.
We identify a set of key designs that boost learning from the graph structure under heterophily.
arXiv Detail & Related papers (2020-06-20T02:05:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.