gLSTM: Mitigating Over-Squashing by Increasing Storage Capacity
- URL: http://arxiv.org/abs/2510.08450v2
- Date: Wed, 22 Oct 2025 16:55:32 GMT
- Title: gLSTM: Mitigating Over-Squashing by Increasing Storage Capacity
- Authors: Hugh Blayney, Álvaro Arroyo, Xiaowen Dong, Michael M. Bronstein
- Abstract summary: We re-examine the over-squashing phenomenon through the lens of model storage and retrieval capacity. We introduce a new synthetic task to demonstrate that an information bottleneck can saturate this capacity. We develop a novel GNN architecture with improved capacity.
- Score: 32.149547459478846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) leverage the graph structure to transmit information between nodes, typically through the message-passing mechanism. While these models have found a wide variety of applications, they are known to suffer from over-squashing, where information from a large receptive field of node representations is collapsed into a single fixed-size vector, resulting in an information bottleneck. In this paper, we re-examine the over-squashing phenomenon through the lens of model storage and retrieval capacity, which we define as the amount of information that can be stored in a node's representation for later use. We study some of the limitations of existing tasks used to measure over-squashing and introduce a new synthetic task to demonstrate that an information bottleneck can saturate this capacity. Furthermore, we adapt ideas from the sequence-modeling literature on associative memories, fast weight programmers, and the xLSTM model to develop a novel GNN architecture with improved capacity. We demonstrate strong performance of this architecture both on our synthetic capacity task and on a range of real-world graph benchmarks.
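To make the adapted mechanism concrete, below is a minimal, hypothetical sketch of the kind of fast-weight (matrix-memory) node state the abstract alludes to, in the spirit of fast weight programmers and the xLSTM's mLSTM cell. Everything here (names, shapes, and placing the outer-product update inside message passing) is an illustrative assumption, not the authors' gLSTM architecture.

```python
# Hypothetical sketch: nodes carry a d x d matrix memory written to with
# key-value outer products (fast-weight style) instead of a single
# fixed-size vector. Not the paper's code; all names are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # feature dimension (assumed)
num_nodes = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # toy directed graph

# Random projections standing in for learned weight matrices.
W_k, W_v, W_q = (rng.standard_normal((d, d)) for _ in range(3))

x = rng.standard_normal((num_nodes, d))  # node features
M = np.zeros((num_nodes, d, d))          # one matrix memory per node

# One message-passing step: each neighbor u writes the association
# value(u) key(u)^T into the memory of node v.
for u, v in edges:
    k = W_k @ x[u]
    val = W_v @ x[u]
    M[v] += np.outer(val, k)             # associative write

# Retrieval: a node queries its own memory with a query vector.
q = W_q @ x[1]
y = M[1] @ q                             # associative read
print(y.shape)                           # (16,)
```

The intuition is that a d x d memory can store up to roughly d near-orthogonal key-value associations, rather than squashing a node's entire receptive field into one d-dimensional vector.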
Related papers
- Efficient Environmental Claim Detection with Hyperbolic Graph Neural Networks [1.7259898169307608]
We explore Graph Neural Networks (GNNs) and Hyperbolic Graph Neural Networks (HGNNs) as lightweight yet effective alternatives to transformer-based models. Our results show that our graph-based models, particularly HGNNs in the Poincaré space (P-HGNNs), achieve performance superior to the state-of-the-art on environmental claim detection.
arXiv Detail & Related papers (2025-02-19T11:04:59Z)
- DeltaGNN: Graph Neural Network with Information Flow Control [5.563171090433323]
Graph Neural Networks (GNNs) are designed to process graph-structured data through neighborhood aggregations in the message-passing process. Message passing enables GNNs to understand short-range spatial interactions, but also causes them to suffer from over-smoothing and over-squashing. We propose a mechanism called information flow control to address over-smoothing and over-squashing with linear computational overhead. We benchmark our model across 10 real-world datasets, including graphs with varying sizes, topologies, densities, and homophilic ratios, showing superior performance.
arXiv Detail & Related papers (2025-01-10T14:34:20Z)
- Spatiotemporal Graph Learning with Direct Volumetric Information Passing and Feature Enhancement [62.91536661584656]
We propose a dual-module framework, the Cell-embedded and Feature-enhanced Graph Neural Network (aka CeFeGNN), for learning. We embed learnable cell attributions into the common node-edge message-passing process, which better captures the spatial dependency of regional features. Experiments on various PDE systems and one real-world dataset demonstrate that CeFeGNN achieves superior performance compared with other baselines.
arXiv Detail & Related papers (2024-09-26T16:22:08Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- MuseGNN: Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy [27.328628715885287]
A graph neural network (GNN) architecture is capable of modeling data with cross-instance relations. An important subclass of this architecture involves layers designed such that the forward pass iteratively reduces a graph-regularized energy function of interest. We propose a sampling-based energy function and scalable GNN layers that iteratively reduce it, guided by convergence guarantees in certain settings. We also instantiate a full GNN architecture based on these designs, and the model achieves competitive accuracy and scalability when applied to the largest publicly-available node classification benchmark, exceeding 1TB in size.
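As a point of reference for this family of models, here is a generic sketch of a GNN whose forward pass is gradient descent on a graph-regularized energy. It is not MuseGNN's sampling-based energy; the quadratic objective and all names below are assumptions chosen for illustration.

```python
# Generic sketch of an "unfolded" GNN whose layers descend the energy
# E(Y) = ||Y - X||^2 + lam * tr(Y^T L Y). This illustrates the layer
# family the abstract describes, not the paper's sampling-based energy.
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A

def energy_gnn_forward(X, A, lam=1.0, step=0.1, num_layers=10):
    L = laplacian(A)
    Y = X.copy()
    for _ in range(num_layers):
        grad = 2.0 * (Y - X) + 2.0 * lam * (L @ Y)  # dE/dY
        Y = Y - step * grad                          # one layer = one descent step
    return Y

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)  # toy 3-node path graph
X = np.eye(3)                            # toy node features
print(energy_gnn_forward(X, A).round(3)) # graph-smoothed features
```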
arXiv Detail & Related papers (2023-10-19T04:30:14Z)
- HINormer: Representation Learning On Heterogeneous Information Networks with Graph Transformer [29.217820912610602]
Graph Transformers (GTs) have been proposed, operating in a paradigm that extends message passing to a larger coverage, even across the whole graph.
The investigation of GTs on heterogeneous information networks (HINs) is still under-explored.
We propose a novel model named HINormer, which capitalizes on a larger-range aggregation mechanism for node representation learning.
arXiv Detail & Related papers (2023-02-22T12:25:07Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs). We present a new ensembling-based training scheme, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Graph Ordering Attention Networks [22.468776559433614]
Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data.
We introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood.
The GOAT layer demonstrates increased performance in modeling graph metrics that capture complex information.
arXiv Detail & Related papers (2022-04-11T18:13:19Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation [56.73834525802723]
Lightweight Dynamic Graph Convolutional Networks (LDGCNs) are proposed.
LDGCNs capture richer non-local interactions by synthesizing higher order information from the input graphs.
We develop two novel parameter-saving strategies, based on group graph convolutions and weight-tied convolutions, to reduce memory usage and model complexity.
arXiv Detail & Related papers (2020-10-09T06:03:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.