Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective
State Spaces
- URL: http://arxiv.org/abs/2402.00789v1
- Date: Thu, 1 Feb 2024 17:21:53 GMT
- Title: Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective
State Spaces
- Authors: Chloe Wang, Oleksii Tsepa, Jun Ma, Bo Wang
- Abstract summary: We introduce Graph-Mamba, the first attempt to enhance long-range context modeling in graph networks.
We formulate graph-centric node prioritization and permutation strategies to enhance context-aware reasoning.
Experiments on ten benchmark datasets demonstrate that Graph-Mamba outperforms state-of-the-art methods in long-range graph prediction tasks.
- Score: 4.928791850200171
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Attention mechanisms have been widely used to capture long-range dependencies
among nodes in Graph Transformers. Bottlenecked by the quadratic computational
cost, attention mechanisms fail to scale in large graphs. Recent improvements
in computational efficiency are mainly achieved by attention sparsification
with random or heuristic-based graph subsampling, which falls short in
data-dependent context reasoning. State space models (SSMs), such as Mamba,
have gained prominence for their effectiveness and efficiency in modeling
long-range dependencies in sequential data. However, adapting SSMs to
non-sequential graph data presents a notable challenge. In this work, we
introduce Graph-Mamba, the first attempt to enhance long-range context modeling
in graph networks by integrating a Mamba block with the input-dependent node
selection mechanism. Specifically, we formulate graph-centric node
prioritization and permutation strategies to enhance context-aware reasoning,
leading to a substantial improvement in predictive performance. Extensive
experiments on ten benchmark datasets demonstrate that Graph-Mamba outperforms
state-of-the-art methods in long-range graph prediction tasks, with a fraction
of the computational cost in both FLOPs and GPU memory consumption. The code
and models are publicly available at https://github.com/bowang-lab/Graph-Mamba.
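The abstract describes the architecture only at a high level. Below is a minimal sketch of the core idea, input-dependent node prioritization followed by a sequence scan over the permuted nodes. The scoring network is an assumption, and an `nn.GRU` stands in for the Mamba block (the actual model builds on the Mamba SSM), so this is illustrative rather than the released implementation:

```python
import torch
import torch.nn as nn

class NodePrioritizedSequencer(nn.Module):
    """Hypothetical sketch: score nodes from their features, sort them
    into a sequence, and run a sequence model over the permuted node
    features. An nn.GRU stands in for the Mamba block here; this is
    not the authors' released implementation."""

    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)  # input-dependent node scores
        self.seq_model = nn.GRU(dim, dim, batch_first=True)  # Mamba stand-in

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, dim) node features for one graph
        scores = self.scorer(x).squeeze(-1)   # (num_nodes,)
        order = torch.argsort(scores)         # important nodes last in the scan
        seq = x[order].unsqueeze(0)           # (1, num_nodes, dim)
        out, _ = self.seq_model(seq)
        inverse = torch.argsort(order)        # undo the permutation
        return out.squeeze(0)[inverse]        # (num_nodes, dim)

model = NodePrioritizedSequencer(dim=64)
h = model(torch.randn(100, 64))
print(h.shape)  # torch.Size([100, 64])
```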
Related papers
- DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs [59.434893231950205]
Dynamic graph learning aims to uncover evolutionary laws in real-world systems.
We propose DyG-Mamba, a new continuous state space model for dynamic graph learning.
We show that DyG-Mamba achieves state-of-the-art performance on most datasets.
arXiv Detail & Related papers (2024-08-13T15:21:46Z)
- What Can We Learn from State Space Models for Machine Learning on Graphs? [11.38076877943004]
We propose Graph State Space Convolution (GSSC) as a principled extension of State Space Models (SSMs) to graph-structured data.
By leveraging global permutation-equivariant set aggregation and factorizable graph kernels, GSSC preserves all three advantages of SSMs.
Our findings highlight the potential of GSSC as a powerful and scalable model for graph machine learning.
arXiv Detail & Related papers (2024-06-09T15:03:36Z)
- Sparsity exploitation via discovering graphical models in multi-variate time-series forecasting [1.2762298148425795]
We propose a decoupled training method, which includes a graph-generation module and a GNN forecasting module.
First, we use Graphical Lasso (or GraphLASSO) to directly exploit the sparsity pattern from data to build graph structures.
Second, we fit these graph structures and the input data into a Graph Convolutional Recurrent Network (GCRN) to train a forecasting model.
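A minimal sketch of the first stage of this two-stage pipeline, assuming scikit-learn's GraphicalLasso as the estimator (the toy data, the regularization strength, and the sparsity threshold are illustrative choices, not the paper's settings):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Toy multivariate time series: 500 timesteps x 10 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))

# Stage 1: estimate a sparse precision matrix; its nonzero off-diagonal
# pattern defines the graph structure over the 10 series.
gl = GraphicalLasso(alpha=0.1).fit(X)
adj = (np.abs(gl.precision_) > 1e-4).astype(float)
np.fill_diagonal(adj, 0.0)
print("edges:", int(adj.sum()) // 2)

# Stage 2 (not shown): feed `adj` and the series into a Graph
# Convolutional Recurrent Network (GCRN) to train the forecaster.
```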
arXiv Detail & Related papers (2023-06-29T16:48:00Z)
- Distributed Graph Embedding with Information-Oriented Random Walks [16.290803469068145]
Graph embedding maps graph nodes to low-dimensional vectors, and is widely adopted in machine learning tasks.
We present a general-purpose, distributed, information-centric random walk-based graph embedding framework, DistGER, which can scale to embed billion-edge graphs.
DistGER exhibits a 2.33x-129x acceleration, a 45% reduction in cross-machine communication, and a more than 10% effectiveness improvement in downstream tasks.
arXiv Detail & Related papers (2023-03-28T03:11:21Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings [51.82434518719011]
GNNAutoScale (GAS) is a framework for scaling arbitrary message-passing GNNs to large graphs.
GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations.
GAS reaches state-of-the-art performance on large-scale graphs.
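A hypothetical sketch of the historical-embedding trick: stale embeddings for out-of-batch neighbors are read from a buffer instead of being recomputed, which keeps the computation graph small (the class and its methods are illustrative, not the GAS codebase):

```python
import torch

class HistoryBuffer:
    """Keep the last computed embedding of every node and serve stale
    values for out-of-batch neighbors, so message passing never expands
    beyond the current mini-batch. Illustrative sketch only."""

    def __init__(self, num_nodes: int, dim: int):
        self.emb = torch.zeros(num_nodes, dim)

    def push(self, node_ids: torch.Tensor, h: torch.Tensor) -> None:
        self.emb[node_ids] = h.detach()  # store without gradients

    def pull(self, node_ids: torch.Tensor) -> torch.Tensor:
        return self.emb[node_ids]        # stale but cheap neighbor states

hist = HistoryBuffer(num_nodes=1000, dim=32)
batch = torch.tensor([0, 5, 9])
h_batch = torch.randn(3, 32, requires_grad=True)  # recomputed in-batch
h_nb = hist.pull(torch.tensor([42, 314]))         # no backprop through these
hist.push(batch, h_batch)
```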
arXiv Detail & Related papers (2021-06-10T09:26:56Z)
- Accurate Learning of Graph Representations with Graph Multiset Pooling [45.72542969364438]
We propose a Graph Multiset Transformer (GMT) that captures the interaction between nodes according to their structural dependencies.
Our experimental results show that GMT significantly outperforms state-of-the-art graph pooling methods on graph classification benchmarks.
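A toy sketch of attention-based set pooling in the spirit of GMT: a learnable seed query attends over the node multiset to produce a single graph embedding (a simplification, not the full multi-block GMT architecture):

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Pool a variable-size node set into one graph embedding via a
    learnable seed query. Illustrative sketch, not the GMT model."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.seed = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, dim) -> (dim,) graph-level embedding
        kv = h.unsqueeze(0)
        out, _ = self.attn(self.seed, kv, kv)
        return out.squeeze(0).squeeze(0)

g = AttentionPool(dim=32)(torch.randn(40, 32))
print(g.shape)  # torch.Size([32])
```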
arXiv Detail & Related papers (2021-02-23T07:45:58Z)
- CoSimGNN: Towards Large-scale Graph Similarity Computation [5.17905821006887]
Graph Neural Networks (GNNs) offer a data-driven solution to graph similarity computation.
Existing GNN-based methods, which either embed the two graphs separately or deploy cross-graph interactions over whole graph pairs, still fail to achieve competitive results.
We propose the "embedding-coarsening-matching" framework CoSimGNN, which first embeds and coarsens large graphs with an adaptive pooling operation and then deploys fine-grained interactions on the coarsened graphs to produce final similarity scores.
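A toy sketch of the embed-coarsen-match idea, with soft cluster assignments standing in for adaptive pooling and cosine interactions between cluster embeddings standing in for the fine-grained matching (all names and design choices here are illustrative, not the CoSimGNN implementation):

```python
import torch
import torch.nn as nn

class CoarsenAndMatch(nn.Module):
    """Coarsen each graph's node embeddings into k cluster embeddings,
    then score similarity from cross-graph cluster interactions.
    Illustrative sketch only."""

    def __init__(self, dim: int, k: int):
        super().__init__()
        self.assign = nn.Linear(dim, k)  # soft cluster assignments

    def coarsen(self, h: torch.Tensor) -> torch.Tensor:
        s = self.assign(h).softmax(dim=-1)  # (n, k) assignment weights
        return s.t() @ h                    # (k, dim) cluster embeddings

    def forward(self, h1: torch.Tensor, h2: torch.Tensor) -> torch.Tensor:
        c1, c2 = self.coarsen(h1), self.coarsen(h2)
        # pairwise cosine interactions between the two cluster sets
        sim = torch.cosine_similarity(c1.unsqueeze(1), c2.unsqueeze(0), dim=-1)
        return sim.mean()  # scalar similarity score

score = CoarsenAndMatch(dim=16, k=4)(torch.randn(30, 16), torch.randn(50, 16))
print(float(score))
```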
arXiv Detail & Related papers (2020-05-14T16:33:13Z)
- Adaptive Graph Auto-Encoder for General Data Clustering [90.8576971748142]
Graph-based clustering plays an important role in clustering research.
Recent studies on graph convolutional neural networks have achieved impressive success on graph-structured data.
We propose a graph auto-encoder for general data clustering, which constructs the graph adaptively from a generative perspective on graphs.
arXiv Detail & Related papers (2020-02-20T10:11:28Z)
- Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximation framework for such non-trivial ERGs that yields dyadic independence (i.e., edge-independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)