Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation
- URL: http://arxiv.org/abs/2501.14679v5
- Date: Thu, 20 Feb 2025 07:37:41 GMT
- Title: Surface Vision Mamba: Leveraging Bidirectional State Space Model for Efficient Spherical Manifold Representation
- Authors: Rongzhao He, Weihao Zheng, Leilei Zhao, Ying Wang, Dalin Zhu, Dan Wu, Bin Hu
- Abstract summary: We introduce the attention-free Vision Mamba to spherical surfaces.
Our method achieves surface patching by representing spherical data as a sequence of triangular patches.
The proposed Surface Vision Mamba is evaluated on multiple neurodevelopmental phenotype regression tasks.
- Score: 6.550827841703163
- Abstract: Attention-based methods have demonstrated exceptional performance in modelling long-range dependencies on spherical cortical surfaces, surpassing traditional Geometric Deep Learning (GDL) models. However, their extensive inference time and high memory demands pose challenges for application to large datasets with limited computing resources. Inspired by the state space model in computer vision, we introduce the attention-free Vision Mamba (Vim) to spherical surfaces, presenting a domain-agnostic architecture for analyzing data on spherical manifolds. Our method achieves surface patching by representing spherical data as a sequence of triangular patches derived from a subdivided icosphere. The proposed Surface Vision Mamba (SiM) is evaluated on multiple neurodevelopmental phenotype regression tasks using cortical surface metrics from neonatal brains. Experimental results demonstrate that SiM outperforms both attention- and GDL-based methods, delivering 4.8 times faster inference and achieving 91.7% lower memory consumption compared to the Surface Vision Transformer (SiT) under the Ico-4 grid partitioning. Sensitivity analysis further underscores the potential of SiM to identify subtle cognitive developmental patterns. The code is available at https://github.com/Rongzhao-He/surface-vision-mamba.
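The patching step described in the abstract can be illustrated with some simple counting. The sketch below is not taken from the authors' code: the function names `icosphere_counts` and `patch_layout` are hypothetical, and the choice of a level-6 icosphere for the input data is an assumption made for illustration only. It relies on the standard facts that an icosahedron subdivided k times has 20 * 4^k faces and 10 * 4^k + 2 vertices, and that each face of the coarser grid becomes one triangular patch of finer-grid vertices.

```python
def icosphere_counts(level: int) -> tuple[int, int]:
    """Return (vertices, faces) of an icosahedron subdivided `level` times."""
    if level < 0:
        raise ValueError("level must be non-negative")
    faces = 20 * 4 ** level          # each subdivision splits a face into 4
    vertices = 10 * 4 ** level + 2   # follows from Euler's formula V - E + F = 2
    return vertices, faces


def patch_layout(data_level: int, patch_level: int) -> tuple[int, int]:
    """Return (number of patches, vertices per patch).

    Assumption: data live on the vertices of an Ico-`data_level` mesh and
    each face of the coarser Ico-`patch_level` mesh defines one patch.
    """
    if not 0 <= patch_level <= data_level:
        raise ValueError("need 0 <= patch_level <= data_level")
    _, n_patches = icosphere_counts(patch_level)
    # Each coarse edge is split into 2^(data_level - patch_level) segments.
    segments = 2 ** (data_level - patch_level)
    # Vertices of a triangle whose edges are split into `segments` pieces.
    verts_per_patch = (segments + 1) * (segments + 2) // 2
    return n_patches, verts_per_patch


if __name__ == "__main__":
    # Hypothetical setting: Ico-6 data with Ico-4 grid partitioning.
    n_patches, verts_per_patch = patch_layout(data_level=6, patch_level=4)
    print(n_patches, verts_per_patch)
```

Under these assumptions, an Ico-4 partition yields a long sequence of small patches, which is the setting in which the abstract reports the speed and memory comparison against SiT. Neighbouring patches share their border vertices, so the product of patch count and patch size exceeds the number of mesh vertices.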
Related papers
- Geometry Distributions [51.4061133324376]
We propose a novel geometric data representation that models geometry as distributions.
Our approach uses diffusion models with a novel network architecture to learn surface point distributions.
We evaluate our representation qualitatively and quantitatively across various object types, demonstrating its effectiveness in achieving high geometric fidelity.
arXiv Detail & Related papers (2024-11-25T04:06:48Z)
- NASM: Neural Anisotropic Surface Meshing [38.8654207201197]
This paper introduces a new learning-based method, NASM, for anisotropic surface meshing.
The key idea is to embed an input mesh into a high-dimensional Euclidean embedding space to preserve the curvature-based anisotropic metric.
Then, we propose a novel feature-sensitive remeshing on the generated high-dimensional embedding to automatically capture sharp geometric features.
arXiv Detail & Related papers (2024-10-30T15:20:10Z)
- HRVMamba: High-Resolution Visual State Space Model for Dense Prediction [60.80423207808076]
State Space Models (SSMs) with efficient hardware-aware designs have demonstrated significant potential in computer vision tasks.
These models have been constrained by three key challenges: insufficient inductive bias, long-range forgetting, and low-resolution output representation.
We introduce the Dynamic Visual State Space (DVSS) block, which employs deformable convolution to mitigate the long-range forgetting problem.
We also introduce the High-Resolution Visual State Space Model (HRVMamba), based on the DVSS block, which preserves high-resolution representations throughout the entire process.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
- Unsupervised Multimodal Surface Registration with Geometric Deep Learning [3.3403308469369577]
GeoMorph is a novel geometric deep-learning framework designed for image registration of cortical surfaces.
We show that GeoMorph surpasses existing deep-learning methods by achieving improved alignment with smoother deformations.
Such versatility and robustness suggest strong potential for various neuroscience applications.
arXiv Detail & Related papers (2023-11-21T22:05:00Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- The Multiscale Surface Vision Transformer [10.833580445244094]
We introduce the Multiscale Surface Vision Transformer (MS-SiT) as a backbone architecture for surface deep learning.
Results demonstrate that the MS-SiT outperforms existing surface deep learning methods for neonatal phenotyping prediction tasks.
arXiv Detail & Related papers (2023-03-21T15:00:17Z)
- HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces [54.77683371400133]
We propose a novel normal estimation method called HSurf-Net, which can accurately predict normals from point clouds with noise and density variations.
Experimental results show that our HSurf-Net achieves state-of-the-art performance on the synthetic shape dataset.
arXiv Detail & Related papers (2022-10-13T16:39:53Z)
- Surface Vision Transformers: Attention-Based Modelling applied to Cortical Analysis [8.20832544370228]
We introduce a domain-agnostic architecture to study any surface data projected onto a spherical manifold.
A vision transformer model encodes the sequence of patches via successive multi-head self-attention layers.
Experiments show that the SiT generally outperforms surface CNNs, while performing comparably on registered and unregistered data.
arXiv Detail & Related papers (2022-03-30T15:56:11Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Improvising the Learning of Neural Networks on Hyperspherical Manifold [0.0]
Convolutional neural networks (CNNs) have delivered substantial performance gains in supervised settings.
Representations learned by CNNs operating on a hyperspherical manifold have led to insightful outcomes in face recognition.
A broad range of activation functions is developed with hypersphere intuition, and these perform better than softmax in Euclidean space.
arXiv Detail & Related papers (2021-09-29T22:39:07Z)
- Deep Implicit Surface Point Prediction Networks [49.286550880464866]
Deep neural representations of 3D shapes as implicit functions have been shown to produce high fidelity models.
This paper presents a novel approach that models such surfaces using a new class of implicit representations called the closest surface-point (CSP) representation.
arXiv Detail & Related papers (2021-06-10T14:31:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.