PointLAMA: Latent Attention meets Mamba for Efficient Point Cloud Pretraining
- URL: http://arxiv.org/abs/2507.17296v1
- Date: Wed, 23 Jul 2025 07:57:35 GMT
- Title: PointLAMA: Latent Attention meets Mamba for Efficient Point Cloud Pretraining
- Authors: Xuanyu Lin, Xiaona Zeng, Xianwei Zheng, Xutao Li
- Abstract summary: Mamba has recently gained widespread attention as a backbone model for point cloud modeling, leveraging a state-space architecture that enables efficient global sequence modeling with linear complexity. We propose PointLAMA, a point cloud pretraining framework that combines task-aware point cloud serialization, a hybrid encoder with integrated Latent Attention and Mamba blocks, and a conditional diffusion mechanism built upon the Mamba backbone.
- Score: 8.906813021681135
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Mamba has recently gained widespread attention as a backbone model for point cloud modeling, leveraging a state-space architecture that enables efficient global sequence modeling with linear complexity. However, its lack of local inductive bias limits its capacity to capture fine-grained geometric structures in 3D data. To address this limitation, we propose PointLAMA, a point cloud pretraining framework that combines task-aware point cloud serialization, a hybrid encoder with integrated Latent Attention and Mamba blocks, and a conditional diffusion mechanism built upon the Mamba backbone. Specifically, the task-aware point cloud serialization employs Hilbert/Trans-Hilbert space-filling curves and axis-wise sorting to structurally align point tokens for classification and segmentation tasks, respectively. Our lightweight Latent Attention block features a Point-wise Multi-head Latent Attention (PMLA) module, which is specifically designed to align with the Mamba architecture by leveraging the shared latent space characteristics of PMLA and Mamba. This enables enhanced local context modeling while preserving overall efficiency. To further enhance representation learning, we incorporate a conditional diffusion mechanism during pretraining, which denoises perturbed feature sequences without relying on explicit point-wise reconstruction. Experimental results demonstrate that PointLAMA achieves competitive performance on multiple benchmark datasets with minimal parameter count and FLOPs, validating its effectiveness for efficient point cloud pretraining.
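To make the serialization step concrete, below is a minimal sketch of the two task-aware orderings the abstract names: a Hilbert-curve sort (the Trans-Hilbert variant can be approximated by permuting the axes before encoding) and an axis-wise lexicographic sort. The quantization depth, the third-party `hilbertcurve` package, and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from hilbertcurve.hilbertcurve import HilbertCurve  # pip install hilbertcurve

def hilbert_serialize(points: np.ndarray, bits: int = 10) -> np.ndarray:
    """Return the permutation ordering (N, 3) points along a Hilbert curve."""
    mins, maxs = points.min(0), points.max(0)
    # Quantize coordinates onto the integer grid [0, 2**bits - 1]^3.
    grid = ((points - mins) / (maxs - mins + 1e-9) * (2**bits - 1)).astype(np.int64)
    curve = HilbertCurve(p=bits, n=3)
    keys = np.asarray(curve.distances_from_points(grid.tolist()))
    return np.argsort(keys)

def axiswise_serialize(points: np.ndarray, primary_axis: int = 2) -> np.ndarray:
    """Lexicographic axis-wise sort; np.lexsort treats its last key as primary."""
    keys = [points[:, a] for a in range(3) if a != primary_axis]
    keys.append(points[:, primary_axis])
    return np.lexsort(tuple(keys))

# Usage: reorder the cloud before it is grouped into tokens.
pts = np.random.rand(1024, 3).astype(np.float32)
serialized = pts[hilbert_serialize(pts)]   # or pts[axiswise_serialize(pts)]
```

The returned permutation would be applied to the cloud before it is grouped into tokens for the hybrid Latent Attention/Mamba encoder.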
Related papers
- PMA: Towards Parameter-Efficient Point Cloud Understanding via Point Mamba Adapter [54.33433051500349]
We propose Point Mamba Adapter (PMA), which constructs an ordered feature sequence from all layers of the pre-trained model. We also propose a geometry-constrained gate prompt generator (G2PG) shared across different layers.
arXiv Detail & Related papers (2025-05-27T09:27:16Z)
- Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model [9.718016281821471]
Inspired by the Mamba model's success in natural language processing, we propose the Serialized Point Cloud Mamba Model (Serialized Point Mamba).
The method achieves 76.8 mIoU on ScanNet and 70.3 mIoU on S3DIS.
arXiv Detail & Related papers (2024-07-17T05:26:58Z)
- Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy [15.032048930130614]
We propose a novel SSM-based point cloud processing backbone, named Point Mamba, with a causality-aware ordering mechanism.
Our method achieves state-of-the-art performance compared with transformer-based counterparts, with 93.4% accuracy on classification and 75.7 mIoU on segmentation.
Our method demonstrates the great potential of SSMs to serve as a generic backbone in point cloud understanding.
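As a rough illustration of this family of orderings (not the paper's exact causality-aware mechanism), interleaving the bits of quantized x/y/z coordinates yields a Morton (z-order) key, and sorting by it matches a breadth-first octree traversal:

```python
# Hedged sketch: Morton-key ordering as a stand-in for octree-based
# serialization. Bit depth and names are illustrative assumptions.
import numpy as np

def morton_key(grid: np.ndarray, bits: int = 10) -> np.ndarray:
    """Interleave the bits of integer (N, 3) coordinates into one sort key."""
    keys = np.zeros(len(grid), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            keys |= ((grid[:, axis] >> b) & 1) << (3 * b + axis)
    return keys

pts = np.random.rand(2048, 3)
grid = (pts * (2**10 - 1)).astype(np.int64)   # quantize to a 2^10 grid
order = np.argsort(morton_key(grid))          # octree-consistent point order
```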
arXiv Detail & Related papers (2024-03-11T07:07:39Z)
- Point Cloud Mamba: Point Cloud Learning via State Space Model [73.7454734756626]
We demonstrate that Mamba-based point cloud methods can outperform previous methods based on transformers or multi-layer perceptrons (MLPs).
Point Cloud Mamba surpasses the state-of-the-art (SOTA) point-based method PointNeXt and achieves new SOTA performance on the ScanObjectNN, ModelNet40, ShapeNetPart, and S3DIS datasets.
arXiv Detail & Related papers (2024-03-01T18:59:03Z)
- PointMamba: A Simple State Space Model for Point Cloud Analysis [65.59944745840866]
We propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks.
Unlike traditional Transformers, PointMamba employs a linear-complexity algorithm, providing global modeling capacity while significantly reducing computational costs (a toy sketch of this recurrence appears after this list).
arXiv Detail & Related papers (2024-02-16T14:56:13Z)
- Controllable Mesh Generation Through Sparse Latent Point Diffusion Models [105.83595545314334]
We design a novel sparse latent point diffusion model for mesh generation.
Our key insight is to regard point clouds as an intermediate representation of meshes, and model the distribution of point clouds instead.
Our proposed sparse latent point diffusion model achieves superior performance in terms of generation quality and controllability.
arXiv Detail & Related papers (2023-03-14T14:25:29Z)
- PointPatchMix: Point Cloud Mixing with Patch Scoring [58.58535918705736]
We propose PointPatchMix, which mixes point clouds at the patch level and generates content-based targets for mixed point clouds.
Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on the content-based significance score from a pre-trained teacher model.
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
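A hedged sketch of what patch-level mixing with content-based targets can look like; the significance scores stand in for the pre-trained teacher's patch scoring, and every name below is an illustrative assumption:

```python
import numpy as np

def patch_mix(patches_a, patches_b, scores_a, scores_b, ratio=0.5):
    """Mix two clouds patch-wise. patches_*: (P, K, 3); scores_*: (P,) >= 0."""
    swap = np.random.rand(patches_a.shape[0]) < ratio   # patches taken from B
    mixed = np.where(swap[:, None, None], patches_b, patches_a)
    # Content-based target: weight each source label by the significance mass
    # it contributes to the mixed cloud, not by the raw swap ratio.
    mass_a = scores_a[~swap].sum()
    lam_a = mass_a / (mass_a + scores_b[swap].sum() + 1e-9)
    return mixed, lam_a   # training target: lam_a * y_a + (1 - lam_a) * y_b
```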
arXiv Detail & Related papers (2023-03-12T14:49:42Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and for part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
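Several entries above (PointMamba in particular, as noted there) rest on the same idea: replacing quadratic pairwise attention with a state-space recurrence computed in a single pass over the token sequence. A toy NumPy version with fixed placeholder matrices, whereas Mamba makes A, B, C input-dependent (selective) and uses a hardware-aware parallel scan:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """x: (T, d_in) token sequence -> y: (T, d_out), computed in O(T) time."""
    h = np.zeros(A.shape[0])      # hidden state
    ys = []
    for x_t in x:                 # one pass: linear, not quadratic, in T
        h = A @ h + B @ x_t       # state update
        ys.append(C @ h)          # readout
    return np.stack(ys)

rng = np.random.default_rng(0)
T, d_in, d_state, d_out = 128, 16, 32, 16
y = ssm_scan(rng.normal(size=(T, d_in)),
             A=0.9 * np.eye(d_state),                    # stable toy dynamics
             B=0.1 * rng.normal(size=(d_state, d_in)),
             C=0.1 * rng.normal(size=(d_out, d_state)))
```

Because each step touches the state exactly once, cost grows linearly in sequence length, which is the efficiency argument these Mamba-based point cloud backbones build on.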
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.