MambaNetLK: Enhancing Colonoscopy Point Cloud Registration with Mamba
- URL: http://arxiv.org/abs/2511.00260v1
- Date: Fri, 31 Oct 2025 21:14:25 GMT
- Title: MambaNetLK: Enhancing Colonoscopy Point Cloud Registration with Mamba
- Authors: Linzhe Jiang, Jiayuan Huang, Sophia Bano, Matthew J. Clarkson, Zhehua Mao, Mobarak I. Hoque,
- Abstract summary: We introduce a novel 3D registration method tailored for endoscopic navigation and a high-quality, clinically grounded dataset. MambaNetLK is a correspondence-free registration framework, which enhances the PointNetLK architecture by integrating a Mamba State Space Model. On the clinical dataset, C3VD-Raycasting-10k, MambaNetLK achieves the best performance compared with the state-of-the-art methods.
- Score: 3.505757483260119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate 3D point cloud registration underpins reliable image-guided colonoscopy, directly affecting lesion localization, margin assessment, and navigation safety. However, biological tissue exhibits repetitive textures and locally homogeneous geometry that cause feature degeneracy, while substantial domain shifts between pre-operative anatomy and intra-operative observations further degrade alignment stability. To address these clinically critical challenges, we introduce a novel 3D registration method tailored for endoscopic navigation and a high-quality, clinically grounded dataset to support rigorous and reproducible benchmarking. We introduce C3VD-Raycasting-10k, a large-scale benchmark dataset with 10,014 geometrically aligned point cloud pairs derived from clinical CT data. We propose MambaNetLK, a novel correspondence-free registration framework, which enhances the PointNetLK architecture by integrating a Mamba State Space Model (SSM) as a cross-modal feature extractor. As a result, the proposed framework efficiently captures long-range dependencies with linear-time complexity. The alignment is achieved iteratively using the Lucas-Kanade algorithm. On the clinical dataset, C3VD-Raycasting-10k, MambaNetLK achieves the best performance compared with the state-of-the-art methods, reducing median rotation error by 56.04% and RMSE translation error by 26.19% over the second-best method. The model also demonstrates strong generalization on ModelNet40 and superior robustness to initial pose perturbations. MambaNetLK provides a robust foundation for 3D registration in surgical navigation. The combination of a globally expressive SSM-based feature extractor and a large-scale clinical dataset enables more accurate and reliable guidance systems in minimally invasive procedures like colonoscopy.
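The abstract reports median rotation error, RMSE translation error, and percentage reductions relative to the second-best method, but this page does not give the exact metric definitions. The following is a minimal sketch of how such quantities are commonly computed for rigid registration. It assumes the rotation error is the geodesic angle between the estimated and ground-truth rotation matrices and the translation error is the Euclidean distance between translation vectors. All function names are hypothetical and none of the values used are taken from the paper.

```python
import math
from statistics import median


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices (assumed metric)."""
    # trace(R_est^T R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_rmse(t_est_list, t_gt_list):
    """Root mean square of the Euclidean translation errors over all pairs."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(t_est, t_gt))
        for t_est, t_gt in zip(t_est_list, t_gt_list)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction_pct(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


def summarize(R_est_list, R_gt_list, t_est_list, t_gt_list):
    """Return (median rotation error in degrees, translation RMSE) for a test set."""
    rot_errors = [rotation_error_deg(Re, Rg) for Re, Rg in zip(R_est_list, R_gt_list)]
    return median(rot_errors), translation_rmse(t_est_list, t_gt_list)


def rot_z(deg):
    """Helper for toy data: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Under these assumptions, a reduction such as the reported 56.04% would correspond to calling `relative_reduction_pct` with the second-best method's median rotation error as `baseline_error` and MambaNetLK's as `new_error`.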
Related papers
- CT Scans As Video: Efficient Intracranial Hemorrhage Detection Using Multi-Object Tracking [0.9332987715848716]
This paper develops a lightweight computer vision framework that reconciles the efficiency of 2D detection with the necessity of 3D context. By approximating 3D contextual reasoning at a fraction of the computational cost, this method provides a scalable solution for real-time patient prioritization.
arXiv Detail & Related papers (2026-01-05T19:49:51Z)
- BronchOpt : Vision-Based Pose Optimization with Fine-Tuned Foundation Models for Accurate Bronchoscopy Navigation [6.915058920280426]
We propose a vision-based pose optimization framework for 2D-3D registration between intra-operative endoscopic views and pre-operative CT anatomy. A fine-tuned modality- and domain-invariant encoder enables direct similarity between real endoscopic RGB frames and CT-rendered depth maps. Our model achieves an average translational error of 2.65 mm and a rotational error of 0.19 rad, demonstrating accurate and stable localization.
arXiv Detail & Related papers (2025-11-12T15:58:05Z)
- Cross3DReg: Towards a Large-scale Real-world Cross-source Point Cloud Registration Benchmark [57.42211080221526]
Cross-source point cloud registration, which aims to align point cloud data from different sensors, is a fundamental task in 3D vision. The lack of publicly available large-scale real-world datasets for training the deep registration models, and the inherent differences in point clouds captured by multiple sensors pose challenges. We construct Cross3DReg, the currently largest and real-world multi-modal cross-source point cloud registration dataset. A visual-geometric attention guided matching module is proposed to enhance the consistency of cross-source point cloud features.
arXiv Detail & Related papers (2025-09-08T09:01:13Z)
- A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler [49.03919553747297]
We propose an AI-powered, real-time CoW auto-segmentation system capable of efficiently capturing cerebral arteries. No prior studies have explored AI-driven cerebrovascular segmentation using Transcranial Color-coded Doppler (TCCD). The proposed AAW-YOLO demonstrated strong performance in segmenting both ipsilateral and contralateral CoW vessels.
arXiv Detail & Related papers (2025-08-19T14:41:22Z)
- DeSamba: Decoupled Spectral Adaptive Framework for 3D Multi-Sequence MRI Lesion Classification [0.6749750044497732]
DeSamba is a framework designed to extract decoupled representations and adaptively fuse spatial and spectral features for lesion classification. DeSamba achieves 62.10% Top-1 accuracy, 63.62% F1-score, 87.71% AUC, and 93.55% Top-3 accuracy on an external validation set.
arXiv Detail & Related papers (2025-07-21T10:42:21Z)
- ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation [11.248082139905865]
We propose a hybrid architecture that models MRI sequences as annotated data. Our method uses a deep, pretrained DeepLabV3 backbone to extract high-level semantic features from each MRI slice and a recurrent convolutional head, built with ConvLSTM layers, to integrate information across slices. Compared to state-of-the-art 2D and 3D segmentation models, our approach demonstrates superior performance in terms of precision, recall, Intersection over Union (IoU), Dice Similarity Coefficient (DSC), and robustness.
arXiv Detail & Related papers (2025-06-24T14:56:55Z)
- KaLDeX: Kalman Filter based Linear Deformable Cross Attention for Retina Vessel Segmentation [46.57880203321858]
We propose a novel network (KaLDeX) for vascular segmentation leveraging a Kalman filter based linear deformable cross attention (LDCA) module.
Our approach is based on two key components: Kalman filter (KF) based linear deformable convolution (LD) and cross-attention (CA) modules.
The proposed method is evaluated on retinal fundus image datasets (DRIVE, CHASE_BD1, and STARE) as well as the 3mm and 6mm of the OCTA-500 dataset.
arXiv Detail & Related papers (2024-10-28T16:00:42Z)
- MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation [15.514511820130474]
We develop a 3D patch-based hybrid CNN-Mamba model for subcortical brain segmentation.
Our model's performance was validated against several benchmarks.
arXiv Detail & Related papers (2024-09-12T02:19:19Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- SCPM-Net: An Anchor-free 3D Lung Nodule Detection Network using Sphere Representation and Center Points Matching [47.79483848496141]
We propose a 3D sphere representation-based center-points matching detection network (SCPM-Net).
It is anchor-free and automatically predicts the position, radius, and offset of nodules without the manual design of nodule/anchor parameters.
We show that our proposed SCPM-Net framework achieves superior performance compared with existing anchor-based and anchor-free methods for lung nodule detection.
arXiv Detail & Related papers (2021-04-12T05:51:29Z)
- Deep Implicit Statistical Shape Models for 3D Medical Image Delineation [47.78425002879612]
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis.
Prior to deep learning, statistical shape models that imposed anatomical constraints and produced high quality surfaces were a core technology.
We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of CNNs with the robustness of SSMs.
arXiv Detail & Related papers (2021-04-07T01:15:06Z)