SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
- URL: http://arxiv.org/abs/2409.12108v1
- Date: Wed, 18 Sep 2024 16:26:56 GMT
- Title: SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
- Authors: Xiangning Zhang, Jinnan Chen, Qingwei Zhang, Chengfeng Zhou, Zhengjie Zhang, Xiaobo Li, Dahong Qian
- Abstract summary: We propose SPRMamba, a novel Mamba-based framework for ESD surgical phase recognition.
We show that SPRMamba surpasses existing state-of-the-art methods and exhibits greater robustness across various surgical phase recognition tasks.
- Score: 4.37495931705689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Endoscopic Submucosal Dissection (ESD) is a minimally invasive procedure initially designed for the treatment of early gastric cancer but now widely used for various gastrointestinal lesions. Computer-assisted surgery systems have played a crucial role in improving the precision and safety of ESD procedures; however, their effectiveness is limited by how accurately surgical phases are recognized. The intricate nature of ESD, with different lesion characteristics and tissue structures, presents challenges for real-time surgical phase recognition algorithms. Existing surgical phase recognition algorithms struggle to efficiently capture temporal contexts in video-based scenarios, leading to insufficient performance. To address these issues, we propose SPRMamba, a novel Mamba-based framework for ESD surgical phase recognition. SPRMamba leverages the strengths of Mamba for long-term temporal modeling while introducing the Scaled Residual TranMamba block to enhance the capture of fine-grained details, overcoming the limitations of traditional temporal models like Temporal Convolutional Networks and Transformers. Moreover, a Temporal Sample Strategy is introduced to accelerate processing, which is essential for real-time phase recognition in clinical settings. Extensive testing on the ESD385 dataset and the cholecystectomy Cholec80 dataset demonstrates that SPRMamba surpasses existing state-of-the-art methods and exhibits greater robustness across various surgical phase recognition tasks.
Related papers
- Topology-based deep-learning segmentation method for deep anterior lamellar keratoplasty (DALK) surgical guidance using M-mode OCT data [0.0]
We develop a topology-based deep-learning segmentation method that integrates a topological loss function with a modified network architecture.
This approach effectively reduces the effects of noise and improves segmentation speed, precision, and stability.
arXiv Detail & Related papers (2025-01-07T19:57:15Z)
- Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy [1.0807134580166777]
Robotic-assisted minimally invasive esophagectomy (RAMIE) is a recognized treatment for esophageal cancer.
Our goal is to leverage deep learning for surgical phase recognition in RAMIE to provide intraoperative support to surgeons.
To more effectively capture the temporal dynamics of this complex procedure, we developed a novel deep learning model featuring an encoder-decoder structure with causal hierarchical attention.
arXiv Detail & Related papers (2024-12-05T10:23:16Z)
- Deep intra-operative illumination calibration of hyperspectral cameras [73.08443963791343]
Hyperspectral imaging (HSI) is emerging as a promising novel imaging modality with various potential surgical applications.
We show that dynamically changing lighting conditions in the operating room dramatically affect the performance of HSI applications.
We propose a novel learning-based approach to automatically recalibrating hyperspectral images during surgery.
arXiv Detail & Related papers (2024-09-11T08:30:03Z)
- Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition [2.10407185597278]
We propose the Multi-Scale Action Transformer (MS-AST) for offline surgical phase recognition and the Multi-Scale Action Causal Transformer (MS-ASCT) for online surgical phase recognition.
Our method can achieve 95.26% and 96.15% accuracy on the Cholec80 dataset for online and offline surgical phase recognition, respectively.
arXiv Detail & Related papers (2024-01-22T01:34:03Z)
- Action Recognition in Video Recordings from Gynecologic Laparoscopy [4.002010889177872]
Action recognition is a prerequisite for many applications in laparoscopic video analysis.
In this study, we design and evaluate a CNN-RNN architecture as well as a customized training-inference framework.
arXiv Detail & Related papers (2023-11-30T16:15:46Z)
- GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z)
- Learning-Based Keypoint Registration for Fetoscopic Mosaicking [65.02392513942533]
In Twin-to-Twin Transfusion Syndrome (TTTS), abnormal vascular anastomoses in the monochorionic placenta can produce uneven blood flow between the two fetuses.
We propose a learning-based framework for in-vivo fetoscopy frame registration for field-of-view expansion.
arXiv Detail & Related papers (2022-07-26T21:21:12Z)
- A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction [50.1787181309337]
We propose a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling.
The proposed algorithm has the potential to achieve real-time i-MRI for DBS and can be used for general purpose MR-guided intervention.
arXiv Detail & Related papers (2022-03-28T14:03:45Z)
- Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer [57.18185972461453]
For the first time in surgical workflow analysis, we introduce a Transformer to reconsider the ignored complementary effects of spatial and temporal features for accurate phase recognition.
Our framework is lightweight and processes the hybrid embeddings in parallel to achieve a high inference speed.
arXiv Detail & Related papers (2021-03-17T15:12:55Z)
- TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks [43.95869213955351]
We propose a Multi-Stage Temporal Convolutional Network (MS-TCN) that performs hierarchical prediction refinement for surgical phase recognition.
Our method is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos with and without the use of additional surgical tool information.
arXiv Detail & Related papers (2020-03-24T10:12:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.