Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy
- URL: http://arxiv.org/abs/2511.04525v1
- Date: Thu, 06 Nov 2025 16:39:55 GMT
- Title: Learning from Single Timestamps: Complexity Estimation in Laparoscopic Cholecystectomy
- Authors: Dimitrios Anastasiou, Santiago Barbarisi, Lucy Culshaw, Jayna Patel, Evangelos B. Mazomenos, Imanol Luengo, Danail Stoyanov
- Abstract summary: We introduce STC-Net, a novel framework for SingleTimestamp-based Complexity estimation in Laparoscopic Cholecystectomy (LC) videos. It operates directly on full videos under weak temporal supervision. It achieves an accuracy of 62.11% and an F1-score of 61.42%, outperforming non-localized baselines.
- Score: 8.637329291879162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: Accurate assessment of surgical complexity is essential in Laparoscopic Cholecystectomy (LC), where severe inflammation is associated with longer operative times and increased risk of postoperative complications. The Parkland Grading Scale (PGS) provides a clinically validated framework for stratifying inflammation severity; however, its automation in surgical videos remains largely unexplored, particularly in realistic scenarios where complete videos must be analyzed without prior manual curation. Methods: In this work, we introduce STC-Net, a novel framework for SingleTimestamp-based Complexity estimation in LC via the PGS, designed to operate under weak temporal supervision. Unlike prior methods limited to static images or manually trimmed clips, STC-Net operates directly on full videos. It jointly performs temporal localization and grading through a localization, window proposal, and grading module. We introduce a novel loss formulation combining hard and soft localization objectives and background-aware grading supervision. Results: Evaluated on a private dataset of 1,859 LC videos, STC-Net achieves an accuracy of 62.11% and an F1-score of 61.42%, outperforming non-localized baselines by over 10% in both metrics and highlighting the effectiveness of weak supervision for surgical complexity assessment. Conclusion: STC-Net demonstrates a scalable and effective approach for automated PGS-based surgical complexity estimation from full LC videos, making it promising for post-operative analysis and surgical training.
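The abstract mentions a loss that combines hard and soft localization objectives learned from a single annotated timestamp, but it does not give the formula. The sketch below is only one plausible reading, not the authors' formulation: it assumes the hard objective is cross-entropy against a one-hot target at the annotated time step, the soft objective is cross-entropy against a Gaussian-shaped target centred on that step, and the two are mixed with a weight `alpha`. The function names, `sigma`, and `alpha` are all hypothetical.

```python
import math


def hard_target(num_steps, t_star):
    """One-hot distribution over time steps, peaked at the annotated timestamp."""
    return [1.0 if t == t_star else 0.0 for t in range(num_steps)]


def soft_target(num_steps, t_star, sigma=2.0):
    """Normalized Gaussian-shaped distribution centred on t_star (assumed form)."""
    weights = [math.exp(-0.5 * ((t - t_star) / sigma) ** 2) for t in range(num_steps)]
    total = sum(weights)
    return [w / total for w in weights]


def softmax(scores):
    """Numerically stable softmax over per-time-step localization scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(p_t * math.log(p + eps) for p_t, p in zip(target, probs))


def localization_loss(scores, t_star, alpha=0.5, sigma=2.0):
    """Weighted sum of hard and soft objectives (hypothetical combination)."""
    probs = softmax(scores)
    hard = cross_entropy(hard_target(len(scores), t_star), probs)
    soft = cross_entropy(soft_target(len(scores), t_star, sigma), probs)
    return alpha * hard + (1.0 - alpha) * soft


# Scores peaked at the annotated step should incur a lower loss than
# scores peaked far away from it.
peaked_at_label = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0]
peaked_elsewhere = [5.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(localization_loss(peaked_at_label, t_star=3))
print(localization_loss(peaked_elsewhere, t_star=3))
```

Under these assumptions, the soft term tolerates predictions that land near the annotated timestamp, while the hard term still rewards hitting it exactly. The grading module and background-aware supervision described in the abstract are not modelled here.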
Related papers
- Detection-Gated Glottal Segmentation with Zero-Shot Cross-Dataset Transfer and Clinical Feature Extraction [0.0]
We propose a detection-gated pipeline that integrates a YOLOv8-based detector with a U-Net segmenter. The model was trained on a limited subset of the GIRAFE dataset (600 frames) and evaluated via zero-shot transfer on the large-scale BAGLS dataset.
arXiv Detail & Related papers (2026-03-02T17:05:41Z)
- Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance [50.486523249499115]
Real-time video understanding is critical to guide procedures in minimally invasive surgery (MIS). We propose Compress-to-Explore (C2E), a novel self-supervised framework to learn compact, informative representations from surgical videos. C2E uses entropy-maximizing decoders to compress images while preserving clinically relevant details, improving encoder performance without labeled data.
arXiv Detail & Related papers (2025-05-16T14:02:24Z)
- Surgeons vs. Computer Vision: A comparative analysis on surgical phase recognition capabilities [65.66373425605278]
Automated Surgical Phase Recognition (SPR) uses Artificial Intelligence (AI) to segment the surgical workflow into its key events. Previous research has focused on short and linear surgical procedures and has not explored whether temporal context influences experts' ability to better classify surgical phases. This research addresses these gaps, focusing on Robot-Assisted Partial Nephrectomy (RAPN) as a highly non-linear procedure.
arXiv Detail & Related papers (2025-04-26T15:37:22Z)
- Revisiting the Evaluation Bias Introduced by Frame Sampling Strategies in Surgical Video Segmentation Using SAM2 [1.0536099636804035]
We investigate how inconsistencies in annotation density and frame rate sampling influence the evaluation of zero-shot segmentation models. We find that lower frame rates can appear to outperform higher ones due to a smoothing effect that conceals temporal inconsistencies. When assessed under real-time streaming conditions, higher frame rates yield superior segmentation stability.
arXiv Detail & Related papers (2025-02-28T10:42:09Z)
- Early Operative Difficulty Assessment in Laparoscopic Cholecystectomy via Snapshot-Centric Video Analysis [3.104121871683839]
We propose the clinical task of early LCOD assessment using limited video observations. We design SurgPrOD, a deep learning model to assess LCOD by analyzing features from global and local temporal resolutions. We introduce the CholeScore dataset, featuring video-level LCOD labels to validate our method.
arXiv Detail & Related papers (2025-02-10T20:14:01Z)
- SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba [6.531066045206769]
We present SPRMamba, a novel framework for real-time surgical phase recognition. It integrates a Mamba architecture with a Scaled Residual TranMamba block to synergize temporal modeling and localized detail extraction. It achieves state-of-the-art performance (87.64% accuracy on ESD385, +1.0% over prior methods), demonstrating robust generalizability across surgical procedures.
arXiv Detail & Related papers (2024-09-18T16:26:56Z)
- Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy [51.240181118593114]
Cholecystectomy (gallbladder removal) is one of the most common procedures in the US, with more than 1.2M procedures annually.
LC is associated with an increase in bile duct injuries (BDIs), resulting in significant morbidity and mortality.
In this paper, we develop deep-learning techniques to automate the assessment of critical view of safety (CVS) in LCs.
arXiv Detail & Related papers (2023-09-13T22:01:36Z)
- Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer [14.568834378003707]
Phacoemulsification cataract surgery (PCS) is a routine procedure using a surgical microscope.
PCS guidance systems extract valuable information from surgical microscopic videos to enhance proficiency.
Existing PCS guidance systems suffer from non-phase-specific guidance, leading to redundant visual information.
We propose a novel phase-specific augmented reality (AR) guidance system, which offers tailored AR information corresponding to the recognized surgical phase.
arXiv Detail & Related papers (2023-09-11T02:56:56Z)
- LoViT: Long Video Transformer for Surgical Phase Recognition [59.06812739441785]
We present a two-stage method, called Long Video Transformer (LoViT), for fusing short- and long-term temporal information.
Our approach consistently outperforms state-of-the-art methods on the Cholec80 and AutoLaparo datasets.
arXiv Detail & Related papers (2023-05-15T20:06:14Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture long-range temporal dependencies.
We validate our approach on a large surgical video dataset (Cholec80) by performing a surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.