Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation
- URL: http://arxiv.org/abs/2508.01713v1
- Date: Sun, 03 Aug 2025 10:47:01 GMT
- Title: Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation
- Authors: Julia Hindel, Ema Mekic, Enamundram Naga Karthik, Rohit Mohan, Daniele Cattaneo, Maria Kalweit, Abhinav Valada,
- Abstract summary: Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes without training on previous data.<n>We introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark.
- Score: 11.59416791598718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robot-assisted surgeries rely on accurate and real-time scene understanding to safely guide surgical instruments. However, segmentation models trained on static datasets face key limitations when deployed in these dynamic and evolving surgical environments. Class-incremental semantic segmentation (CISS) allows models to continually adapt to new classes while avoiding catastrophic forgetting of prior knowledge, without training on previous data. In this work, we build upon the recently introduced Taxonomy-Oriented Poincar\'e-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. Concretely, we incorporate the Dice loss into the hierarchical loss formulation to handle strong class imbalances, introduce hierarchical pseudo-labeling, and design tailored label taxonomies for robotic surgery environments. We also propose six novel CISS benchmarks designed for robotic surgery environments including multiple incremental steps and several semantic categories to emulate realistic class-incremental settings in surgical environments. In addition, we introduce a refined set of labels with more than 144 classes on the Syn-Mediverse synthetic dataset, hosted online as an evaluation benchmark. We make the code and trained models publicly available at http://topics.cs.uni-freiburg.de.
Related papers
- Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance [50.486523249499115]
Real-time video understanding is critical to guide procedures in minimally invasive surgery (MIS)<n>We propose Compress-to-Explore (C2E), a novel self-supervised framework to learn compact, informative representations from surgical videos.<n>C2E uses entropy-maximizing decoders to compress images while preserving clinically relevant details, improving encoder performance without labeled data.
arXiv Detail & Related papers (2025-05-16T14:02:24Z) - AMNCutter: Affinity-Attention-Guided Multi-View Normalized Cutter for Unsupervised Surgical Instrument Segmentation [7.594796294925481]
We propose a label-free unsupervised model featuring a novel module named Multi-View Normalized Cutter (m-NCutter)
Our model is trained using a graph-cutting loss function that leverages patch affinities for supervision, eliminating the need for pseudo-labels.
We conduct comprehensive experiments across multiple SIS datasets to validate our approach's state-of-the-art (SOTA) performance, robustness, and exceptional potential as a pre-trained model.
arXiv Detail & Related papers (2024-11-06T06:33:55Z) - Low-Rank Continual Pyramid Vision Transformer: Incrementally Segment Whole-Body Organs in CT with Light-Weighted Adaptation [10.746776960260297]
We propose a new continual whole-body organ segmentation model with light-weighted low-rank adaptation (LoRA)
We first train and freeze a pyramid vision transformer (PVT) base segmentation model on the initial task, then continually add light-weighted trainable LoRA parameters to the frozen model for each new learning task.
Our proposed model continually segments new organs without catastrophic forgetting and meanwhile maintaining a low parameter increasing rate.
arXiv Detail & Related papers (2024-10-07T02:00:13Z) - ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning [54.68180752416519]
Panoptic segmentation is a cutting-edge computer vision task.
We introduce a novel and efficient method for continual panoptic segmentation based on Visual Prompt Tuning, dubbed ECLIPSE.
Our approach involves freezing the base model parameters and fine-tuning only a small set of prompt embeddings, addressing both catastrophic forgetting and plasticity.
arXiv Detail & Related papers (2024-03-29T11:31:12Z) - Continual Learning for Abdominal Multi-Organ and Tumor Segmentation [15.983529525062938]
We propose an innovative architecture designed specifically for continuous organ and tumor segmentation.
Our proposed design involves replacing the conventional output layer with a suite of lightweight, class-specific heads.
These heads enable independent predictions for newly introduced and previously learned classes, effectively minimizing the impact of new classes on old ones.
arXiv Detail & Related papers (2023-06-01T17:59:57Z) - Dynamic Conceptional Contrastive Learning for Generalized Category
Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective way for GCD is applying self-supervised learning to learn discriminate representation for unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z) - From Forks to Forceps: A New Framework for Instance Segmentation of
Surgical Instruments [6.677634562400846]
Minimally invasive surgeries and related applications demand surgical tool classification and segmentation at the instance level.
Our research demonstrates that while the bounding box and segmentation mask are often accurate, the classification head mis-classifies the class label of the surgical instrument.
We present a new neural network framework that adds a classification module as a new stage to existing instance segmentation models.
arXiv Detail & Related papers (2022-11-26T21:26:42Z) - TraSeTR: Track-to-Segment Transformer with Contrastive Query for
Instance-level Instrument Segmentation in Robotic Surgery [60.439434751619736]
We propose TraSeTR, a Track-to-Segment Transformer that exploits tracking cues to assist surgical instrument segmentation.
TraSeTR jointly reasons about the instrument type, location, and identity with instance-level predictions.
The effectiveness of our method is demonstrated with state-of-the-art instrument type segmentation results on three public datasets.
arXiv Detail & Related papers (2022-02-17T05:52:18Z) - Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z) - Co-Generation and Segmentation for Generalized Surgical Instrument
Segmentation on Unlabelled Data [49.419268399590045]
Surgical instrument segmentation for robot-assisted surgery is needed for accurate instrument tracking and augmented reality overlays.
Deep learning-based methods have shown state-of-the-art performance for surgical instrument segmentation, but their results depend on labelled data.
In this paper, we demonstrate the limited generalizability of these methods on different datasets, including human robot-assisted surgeries.
arXiv Detail & Related papers (2021-03-16T18:41:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.