Simulation-to-Real domain adaptation with teacher-student learning for
  endoscopic instrument segmentation
- URL: http://arxiv.org/abs/2103.01593v1
- Date: Tue, 2 Mar 2021 09:30:28 GMT
- Title: Simulation-to-Real domain adaptation with teacher-student learning for
  endoscopic instrument segmentation
- Authors: Manish Sahu, Anirban Mukhopadhyay, Stefan Zachow
- Abstract summary: We introduce a teacher-student learning approach that learns jointly from annotated simulation data and unlabeled real data.
 Empirical results on three datasets highlight the effectiveness of the proposed framework.
- Score: 1.1047993346634768
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract:   Purpose: Segmentation of surgical instruments in endoscopic videos is
essential for automated surgical scene understanding and process modeling.
However, relying on fully supervised deep learning for this task is challenging
because manual annotation occupies valuable time of the clinical experts.
  Methods: We introduce a teacher-student learning approach that learns jointly
from annotated simulation data and unlabeled real data to tackle the erroneous
learning problem of the current consistency-based unsupervised domain
adaptation framework.
  Results: Empirical results on three datasets highlight the effectiveness of
the proposed framework over current approaches for the endoscopic instrument
segmentation task. Additionally, we provide analysis of major factors affecting
the performance on all datasets to highlight the strengths and failure modes of
our approach.
  Conclusion: We show that our proposed approach can successfully exploit the
unlabeled real endoscopic video frames and improve generalization performance
over pure simulation-based training and the previous state-of-the-art. This
takes us one step closer to effective segmentation of surgical tools in the
annotation-scarce setting.
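
The abstract describes joint learning from annotated simulation data and unlabeled real data with a teacher-student pair. As a rough illustration only, the sketch below shows one common way such a scheme is set up (a mean-teacher style exponential moving average plus a consistency term). Everything in it is an assumption for illustration: the function names, the `decay` and `weight` parameters, and the choice of binary cross-entropy and mean squared error are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical sketch of a generic teacher-student objective.
# Not the authors' implementation; all names and loss choices are assumptions.

def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the student weight (exponential moving average)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean supervised loss on labeled (simulated) pixels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions on unlabeled (real) pixels."""
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)

def joint_loss(sim_probs, sim_labels, real_student_probs, real_teacher_probs, weight=1.0):
    """Supervised term on simulation data plus weighted consistency term on real data."""
    supervised = binary_cross_entropy(sim_probs, sim_labels)
    unsupervised = consistency_loss(real_student_probs, real_teacher_probs)
    return supervised + weight * unsupervised
```

In a setup like this, only the student would be updated by gradient descent on `joint_loss`, while the teacher would be refreshed with `ema_update` after each step; whether the paper follows this exact recipe is not stated in the abstract.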

Related papers
- Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance [50.486523249499115]
 Real-time video understanding is critical to guide procedures in minimally invasive surgery (MIS).
We propose Compress-to-Explore (C2E), a novel self-supervised framework to learn compact, informative representations from surgical videos.
C2E uses entropy-maximizing decoders to compress images while preserving clinically relevant details, improving encoder performance without labeled data.
 arXiv  Detail & Related papers  (2025-05-16T14:02:24Z)
- Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections [35.585690280385826]
 We adapt the Masked Autoencoder for federated learning, enhancing Sharpness-Aware Minimization (FedSAM) and Weight Averaging.
Our findings demonstrate that integrating FedSAM into the federated MAE approach improves pretraining, leading to a reduction in reconstruction loss per patch.
These findings highlight the potential of federated learning for privacy-preserving training of surgical foundation models.
 arXiv  Detail & Related papers  (2025-04-23T10:54:32Z)
- Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification [8.975676404678374]
 We present a strategy for improving the performance and generalization capabilities of models trained in low-data regimes.
The proposed method starts with a pre-training phase, where features learned in a self-supervised learning setting are disentangled to improve the robustness of the representations for downstream tasks.
We then introduce a meta-fine-tuning step, leveraging related classes between meta-training and meta-testing phases but varying the level.
 arXiv  Detail & Related papers  (2024-03-26T09:36:20Z)
- Pixel-Wise Recognition for Holistic Surgical Scene Understanding [31.338288460529046]
 This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset.
GraSP is a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity.
We introduce the Transformers for Actions, Phases, Steps, and Instrument (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals.
 arXiv  Detail & Related papers  (2024-01-20T09:09:52Z)
- Jumpstarting Surgical Computer Vision [2.7396997668655163]
 We employ self-supervised learning to flexibly leverage diverse surgical datasets.
We study phase recognition and critical view of safety in laparoscopic cholecystectomy and laparoscopic hysterectomy.
The composition of pre-training datasets can severely affect the effectiveness of SSL methods for various downstream tasks.
 arXiv  Detail & Related papers  (2023-12-10T18:54:16Z)
- Self-trained Panoptic Segmentation [0.0]
 Panoptic segmentation is an important computer vision task which combines semantic and instance segmentation.
Recent advancements in self-supervised learning approaches have shown great potential in leveraging synthetic and unlabelled data to generate pseudo-labels.
The aim of this work is to develop a framework to perform embedding-based self-supervised panoptic segmentation using self-training in a synthetic-to-real domain adaptation problem setting.
 arXiv  Detail & Related papers  (2023-11-17T17:06:59Z)
- Adaptive Semi-Supervised Segmentation of Brain Vessels with Ambiguous
  Labels [63.415444378608214]
 Our approach incorporates innovative techniques including progressive semi-supervised learning, adaptative training strategy, and boundary enhancement.
 Experimental results on 3DRA datasets demonstrate the superiority of our method in terms of mesh-based segmentation metrics.
 arXiv  Detail & Related papers  (2023-08-07T14:16:52Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
 Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
 arXiv  Detail & Related papers  (2022-07-01T14:17:11Z)
- Rethinking Surgical Instrument Segmentation: A Background Image Can Be
  All You Need [18.830738606514736]
 Data scarcity and imbalance have heavily affected the model accuracy and limited the design and deployment of deep learning-based surgical applications.
We propose a one-to-many data generation solution that gets rid of the complicated and expensive process of data collection and annotation from robotic surgery.
Our empirical analysis suggests that without the high cost of data collection and annotation, we can achieve decent surgical instrument segmentation performance.
 arXiv  Detail & Related papers  (2022-06-23T16:22:56Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of
  Surgical Phases [57.90226879210227]
 FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
 arXiv  Detail & Related papers  (2022-03-14T17:44:53Z)
- Real-time landmark detection for precise endoscopic submucosal
  dissection via shape-aware relation network [51.44506007844284]
 We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
 arXiv  Detail & Related papers  (2021-11-08T07:57:30Z)
- Endo-Sim2Real: Consistency learning-based domain adaptation for
  instrument segmentation [1.086731011437779]
 Surgical tool segmentation in endoscopic videos is an important component of computer assisted interventions systems.
Recent success of image-based solutions using fully-supervised deep learning approaches can be attributed to the collection of big labeled datasets.
Computer simulations could alleviate the manual labeling problem; however, models trained on simulated data do not generalize to real data.
This work proposes a consistency-based framework for joint learning of simulated and real (unlabeled) endoscopic data.
 arXiv  Detail & Related papers  (2020-07-22T16:18:11Z)
- Towards Unsupervised Learning for Instrument Segmentation in Robotic
  Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
 We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows training image segmentation models without the need to acquire expensive annotations.
We test our proposed method on Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
 arXiv  Detail & Related papers  (2020-07-09T01:39:39Z)
- Automatic Data Augmentation via Deep Reinforcement Learning for
  Effective Kidney Tumor Segmentation [57.78765460295249]
 We develop a novel automatic learning-based data augmentation method for medical image segmentation.
In our method, we innovatively combine the data augmentation module and the subsequent segmentation module in an end-to-end training manner with a consistent loss.
We extensively evaluated our method on CT kidney tumor segmentation which validated the promising results of our method.
 arXiv  Detail & Related papers  (2020-02-22T14:10:13Z)