Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
- URL: http://arxiv.org/abs/2207.00449v3
- Date: Wed, 31 May 2023 09:08:11 GMT
- Title: Dissecting Self-Supervised Learning Methods for Surgical Computer Vision
- Authors: Sanat Ramesh, Vinkle Srivastav, Deepak Alapatt, Tong Yu, Aditya
Murali, Luca Sestini, Chinedu Innocent Nwoye, Idris Hamoud, Saurav Sharma,
Antoine Fleurentin, Georgios Exarchakis, Alexandros Karargyris, Nicolas Padoy
- Abstract summary: Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection.
- Score: 51.370873913181605
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The field of surgical computer vision has undergone considerable
breakthroughs in recent years with the rising popularity of deep neural
network-based methods. However, standard fully-supervised approaches for
training such models require vast amounts of annotated data, imposing a
prohibitively high cost, especially in the clinical domain. Self-Supervised
Learning (SSL) methods, which have begun to gain traction in the general
computer vision community, represent a potential solution to these annotation
costs, allowing useful representations to be learned from unlabeled data alone.
Still, the effectiveness of SSL methods in more complex and impactful domains,
such as medicine and surgery, remains limited and unexplored. In this work, we
address this critical need by investigating four state-of-the-art SSL methods
(MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We
present an extensive analysis of the performance of these methods on the
Cholec80 dataset for two fundamental and popular tasks in surgical context
understanding: phase recognition and tool presence detection. We examine their
parameterization, then their behavior with respect to training data quantities
in semi-supervised settings. Correct transfer of these methods to surgery, as
described and conducted in this work, leads to substantial performance gains
over generic uses of SSL (up to 7.4% on phase recognition and 20% on tool
presence detection) and over state-of-the-art semi-supervised phase
recognition approaches (by up to 14%). Further results obtained on a highly
diverse selection of surgical datasets exhibit strong generalization
properties. The code is available at
https://github.com/CAMMA-public/SelfSupSurg.
Related papers
- Jumpstarting Surgical Computer Vision [2.7396997668655163]
We employ self-supervised learning to flexibly leverage diverse surgical datasets.
We study phase recognition and critical view of safety in laparoscopic cholecystectomy and laparoscopic hysterectomy.
The composition of pre-training datasets can severely affect the effectiveness of SSL methods for various downstream tasks.
arXiv Detail & Related papers (2023-12-10T18:54:16Z)
- Shifting to Machine Supervision: Annotation-Efficient Semi and Self-Supervised Learning for Automatic Medical Image Segmentation and Classification [9.67209046726903]
We introduce the S4MI pipeline, a novel approach that leverages advancements in self-supervised and semi-supervised learning.
Our study benchmarks these techniques on three distinct medical imaging datasets to evaluate their effectiveness in classification and segmentation tasks.
Remarkably, the semi-supervised approach demonstrated superior outcomes in segmentation, outperforming fully-supervised methods while using 50% fewer labels across all datasets.
arXiv Detail & Related papers (2023-11-17T04:04:29Z)
- Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data [2.566694420723775]
We propose a systematic methodology for developing robust models for surgical tool detection using noisy data.
Our methodology introduces two key innovations: (1) an intelligent active learning strategy for minimal dataset identification and label correction by human experts; and (2) an assembling strategy for a student-teacher model-based self-training framework.
The proposed methodology achieves an average F1-score of 85.88% for the ensemble model-based self-training with class weights, and 80.88% without class weights for noisy labels.
arXiv Detail & Related papers (2023-07-03T08:12:56Z)
- Benchmarking Self-Supervised Learning on Diverse Pathology Datasets [10.868779327544688]
Self-supervised learning has been shown to be an effective method for utilizing unlabeled data.
We execute the largest-scale study of SSL pre-training on pathology image data.
For the first time, we apply SSL to the challenging task of nuclei instance segmentation.
arXiv Detail & Related papers (2022-12-09T06:38:34Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a global shortage of medical resources.
CNNs have been widely used and validated for analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
- Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest.
Clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered opaque and difficult to comprehend.
We propose a novel decision explanation scheme based on CycleGAN activation which generates high-quality visualizations of classifier decisions even in smaller data sets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)
- Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture long-range temporal dependencies.
We validate our approach on a large surgical video dataset (Cholec80) on the surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)