"Train one, Classify one, Teach one" -- Cross-surgery transfer learning
for surgical step recognition
- URL: http://arxiv.org/abs/2102.12308v1
- Date: Wed, 24 Feb 2021 14:36:18 GMT
- Title: "Train one, Classify one, Teach one" -- Cross-surgery transfer learning
for surgical step recognition
- Authors: Daniel Neimark, Omri Bar, Maya Zohar, Gregory D. Hager, Dotan
Asselmann
- Abstract summary: We analyze, for the first time, surgical step recognition on four different laparoscopic surgeries.
We introduce a new architecture, the Time-Series Adaptation Network (TSAN), an architecture optimized for transfer learning of surgical step recognition.
Our proposed architecture leads to better performance compared to other possible architectures, reaching over 90% accuracy when transferring from laparoscopic Cholecystectomy to the other three procedure types.
- Score: 14.635480748841317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prior work demonstrated the ability of machine learning to automatically
recognize surgical workflow steps from videos. However, these studies focused
on only a single type of procedure. In this work, we analyze, for the first
time, surgical step recognition on four different laparoscopic surgeries:
Cholecystectomy, Right Hemicolectomy, Sleeve Gastrectomy, and Appendectomy.
Inspired by the traditional apprenticeship model, in which surgical training is
based on the Halstedian method, we paraphrase the "see one, do one, teach one"
approach for the surgical intelligence domain as "train one, classify one,
teach one". In machine learning, this approach is often referred to as transfer
learning. To analyze the impact of transfer learning across different
laparoscopic procedures, we explore various time-series architectures and
examine their performance on each target domain. We introduce a new
architecture, the Time-Series Adaptation Network (TSAN), an architecture
optimized for transfer learning of surgical step recognition, and we show how
TSAN can be pre-trained using self-supervised learning on a Sequence Sorting
task. Such pre-training enables TSAN to learn workflow steps of a new
laparoscopic procedure type from only a small number of labeled samples from
the target procedure. Our proposed architecture leads to better performance
compared to other possible architectures, reaching over 90% accuracy when
transferring from laparoscopic Cholecystectomy to the other three procedure
types.
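The Sequence Sorting pretext task described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's implementation: the segment count, the permutation-index labeling, and the use of raw feature lists are all choices made here for clarity.

```python
import itertools
import random

def make_sorting_sample(features, n_segments=4, seed=None):
    """Build one self-supervised Sequence Sorting training sample.

    Split a video's frame-level features into contiguous segments,
    shuffle the segments, and return the shuffled sequence together
    with the index of the applied permutation, which serves as the
    classification target during pre-training.
    """
    rng = random.Random(seed)
    seg_len = len(features) // n_segments
    segments = [features[i * seg_len:(i + 1) * seg_len]
                for i in range(n_segments)]
    perms = list(itertools.permutations(range(n_segments)))
    label = rng.randrange(len(perms))          # target class: one of n_segments!
    shuffled = [segments[i] for i in perms[label]]
    return shuffled, label
```

A model pre-trained to predict `label` from `shuffled` must learn the temporal ordering of surgical workflow, which is what makes the representation transferable to step recognition with few labels.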
Related papers
- A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire
Navigation [0.0]
The treatment of cardiovascular diseases requires complex and challenging navigation of a guidewire and catheter.
This often leads to lengthy interventions during which the patient and clinician are exposed to X-ray radiation.
Deep Reinforcement Learning approaches have shown promise in learning this task and may be the key to automating catheter navigation during robotized interventions.
arXiv Detail & Related papers (2024-03-05T08:46:54Z)
- Toward a Surgeon-in-the-Loop Ophthalmic Robotic Apprentice using Reinforcement and Imitation Learning [18.72371138886818]

We propose an image-guided approach for surgeon-centered autonomous agents during ophthalmic cataract surgery.
By integrating the surgeon's actions and preferences into the training process, our approach enables the robot to implicitly learn and adapt to the individual surgeon's unique techniques.
arXiv Detail & Related papers (2023-11-29T15:00:06Z)
- Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries [29.201385352740555]
We propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip given various surgical procedures.
Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both image and kinematics.
A cross-modal contrastive loss is designed to incorporate robust geometric prior from kinematics to image for tip segmentation.
arXiv Detail & Related papers (2023-09-02T14:52:58Z)
- GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
- Surgical Phase Recognition in Laparoscopic Cholecystectomy [57.929132269036245]
We propose a Transformer-based method that utilizes calibrated confidence scores for a 2-stage inference pipeline.
Our method outperforms the baseline model on the Cholec80 dataset, and can be applied to a variety of action segmentation methods.
arXiv Detail & Related papers (2022-06-14T22:55:31Z)
- Quantification of Robotic Surgeries with Vision-Based Deep Learning [45.165919577877695]
We propose a unified deep learning framework, entitled Roboformer, which operates exclusively on videos recorded during surgery.
We validated our framework on four video-based datasets of two commonly-encountered types of steps within minimally-invasive robotic surgeries.
arXiv Detail & Related papers (2022-05-06T06:08:35Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
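The transfer-learning recipe running through several of the papers above (reuse a pre-trained encoder, then fit a small classification head on the scarce labeled samples available in the target domain) can be sketched as a linear probe. The `linear_probe` helper below is a generic softmax-regression head trained with gradient descent; it is an illustrative assumption, not the method of any listed paper.

```python
import numpy as np

def linear_probe(features, labels, n_classes, lr=0.1, epochs=200, seed=0):
    """Train a softmax classification head on frozen encoder features.

    `features` is an (n_samples, dim) array of outputs from a
    pre-trained, frozen encoder; only the head weights W, b are fit,
    mirroring transfer learning with few target-domain labels.
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    W = rng.normal(scale=0.01, size=(d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # cross-entropy gradient
        W -= lr * (features.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(features, W, b):
    """Assign each sample the class with the highest head score."""
    return (features @ W + b).argmax(axis=1)
```

Replacing the frozen encoder with one pre-trained on a pretext task (such as sequence sorting) is what lets the head learn a new procedure type from only a handful of labeled clips.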
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.