Learning Invariant Representation of Tasks for Robust Surgical State Estimation
- URL: http://arxiv.org/abs/2102.09119v1
- Date: Thu, 18 Feb 2021 02:32:50 GMT
- Title: Learning Invariant Representation of Tasks for Robust Surgical State Estimation
- Authors: Yidan Qin, Max Allan, Yisong Yue, Joel W. Burdick, Mahdi Azizian
- Abstract summary: We propose StiseNet, a Surgical Task Invariance State Estimation Network.
StiseNet minimizes the effects of variations in surgical technique and operating environments inherent to RAS datasets.
It is shown to outperform state-of-the-art state estimation methods on three datasets.
- Score: 39.515036686428836
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Surgical state estimators in robot-assisted surgery (RAS) - especially those
trained via learning techniques - rely heavily on datasets that capture surgeon
actions in laboratory or real-world surgical tasks. Real-world RAS datasets are
costly to acquire, are obtained from multiple surgeons who may use different
surgical strategies, and are recorded under uncontrolled conditions in highly
complex environments. The combination of high diversity and limited data calls
for new learning methods that are robust and invariant to operating conditions
and surgical techniques. We propose StiseNet, a Surgical Task Invariance State
Estimation Network with an invariance induction framework that minimizes the
effects of variations in surgical technique and operating environments inherent
to RAS datasets. StiseNet's adversarial architecture learns to separate
nuisance factors from information needed for surgical state estimation.
StiseNet is shown to outperform state-of-the-art state estimation methods on
three datasets (including a new real-world RAS dataset: HERNIA-20).
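The adversarial separation described in the abstract can be pictured with a short sketch. The following PyTorch fragment is an illustration only, assuming a gradient-reversal adversary over a task-relevant embedding; the module names, sizes, and the choice of nuisance label (e.g., surgeon or technique identity) are placeholders, not StiseNet's published architecture.

```python
# Sketch of adversarial invariance induction for surgical state estimation,
# in the spirit of StiseNet's split into task-relevant and nuisance
# representations. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign so the encoder
    learns to *remove* nuisance information the discriminator exploits."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class StiseSketch(nn.Module):
    def __init__(self, in_dim=32, hid=64, n_states=10, n_nuisance=5):
        super().__init__()
        self.encoder = nn.GRU(in_dim, hid, batch_first=True)
        self.task_head = nn.Linear(hid, hid // 2)       # task-relevant embedding
        self.nuis_head = nn.Linear(hid, hid // 2)       # nuisance embedding
        self.state_clf = nn.Linear(hid // 2, n_states)  # surgical state estimate
        self.adv_clf = nn.Linear(hid // 2, n_nuisance)  # e.g., surgeon/technique id

    def forward(self, x, lam=1.0):
        h, _ = self.encoder(x)                          # (B, T, hid)
        e1, e2 = self.task_head(h), self.nuis_head(h)
        state_logits = self.state_clf(e1)               # main objective
        # The adversary sees e1 through gradient reversal: it learns to
        # recover the nuisance label while the encoder learns to hide it.
        adv_logits = self.adv_clf(GradReverse.apply(e1, lam))
        return state_logits, adv_logits

model = StiseSketch()
x = torch.randn(2, 100, 32)            # a batch of kinematics windows
state_logits, adv_logits = model(x)    # train with per-frame CE + adversarial CE
```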
Related papers
- Realistic Data Generation for 6D Pose Estimation of Surgical Instruments [4.226502078427161]
6D pose estimation of surgical instruments is critical to enable the automatic execution of surgical maneuvers.
In household and industrial settings, synthetic data generated with 3D computer graphics software has been shown to be a viable alternative that minimizes annotation costs.
We propose an improved simulation environment for surgical robotics that enables the automatic generation of large and diverse datasets.
arXiv Detail & Related papers (2024-06-11T14:59:29Z)
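As a rough illustration of the synthetic-data idea (not the paper's simulator or rendering pipeline), randomized 6D poses can be sampled and paired with free ground-truth labels; the workspace and depth ranges below are invented.

```python
# Illustrative domain-randomized 6D pose sampling for synthetic training
# data; ranges and fields are assumptions, not the paper's settings.
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)

def sample_pose(workspace=0.1, depth=(0.05, 0.25)):
    """Draw a random instrument pose: a uniform rotation plus a translation
    inside a camera-facing workspace (meters)."""
    r = Rotation.random()                          # uniform over SO(3)
    t = np.array([rng.uniform(-workspace, workspace),
                  rng.uniform(-workspace, workspace),
                  rng.uniform(*depth)])
    return {"R": r.as_matrix(), "t": t}

# Each sample would be rendered and saved with its ground-truth pose,
# sidestepping manual 6D annotation entirely.
dataset = [sample_pose() for _ in range(1000)]
```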
- OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery [13.843251369739908]
We introduce an innovative Open-Set Surgical Activity Recognition (OSSAR) framework.
Our solution leverages the hyperspherical reciprocal point strategy to enhance the distinction between known and unknown classes in the feature space.
To support our assertions, we establish an open-set surgical activity benchmark utilizing the public JIGSAWS dataset.
arXiv Detail & Related papers (2024-02-10T16:23:12Z)
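The hyperspherical reciprocal-point strategy can be sketched as follows. This assumes the generic reciprocal-point formulation (one learnable "otherness" point per known class, with distance-based logits on the unit sphere); the temperature and unknown-threshold values are placeholders rather than OSSAR's published settings.

```python
# Compact sketch of reciprocal-point open-set classification on the unit
# hypersphere. Feature extractor, temperature, and threshold are assumed.
import torch
import torch.nn.functional as F

class ReciprocalPointHead(torch.nn.Module):
    def __init__(self, feat_dim=128, n_known=8):
        super().__init__()
        # One learnable reciprocal point per known class.
        self.points = torch.nn.Parameter(torch.randn(n_known, feat_dim))

    def forward(self, feats, temp=10.0):
        z = F.normalize(feats, dim=-1)      # hyperspherical embedding
        p = F.normalize(self.points, dim=-1)
        # A sample scores high for class k when it lies FAR from that
        # class's reciprocal ("otherness") point.
        dist = torch.cdist(z, p) ** 2       # (B, n_known)
        return temp * dist                  # distance-based logits

head = ReciprocalPointHead()
logits = head(torch.randn(4, 128))
probs = logits.softmax(-1)
# Open-set rule: a sample near ALL reciprocal points has a small max logit
# and is flagged unknown. The threshold here is a placeholder.
unknown = logits.max(-1).values < 15.0
```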
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robot-Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos [54.1843419649895]
State-of-the-art models that perform instrument classification well on a particular dataset perform poorly when tested on another dataset.
We propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method called the Barlow Adaptor.
In addition, we introduce a novel loss called the Barlow Feature Alignment Loss (BFAL) which aligns features across different domains.
arXiv Detail & Related papers (2023-07-31T18:14:18Z)
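The abstract does not define BFAL, but a plausible reading, assuming it follows the Barlow Twins recipe it is named after, is a cross-correlation alignment between per-domain feature batches pushed toward the identity matrix:

```python
# Hedged sketch of a Barlow Twins-style cross-domain alignment loss; the
# paper's exact BFAL formulation may differ.
import torch

def barlow_alignment_loss(f_src, f_tgt, lam=5e-3, eps=1e-5):
    # Standardize each feature dimension within the batch.
    zs = (f_src - f_src.mean(0)) / (f_src.std(0) + eps)
    zt = (f_tgt - f_tgt.mean(0)) / (f_tgt.std(0) + eps)
    n, _ = zs.shape
    c = (zs.T @ zt) / n                              # (d, d) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()   # align matched features
    off_diag = c.pow(2).sum() - torch.diagonal(c).pow(2).sum()  # decorrelate rest
    return on_diag + lam * off_diag

loss = barlow_alignment_loss(torch.randn(32, 256), torch.randn(32, 256))
```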
- Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the MICCAI 2022 SurgToolLoc challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
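One common way to use presence-only labels for detection, shown here as an assumed illustration rather than the challenge's prescribed method, is to train a multi-label presence classifier and read localizations off its class activation maps:

```python
# Weakly supervised tool localization from presence labels via class
# activation maps. Backbone and tool count are stand-in assumptions.
import torch
import torch.nn as nn

class WeakToolNet(nn.Module):
    def __init__(self, n_tools=14):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in feature extractor
            nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())
        self.clf = nn.Conv2d(64, n_tools, 1)         # per-location tool scores

    def forward(self, x):
        cam = self.clf(self.backbone(x))             # (B, n_tools, H', W')
        # Global max pool gives per-frame presence logits, trainable with
        # BCE on the weak labels; the map's peaks localize each tool.
        return cam, cam.amax(dim=(2, 3))

net = WeakToolNet()
cam, presence_logits = net(torch.randn(2, 3, 224, 224))
loss = nn.functional.binary_cross_entropy_with_logits(
    presence_logits, torch.randint(0, 2, (2, 14)).float())
```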
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains largely unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
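A standard protocol for this kind of analysis, assumed here since the summary does not spell out the evaluation, is linear probing: freeze the SSL-pretrained backbone and fit only a linear classifier on the downstream labels (e.g., Cholec80's seven phases):

```python
# Linear-probe evaluation of frozen SSL features; the paper's exact
# protocol may differ. The toy backbone and loader below are placeholders.
import torch
import torch.nn as nn

def linear_probe(backbone, feat_dim, n_classes, loader, epochs=10):
    for p in backbone.parameters():
        p.requires_grad_(False)                 # frozen representation
    probe = nn.Linear(feat_dim, n_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:                     # frames, phase labels
            with torch.no_grad():
                f = backbone(x)
            loss = ce(probe(f), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return probe

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 7, (8,)))]
probe = linear_probe(backbone, 128, 7, loader)  # 7 Cholec80 phases
```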
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
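The quoted mAP figures are typically computed as the mean over triplet classes of per-class average precision; the snippet below shows the standard recipe (the challenge's official evaluation code may differ in detail):

```python
# Standard mean-average-precision computation for multi-label recognition.
import numpy as np
from sklearn.metrics import average_precision_score

def mean_average_precision(y_true, y_score):
    """y_true: (N, C) binary labels; y_score: (N, C) confidences."""
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 10))
y_score = rng.random((100, 10))
print(mean_average_precision(y_true, y_score))  # ~0.5 for random scores
```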
- Real-time Informative Surgical Skill Assessment with Gaussian Process Learning [12.019641896240245]
This work presents a novel Gaussian Process Learning-based automatic and objective surgical skill assessment method for endoscopic sinus and skull base surgeries (ESSBSs).
The proposed method projects the instrument movements into the endoscope coordinate frame to reduce the data dimensionality.
The experimental results show that the proposed method reaches 100% prediction precision for complete surgical procedures and 90% precision for real-time prediction assessment.
arXiv Detail & Related papers (2021-12-05T15:35:40Z)
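The two steps the summary names, a change of coordinates into the endoscope frame followed by Gaussian process regression, can be sketched as follows; the motion features and expert score below are invented placeholders:

```python
# Sketch: express instrument motion in the endoscope frame, then regress a
# skill score with a GP. Features and labels are made-up assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def to_endoscope_frame(p_world, R_cam, t_cam):
    """Map instrument-tip points (N, 3) from world to endoscope coordinates.
    R_cam's columns are the camera axes expressed in the world frame."""
    return (p_world - t_cam) @ R_cam

rng = np.random.default_rng(0)
traj = to_endoscope_frame(rng.random((50, 3)), np.eye(3), np.zeros(3))
# Example smoothness-style cues: spread of the trajectory, total path jitter.
features = np.array([[traj.std(), np.abs(np.diff(traj, axis=0)).sum()]])
scores = np.array([0.8])                           # placeholder expert rating
gp = GaussianProcessRegressor(kernel=RBF()).fit(features, scores)
mean, std = gp.predict(features, return_std=True)  # score with uncertainty
```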
- Towards Unified Surgical Skill Assessment [18.601526803020885]
We propose a unified multi-path framework for automatic surgical skill assessment.
We conduct experiments on the JIGSAWS dataset of simulated surgical tasks, and a new clinical dataset of real laparoscopic surgeries.
arXiv Detail & Related papers (2021-06-02T09:06:43Z)
- Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources [14.677001578868872]
We propose a unified surgical state estimation model based on the actions performed or events that occur as the task progresses.
We evaluate our model on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) and a more complex dataset involving robotic intra-operative ultrasound (RIOUS) imaging.
Our model achieves a frame-wise state estimation accuracy of up to 89.4%, improving on state-of-the-art surgical state estimation models.
arXiv Detail & Related papers (2020-02-07T17:49:08Z)
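A minimal sketch of the multi-source fusion idea behind the unified state estimator above, with assumed encoders and dimensions: per-source encoders are concatenated into one temporal stream that emits per-frame state logits.

```python
# Multi-source surgical state estimation sketch. Sources (kinematics plus
# video or RIOUS features), sizes, and fusion scheme are assumptions.
import torch
import torch.nn as nn

class FusionStateEstimator(nn.Module):
    def __init__(self, kin_dim=26, vis_dim=512, n_states=10, hid=128):
        super().__init__()
        self.kin_enc = nn.Linear(kin_dim, hid)     # robot kinematics stream
        self.vis_enc = nn.Linear(vis_dim, hid)     # vision feature stream
        self.temporal = nn.LSTM(2 * hid, hid, batch_first=True)
        self.head = nn.Linear(hid, n_states)

    def forward(self, kin, vis):
        z = torch.cat([self.kin_enc(kin), self.vis_enc(vis)], dim=-1)
        h, _ = self.temporal(z)                    # (B, T, hid)
        return self.head(h)                        # per-frame state logits

model = FusionStateEstimator()
logits = model(torch.randn(2, 100, 26), torch.randn(2, 100, 512))
states = logits.argmax(-1)                         # frame-wise state sequence
```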