The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges
and methods
- URL: http://arxiv.org/abs/2104.03178v1
- Date: Wed, 7 Apr 2021 15:11:51 GMT
- Title: The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges
and methods
- Authors: Vivek Singh Bawa, Gurkirt Singh, Francis KapingA, Inna
Skarga-Bandurova, Elettra Oleari, Alice Leporini, Carmela Landolfo, Pengfei
Zhao, Xi Xiang, Gongning Luo, Kuanquan Wang, Liangzhi Li, Bowen Wang, Shang
Zhao, Li Li, Armando Stabile, Francesco Setti, Riccardo Muradore, Fabio
Cuzzolin
- Abstract summary: This paper presents ESAD, the first large-scale dataset designed to tackle the problem of surgeon action detection in endoscopic minimally invasive surgery.
The dataset provides bounding box annotation for 21 action classes on real endoscopic video frames captured during prostatectomy, and was used as the basis of a recent MIDL 2020 challenge.
- Score: 15.833413083110903
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For an autonomous robotic system, monitoring surgeon actions and assisting
the main surgeon during a procedure can be very challenging. The challenges
come from the peculiar structure of the surgical scene, the greater similarity
in appearance of actions performed via tools in a cavity compared to, say,
human actions in unconstrained environments, as well as from the motion of the
endoscopic camera. This paper presents ESAD, the first large-scale dataset
designed to tackle the problem of surgeon action detection in endoscopic
minimally invasive surgery. ESAD aims to help increase the effectiveness
and reliability of surgical assistant robots by realistically testing their
awareness of the actions performed by a surgeon. The dataset
provides bounding box annotation for 21 action classes on real endoscopic video
frames captured during prostatectomy, and was used as the basis of a recent
MIDL 2020 challenge. We also present an analysis of the dataset conducted using
the baseline model which was released as part of the challenge, and a
description of the top performing models submitted to the challenge together
with the results they obtained. This study provides significant insight into
which approaches are effective and how they can be extended further. We believe that
ESAD will serve in the future as a useful benchmark for all researchers active
in surgeon action detection and assistive robotics at large.
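For readers unfamiliar with detection-style annotation, the sketch below illustrates one plausible way frame-level bounding-box action labels of this kind could be represented and matched against predictions. The record layout, field names, and the example class name are illustrative assumptions, not the official ESAD annotation format or evaluation code.

```python
# Illustrative sketch only: the record layout and class name below are
# assumptions, not the official ESAD annotation format.
from dataclasses import dataclass


@dataclass
class ActionBox:
    frame_id: int      # index of the endoscopic video frame
    action_class: str  # one of the 21 surgeon action classes
    x1: float
    y1: float
    x2: float
    y2: float          # box corners in pixel coordinates


def iou(a: ActionBox, b: ActionBox) -> float:
    """Intersection-over-union of two boxes, the standard detection overlap measure."""
    ix1, iy1 = max(a.x1, b.x1), max(a.y1, b.y1)
    ix2, iy2 = min(a.x2, b.x2), min(a.y2, b.y2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: ActionBox, gt: ActionBox, thresh: float = 0.5) -> bool:
    """A prediction counts as correct if the class matches and IoU meets the threshold."""
    return pred.action_class == gt.action_class and iou(pred, gt) >= thresh


if __name__ == "__main__":
    gt = ActionBox(42, "CuttingTissue", 100, 80, 220, 200)    # hypothetical label
    pred = ActionBox(42, "CuttingTissue", 110, 90, 230, 210)  # hypothetical prediction
    print(f"IoU = {iou(pred, gt):.3f}, match = {is_true_positive(pred, gt)}")
```

Challenge leaderboards for detection tasks of this kind typically aggregate such per-box matches into per-class average precision and report the mean over classes (mAP), which is the usual basis on which baseline and submitted models are compared.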
Related papers
- VISAGE: Video Synthesis using Action Graphs for Surgery [34.21344214645662]
We introduce the novel task of future video generation in laparoscopic surgery.
Our proposed method, VISAGE, leverages the power of action scene graphs to capture the sequential nature of laparoscopic procedures.
Our experiments demonstrate high-fidelity video generation for laparoscopic procedures.
arXiv Detail & Related papers (2024-10-23T10:28:17Z) - PitVis-2023 Challenge: Workflow Recognition in videos of Endoscopic Pituitary Surgery [46.2901962659261]
The Pituitary Vision (PitVis) 2023 Challenge tasks the community with step and instrument recognition in videos of endoscopic pituitary surgery.
This is a unique task when compared to other minimally invasive surgeries due to the smaller working space.
There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models.
arXiv Detail & Related papers (2024-09-02T11:38:06Z) - Hypergraph-Transformer (HGT) for Interactive Event Prediction in
Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video.
We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets.
Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z) - SAR-RARP50: Segmentation of surgical instrumentation and Action
Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z) - ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition
in the Operating Room [6.132617753806978]
We propose a new sample-efficient and object-based approach for surgical activity recognition in the OR.
Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR.
arXiv Detail & Related papers (2023-12-19T15:33:57Z) - Surgical tool classification and localization: results and methods from
the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z) - Demonstration-Guided Reinforcement Learning with Efficient Exploration
for Task Automation of Surgical Robot [54.80144694888735]
We introduce Demonstration-guided EXploration (DEX), an efficient reinforcement learning algorithm.
Our method estimates expert-like behaviors with higher values to facilitate productive interactions.
Experiments on 10 surgical manipulation tasks from SurRoL, a comprehensive surgical simulation platform, demonstrate significant improvements.
arXiv Detail & Related papers (2023-02-20T05:38:54Z) - CholecTriplet2021: A benchmark challenge for surgical action triplet
recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z) - Surgical Visual Domain Adaptation: Results from the MICCAI 2020
SurgVisDom Challenge [9.986124942784969]
This work seeks to explore the potential for visual domain adaptation in surgery to overcome data privacy concerns.
In particular, we propose to use video from virtual reality (VR) simulations of surgical exercises to develop algorithms to recognize tasks in a clinical-like setting.
We present the performance of the different approaches to solve visual domain adaptation developed by challenge participants.
arXiv Detail & Related papers (2021-02-26T18:45:28Z) - ESAD: Endoscopic Surgeon Action Detection Dataset [10.531648619593572]
We aim to make surgical assistant robots safer by making them aware of the surgeon's actions, so that they can take appropriate assisting actions.
We introduce a challenging dataset for surgeon action detection in real-world endoscopic videos.
arXiv Detail & Related papers (2020-06-12T13:22:41Z)