CholecTriplet2022: Show me a tool and tell me the triplet -- an
endoscopic vision challenge for surgical action triplet detection
- URL: http://arxiv.org/abs/2302.06294v2
- Date: Fri, 14 Jul 2023 19:06:27 GMT
- Title: CholecTriplet2022: Show me a tool and tell me the triplet -- an
endoscopic vision challenge for surgical action triplet detection
- Authors: Chinedu Innocent Nwoye, Tong Yu, Saurav Sharma, Aditya Murali, Deepak
Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine
Yamlahi, Finn-Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena
R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun
Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai,
Ziheng Wang, Guo Rui, Melanie Schellenberg, João L. Vilaça, Tobias
Czempiel, Zhenkun Wang, Debdoot Sheet, Shrawan Kumar Thapa, Max Berniker,
Patrick Godau, Pedro Morais, Sudarshan Regmi, Thuy Nuong Tran, Jaime Fonseca,
Jan-Hinrich Nölke, Estevão Lima, Eduard Vazquez, Lena Maier-Hein, Nassir
Navab, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Didier Mutter,
Nicolas Padoy
- Abstract summary: This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection.
It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool) as the key actors, and the modeling of each tool-activity in the form of an <instrument, verb, target> triplet.
- Score: 41.66666272822756
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Formalizing surgical activities as triplets of the instruments used, actions
performed, and target anatomies is becoming a gold-standard approach for
surgical activity modeling. The benefit is that this formalization helps to
obtain a more detailed understanding of tool-tissue interaction, which can be
used to develop better Artificial Intelligence assistance for image-guided
surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021
have put together techniques aimed at recognizing these triplets from surgical
footage. Estimating the spatial locations of the triplets as well would offer
more precise intraoperative context-aware decision support for
computer-assisted intervention. This paper presents the CholecTriplet2022
challenge, which extends surgical action triplet modeling from recognition to
detection. It includes weakly-supervised bounding box localization of every
visible surgical instrument (or tool), as the key actors, and the modeling of
each tool-activity in the form of an <instrument, verb, target> triplet. The paper
describes a baseline method and 10 new deep learning algorithms presented at
the challenge to solve the task. It also provides thorough methodological
comparisons of the methods, an in-depth analysis of the obtained results across
multiple metrics as well as visual and procedural challenges, a discussion of
their significance, and useful insights for future research directions and
applications in surgery.
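To make the formalization concrete, the following is a minimal sketch, assuming Python and the CholecT50 label spaces used by the challenge (6 instruments, 10 verbs, 15 targets, combined into 100 valid triplet classes); the `TripletDetection` container and `frame_record` helper are hypothetical illustrations for this page, not the challenge's official data format or API.

```python
from dataclasses import dataclass
from typing import List, Tuple

# CholecT50 label spaces assumed here: 6 instruments, 10 verbs,
# 15 targets, combined into 100 valid triplet classes.
NUM_INSTRUMENTS, NUM_VERBS, NUM_TARGETS, NUM_TRIPLETS = 6, 10, 15, 100

@dataclass
class TripletDetection:
    """One detected tool-activity in a single video frame: the box
    localizes the instrument (the key actor), while verb and target
    describe the action performed and the anatomy acted upon."""
    instrument_id: int                           # 0..5, e.g. grasper, hook
    verb_id: int                                 # 0..9, e.g. grasp, dissect
    target_id: int                               # 0..14, e.g. gallbladder
    triplet_id: int                              # 0..99, combined class index
    box_xywh: Tuple[float, float, float, float]  # normalized to [0, 1]
    score: float                                 # detection confidence

def frame_record(dets: List[TripletDetection]) -> List[List[float]]:
    """Flatten a frame's detections into [triplet_id, score, x, y, w, h]
    rows, the kind of per-frame record a detection evaluator consumes."""
    return [[d.triplet_id, d.score, *d.box_xywh] for d in dets]
```

Each frame of a video then maps to a list of such detections, which is the granularity at which both the triplet labels and the instrument boxes are evaluated.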
Related papers
- SURGIVID: Annotation-Efficient Surgical Video Object Discovery [42.16556256395392]
We propose an annotation-efficient framework for the semantic segmentation of surgical scenes.
We employ image-based self-supervised object discovery to identify the most salient tools and anatomical structures in surgical videos.
Our unsupervised setup, reinforced with only 36 annotation labels, achieves localization performance comparable to fully-supervised segmentation models.
arXiv Detail & Related papers (2024-09-12T07:12:20Z)
- Monocular pose estimation of articulated surgical instruments in open surgery [0.873811641236639]
This work presents a novel approach to monocular 6D pose estimation of surgical instruments in open surgery, addressing challenges such as object articulations, symmetries, and lack of annotated real-world data.
The proposed approach consists of three main components: (1) synthetic data generation using 3D modeling of surgical tools with articulation rigging; (2) a tailored pose estimation framework combining object detection with pose estimation and a hybrid geometric fusion strategy; and (3) a training strategy that utilizes both synthetic and real unannotated data, employing domain adaptation on real video data using automatically generated pseudo-labels.
arXiv Detail & Related papers (2024-07-16T19:47:35Z)
- Surgical Text-to-Image Generation [1.958913666074613]
We adapt text-to-image generative models for the surgical domain using the CholecT50 dataset.
We develop Surgical Imagen to generate photorealistic and activity-aligned surgical images from triplet-based textual prompts.
arXiv Detail & Related papers (2024-07-12T12:49:11Z)
- Surgical Triplet Recognition via Diffusion Model [59.50938852117371]
Surgical triplet recognition is an essential building block to enable next-generation context-aware operating rooms.
We propose Difft, a new generative framework for surgical triplet recognition employing the diffusion model.
Experiments on the CholecT45 and CholecT50 datasets show the superiority of the proposed method in achieving a new state-of-the-art performance for surgical triplet recognition.
arXiv Detail & Related papers (2024-06-19T04:43:41Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robot-Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%; a toy sketch of this metric appears after the list below.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)
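The recognition challenges above report mean average precision (mAP): average precision is computed per class from frame-level confidence scores against binary presence labels, then averaged over the classes. Below is a toy, self-contained NumPy sketch of that computation; the CholecTriplet challenges evaluate with the organizers' ivtmetrics package, whose exact averaging and filtering details may differ.

```python
import numpy as np

def average_precision(scores: np.ndarray, labels: np.ndarray) -> float:
    """AP for one class from per-frame scores and binary labels:
    precision accumulated at each positive hit in the ranked list."""
    order = np.argsort(-scores)          # rank frames by confidence
    labels = labels[order]
    n_pos = labels.sum()
    if n_pos == 0:
        return float("nan")              # class never occurs: skip it
    precision = np.cumsum(labels) / np.arange(1, len(labels) + 1)
    return float(np.sum(precision * labels) / n_pos)

def mean_average_precision(scores: np.ndarray, labels: np.ndarray) -> float:
    """mAP over classes; both arrays are (num_frames, num_classes)."""
    aps = [average_precision(scores[:, c], labels[:, c])
           for c in range(scores.shape[1])]
    return float(np.nanmean(aps))

# Toy example: 4 frames, 2 triplet classes.
scores = np.array([[0.9, 0.2], [0.1, 0.8], [0.7, 0.4], [0.3, 0.6]])
labels = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
print(f"mAP = {mean_average_precision(scores, labels):.3f}")  # ~0.917
```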