ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition
in the Operating Room
- URL: http://arxiv.org/abs/2312.12250v1
- Date: Tue, 19 Dec 2023 15:33:57 GMT
- Title: ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition
in the Operating Room
- Authors: Idris Hamoud, Muhammad Abdullah Jamal, Vinkle Srivastav, Didier
Mutter, Nicolas Padoy, Omid Mohareri
- Abstract summary: We propose a new sample-efficient and object-based approach for surgical activity recognition in the OR.
Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR.
- Score: 6.132617753806978
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Surgical robotics holds much promise for improving patient safety and
clinician experience in the Operating Room (OR). However, it also comes with
new challenges, requiring strong team coordination and effective OR management.
Automatic detection of surgical activities is a key requirement for developing
AI-based intelligent tools to tackle these challenges. The current
state-of-the-art surgical activity recognition methods however operate on
image-based representations and depend on large-scale labeled datasets whose
collection is time-consuming and resource-expensive. This work proposes a new
sample-efficient and object-based approach for surgical activity recognition in
the OR. Our method focuses on the geometric arrangements between clinicians and
surgical devices, thus utilizing the significant object interaction dynamics in
the OR. We conduct experiments in a low-data regime study for long video
activity recognition. We also benchmark our method against other object-centric
approaches on clip-level action classification and show superior performance.
Related papers
- OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted
Surgery [13.843251369739908]
We introduce an innovative Open-Set Surgical Activity Recognition (OSSAR) framework.
Our solution leverages the hyperspherical reciprocal point strategy to enhance the distinction between known and unknown classes in the feature space.
To support our assertions, we establish an open-set surgical activity benchmark utilizing the public JIGSAWS dataset.
arXiv Detail & Related papers (2024-02-10T16:23:12Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- Adaptation of Surgical Activity Recognition Models Across Operating Rooms [10.625208343893911]
We study the generalizability of surgical activity recognition models across operating rooms.
We propose a new domain adaptation method to improve the performance of the surgical activity recognition model.
arXiv Detail & Related papers (2022-07-07T04:41:34Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods [15.833413083110903]
This paper presents ESAD, the first large-scale dataset designed to tackle the problem of surgeon action detection in endoscopic minimally invasive surgery.
The dataset provides bounding box annotation for 21 action classes on real endoscopic video frames captured during prostatectomy, and was used as the basis of a recent MIDL 2020 challenge.
arXiv Detail & Related papers (2021-04-07T15:11:51Z)
- One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)
- Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction [17.63619129438996]
We propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress.
We demonstrate that recognition performance improves in multi-task frameworks with progress estimation without any additional manual labelling and training.
arXiv Detail & Related papers (2020-03-10T14:28:02Z)
- Automatic Gesture Recognition in Robot-assisted Surgery with Reinforcement Learning and Tree Search [63.07088785532908]
We propose a framework based on reinforcement learning and tree search for joint surgical gesture segmentation and classification.
Our framework consistently outperforms the existing methods on the suturing task of JIGSAWS dataset in terms of accuracy, edit score and F1 score.
arXiv Detail & Related papers (2020-02-20T13:12:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.