Related papers: WetCat: Automating Skill Assessment in Wetlab Cataract Surgery Videos

URL: http://arxiv.org/abs/2506.08896v1
Date: Tue, 10 Jun 2025 15:22:55 GMT
Title: WetCat: Automating Skill Assessment in Wetlab Cataract Surgery Videos
Authors: Negin Ghamsarian, Raphael Sznitman, Klaus Schoeffmann, Jens Kowal,
Abstract summary: WetCat is the first dataset of wetlab cataract surgery videos specifically curated for automated skill assessment. WetCat comprises high-resolution recordings of surgeries performed by trainees on artificial eyes. WetCat enables the development of interpretable, AI-driven evaluation tools aligned with established clinical metrics.
Score: 5.7977777220041204
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To meet the growing demand for systematic surgical training, wetlab environments have become indispensable platforms for hands-on practice in ophthalmology. Yet, traditional wetlab training depends heavily on manual performance evaluations, which are labor-intensive, time-consuming, and often subject to variability. Recent advances in computer vision offer promising avenues for automated skill assessment, enhancing both the efficiency and objectivity of surgical education. Despite notable progress in ophthalmic surgical datasets, existing resources predominantly focus on real surgeries or isolated tasks, falling short of supporting comprehensive skill evaluation in controlled wetlab settings. To address these limitations, we introduce WetCat, the first dataset of wetlab cataract surgery videos specifically curated for automated skill assessment. WetCat comprises high-resolution recordings of surgeries performed by trainees on artificial eyes, featuring comprehensive phase annotations and semantic segmentations of key anatomical structures. These annotations are meticulously designed to facilitate skill assessment during the critical capsulorhexis and phacoemulsification phases, adhering to standardized surgical skill assessment frameworks. By focusing on these essential phases, WetCat enables the development of interpretable, AI-driven evaluation tools aligned with established clinical metrics. This dataset lays a strong foundation for advancing objective, scalable surgical education and sets a new benchmark for automated workflow analysis and skill assessment in ophthalmology training. The dataset and annotations are publicly available in Synapse https://www.synapse.org/Synapse:syn66401174/files.

Related papers

Surgeons vs. Computer Vision: A comparative analysis on surgical phase recognition capabilities [65.66373425605278]
Automated Surgical Phase Recognition (SPR) uses Artificial Intelligence (AI) to segment the surgical workflow into its key events. Previous research has focused on short and linear surgical procedures and has not explored if temporal context influences experts' ability to better classify surgical phases. This research addresses these gaps, focusing on Robot-Assisted Partial Nephrectomy (RAPN) as a highly non-linear procedure.
arXiv Detail & Related papers (2025-04-26T15:37:22Z)
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review [3.552525722519539]
Recent advancements in machine learning (ML) and deep learning (DL) have significantly enhanced surgical scene understanding within minimally invasive surgery (MIS). This paper surveys the integration of state-of-the-art ML and DL technologies, including CNNs, Vision Transformers (ViTs), and foundational models like the Segment Anything Model (SAM). The paper explores the challenges these technologies face, such as data variability and computational demands, and discusses ethical considerations and integration hurdles in clinical settings.
arXiv Detail & Related papers (2025-02-16T07:27:20Z)
OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining [60.75854609803651]
OphCLIP is a hierarchical retrieval-augmented vision-language pretraining framework for ophthalmic surgical workflow understanding. OphCLIP learns both fine-grained and long-term visual representations by aligning short video clips with detailed narrative descriptions and full videos with structured titles. Our OphCLIP also designs a retrieval-augmented pretraining framework to leverage the underexplored large-scale silent surgical procedure videos.
arXiv Detail & Related papers (2024-11-23T02:53:08Z)
Automated Surgical Skill Assessment in Endoscopic Pituitary Surgery using Real-time Instrument Tracking on a High-fidelity Bench-top Phantom [9.41936397281689]
Improved surgical skill is generally associated with improved patient outcomes, but assessment is subjective and labour-intensive. A new public dataset is introduced, focusing on simulated surgery, using the nasal phase of endoscopic pituitary surgery as an exemplar. A Multilayer Perceptron achieved 87% accuracy in predicting surgical skill level (novice or expert), with the "ratio of total procedure time to instrument visible time" correlated with higher surgical skill.
arXiv Detail & Related papers (2024-09-25T15:27:44Z)
Intuitive Surgical SurgToolLoc Challenge Results: 2022-2023 [55.40111320730479]
We have challenged the surgical data science community to solve difficult machine learning problems in the context of advanced RA applications. Here we document the results of these challenges, focusing on surgical tool localization (SurgToolLoc). The publicly released dataset that accompanies these challenges is detailed in a separate paper arXiv:2501.09209.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection. Our method consists of an instrument pose estimation method, an online registration between the robotic and the iOCT system, and trajectory planning tailored for navigation to an injection target. Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community. The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
Video-based Formative and Summative Assessment of Surgical Tasks using Deep Learning [0.8612287536028312]
We propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution. Formative assessment is generated using heatmaps of visual features that correlate with surgical performance.
arXiv Detail & Related papers (2022-03-17T20:07:48Z)
Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines federated learning (FL) and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos. We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
Simulation-to-Real domain adaptation with teacher-student learning for endoscopic instrument segmentation [1.1047993346634768]
We introduce a teacher-student learning approach that learns jointly from annotated simulation data and unlabeled real data. Empirical results on three datasets highlight the effectiveness of the proposed framework.
arXiv Detail & Related papers (2021-03-02T09:30:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.