CathAction: A Benchmark for Endovascular Intervention Understanding
- URL: http://arxiv.org/abs/2408.13126v2
- Date: Fri, 30 Aug 2024 11:45:09 GMT
- Title: CathAction: A Benchmark for Endovascular Intervention Understanding
- Authors: Baoru Huang, Tuan Vo, Chayun Kongtongvattana, Giulio Dagnino, Dennis Kundrat, Wenqiang Chi, Mohamed Abdelaziz, Trevor Kwok, Tudor Jianu, Tuong Do, Hieu Le, Minh Nguyen, Hoan Nguyen, Erman Tjiputra, Quang Tran, Jianyang Xie, Yanda Meng, Binod Bhattarai, Zhaorui Tan, Hongbin Liu, Hong Seng Gan, Wei Wang, Xi Yang, Qiufeng Wang, Jionglong Su, Kaizhu Huang, Angelos Stefanidis, Min Guo, Bo Du, Rong Tao, Minh Vu, Guoyan Zheng, Yalin Zheng, Francisco Vasconcelos, Danail Stoyanov, Daniel Elson, Ferdinando Rodriguez y Baena, Anh Nguyen
- Abstract summary: CathAction is a large-scale dataset for catheterization understanding.
Our dataset encompasses approximately 500,000 annotated frames for catheterization action understanding and collision detection.
For each task, we benchmark recent related works in the field.
- Score: 74.58430707848527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time visual feedback from catheterization analysis is crucial for enhancing surgical safety and efficiency during endovascular interventions. However, existing datasets are often limited to specific tasks and small in scale, and they lack the comprehensive annotations necessary for broader endovascular intervention understanding. To tackle these limitations, we introduce CathAction, a large-scale dataset for catheterization understanding. Our CathAction dataset encompasses approximately 500,000 annotated frames for catheterization action understanding and collision detection, and 25,000 ground truth masks for catheter and guidewire segmentation. For each task, we benchmark recent related works in the field. We further discuss the challenges of endovascular interventions compared to traditional computer vision tasks and point out open research questions. We hope that CathAction will facilitate the development of endovascular intervention understanding methods that can be applied to real-world applications. The dataset is available at https://airvlab.github.io/cathaction/.
Related papers
- FedEFM: Federated Endovascular Foundation Model with Unseen Data [11.320026809291239]
This paper proposes a new method to train a foundation model in a decentralized federated learning setting for endovascular intervention.
We tackle the unseen data issue using differentiable Earth Mover's Distance within a knowledge distillation framework.
Our approach achieves new state-of-the-art results, contributing to advancements in endovascular intervention and robotic-assisted surgery.
arXiv Detail & Related papers (2025-01-28T14:46:38Z)
- In-context learning for medical image segmentation [0.4143603294943439]
In-context Cascade (ICS) is a novel method that minimizes annotation requirements while achieving high segmentation accuracy for sequential medical images.
ICS builds on the UniverSeg framework, which performs few-shot segmentation using support images without additional training.
We evaluate the proposed method on the HVSMR dataset, which includes segmentation tasks for eight cardiac regions.
arXiv Detail & Related papers (2024-12-17T19:59:08Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicone aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video.
We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets.
Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z)
- Pixel-Wise Recognition for Holistic Surgical Scene Understanding [31.338288460529046]
This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset.
GraSP is a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity.
We introduce the Transformers for Actions, Phases, Steps, and Instrument (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals.
arXiv Detail & Related papers (2024-01-20T09:09:52Z)
- CholecTrack20: A Dataset for Multi-Class Multiple Tool Tracking in Laparoscopic Surgery [1.8076340162131013]
CholecTrack20 is an extensive dataset meticulously annotated for multi-class multi-tool tracking across three perspectives.
The dataset comprises 20 laparoscopic videos with over 35,000 frames and 65,000 annotated tool instances.
arXiv Detail & Related papers (2023-12-12T15:18:15Z)
- Adaptive Semi-Supervised Segmentation of Brain Vessels with Ambiguous Labels [63.415444378608214]
Our approach incorporates innovative techniques including progressive semi-supervised learning, an adaptive training strategy, and boundary enhancement.
Experimental results on 3DRA datasets demonstrate the superiority of our method in terms of mesh-based segmentation metrics.
arXiv Detail & Related papers (2023-08-07T14:16:52Z)
- Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos [54.1843419649895]
State-of-the-art models, which perform this task well on a particular dataset, perform poorly when tested on another dataset.
We propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method called the Barlow Adaptor.
In addition, we introduce a novel loss called the Barlow Feature Alignment Loss (BFAL) which aligns features across different domains.
arXiv Detail & Related papers (2023-07-31T18:14:18Z)
- Task-Aware Active Learning for Endoscopic Image Analysis [18.230148396607625]
We investigate an active learning paradigm to reduce the number of training examples.
We propose a novel task-aware active learning pipeline and apply it to two important tasks in endoscopic image analysis.
arXiv Detail & Related papers (2022-04-07T13:36:45Z)
- External Attention Assisted Multi-Phase Splenic Vascular Injury Segmentation with Limited Data [72.99534552950138]
The spleen is one of the most commonly injured solid organs in blunt abdominal trauma.
Accurate segmentation of splenic vascular injury is challenging for the following reasons.
arXiv Detail & Related papers (2022-01-04T02:35:56Z)
- Simulation-to-Real domain adaptation with teacher-student learning for endoscopic instrument segmentation [1.1047993346634768]
We introduce a teacher-student learning approach that learns jointly from annotated simulation data and unlabeled real data.
Empirical results on three datasets highlight the effectiveness of the proposed framework.
arXiv Detail & Related papers (2021-03-02T09:30:28Z)
- End-to-End Real-time Catheter Segmentation with Optical Flow-Guided Warping during Endovascular Intervention [26.467626509096043]
We present FW-Net, an end-to-end and real-time deep learning framework for endovascular intervention.
We show that by effectively learning temporal continuity, the network can successfully segment and track the catheters in real-time sequences using only raw ground-truth for training.
arXiv Detail & Related papers (2020-06-16T12:53:27Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.