Related papers: Towards Better Surgical Instrument Segmentation in Endoscopic Vision: Multi-Angle Feature Aggregation and Contour Supervision

URL: http://arxiv.org/abs/2002.10675v2
Date: Tue, 11 Aug 2020 03:20:35 GMT
Title: Towards Better Surgical Instrument Segmentation in Endoscopic Vision: Multi-Angle Feature Aggregation and Contour Supervision
Authors: Fangbo Qin, Shan Lin, Yangming Li, Randall A. Bly, Kris S. Moe, Blake Hannaford
Abstract summary: We propose a general embeddable approach to improve current deep neural network (DNN) segmentation models. The proposed method is validated with ablation experiments on the novel Sinus-Surgery datasets collected from surgeons' operations.
Score: 22.253074722129053
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate and real-time surgical instrument segmentation is important in the endoscopic vision of robot-assisted surgery, and significant challenges are posed by frequent instrument-tissue contacts and continuous changes of observation perspective. For these challenging tasks, more and more deep neural network (DNN) models have been designed in recent years. We are motivated to propose a general embeddable approach to improve these current DNN segmentation models without increasing the number of model parameters. Firstly, observing the limited rotation-invariance performance of DNNs, we propose the Multi-Angle Feature Aggregation (MAFA) method, which leverages active image rotation to gain richer visual cues and make the prediction more robust to instrument orientation changes. Secondly, in the end-to-end training stage, auxiliary contour supervision is utilized to guide the model to learn boundary awareness, so that the contour shape of the segmentation mask is more precise. The proposed method is validated with ablation experiments on the novel Sinus-Surgery datasets collected from surgeons' operations, and is compared to existing methods on a public dataset collected with a da Vinci Xi Robot.
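
The rotate-and-aggregate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MAFA implementation: the abstract does not say how features are aggregated inside the network, so the sketch assumes the simpler input-level analogue (rotate the image, predict, rotate the prediction back, average) and restricts rotations to multiples of 90 degrees so that no interpolation is needed. The names `multi_angle_aggregate` and `predict` are hypothetical.

```python
import numpy as np


def multi_angle_aggregate(image, predict, ks=(0, 1, 2, 3)):
    """Average per-pixel predictions over rotated copies of `image`.

    image   : array of shape (H, W) or (H, W, C)
    predict : hypothetical callable mapping an image to an (H, W) probability map
    ks      : rotations in units of 90 degrees (an assumption made for exactness)
    """
    aligned = []
    for k in ks:
        # Rotate the input in the image plane.
        rotated = np.rot90(image, k=k, axes=(0, 1))
        # Predict on the rotated view.
        prob = predict(rotated)
        # Rotate the prediction back so all maps share the original frame.
        aligned.append(np.rot90(prob, k=-k, axes=(0, 1)))
    # Aggregate by simple averaging (another assumption of this sketch).
    return np.mean(aligned, axis=0)


def biased_predict(img):
    """Toy orientation-dependent predictor: trusts pixels on the right more."""
    weights = np.linspace(0.0, 1.0, img.shape[1])[None, :]
    return img * weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((4, 6))
    fused = multi_angle_aggregate(frame, biased_predict)
    print(fused.shape)
```

Averaging over all four quarter-turns makes the fused output rotate together with the input even when the underlying predictor is orientation-dependent, which is the kind of robustness to instrument orientation the abstract describes.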

Related papers

Semantic Segmentation for Preoperative Planning in Transcatheter Aortic Valve Replacement [61.573750959726475]
We consider medical guidelines for preoperative planning of the transcatheter aortic valve replacement (TAVR) and identify tasks that may be supported via semantic segmentation models. We first derive fine-grained TAVR-relevant pseudo-labels from coarse-grained anatomical information, in order to train segmentation models and quantify how well they are able to find these structures in the scans.
arXiv Detail & Related papers (2025-07-22T13:24:45Z)
EndoARSS: Adapting Spatially-Aware Foundation Model for Efficient Activity Recognition and Semantic Segmentation in Endoscopic Surgery [11.286605039002419]
Endoscopic surgery is the gold standard for robotic-assisted minimally invasive surgery. Traditional deep learning models often struggle with cross-activity interference, leading to suboptimal performance in each downstream task. We propose EndoARSS, a novel multi-task learning framework specifically designed for endoscopy surgery activity recognition and semantic segmentation.
arXiv Detail & Related papers (2025-06-07T15:18:43Z)
Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance [50.486523249499115]
Real-time video understanding is critical to guide procedures in minimally invasive surgery (MIS). We propose Compress-to-Explore (C2E), a novel self-supervised framework to learn compact, informative representations from surgical videos. C2E uses entropy-maximizing decoders to compress images while preserving clinically relevant details, improving encoder performance without labeled data.
arXiv Detail & Related papers (2025-05-16T14:02:24Z)
Motion-enhancement to Echocardiography Segmentation via Inserting a Temporal Attention Module: An Efficient, Adaptable, and Scalable Approach [4.923733944174007]
We present a novel, computation-efficient alternative where a temporal attention module extracts feature interactions multiple times. The module can be seamlessly integrated into a wide range of existing CNN- or Transformer-based networks. Our results confirm TAM's robustness, scalability, and generalizability across diverse datasets and backbones.
arXiv Detail & Related papers (2025-01-24T21:35:24Z)
Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy [2.847280871973632]
Esophageal cancer is among the most common types of cancer worldwide. In recent years, robot-assisted minimally invasive esophagectomy has emerged as a promising alternative. Computer-aided anatomy recognition holds promise for improving surgical navigation.
arXiv Detail & Related papers (2024-12-04T15:32:37Z)
Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering. Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively. We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
Enhancing AI Diagnostics: Autonomous Lesion Masking via Semi-Supervised Deep Learning [1.4053129774629076]
This study presents an unsupervised domain adaptation method aimed at autonomously generating image masks outlining regions of interest (ROIs) for differentiating breast lesions in breast ultrasound (US) imaging. Our semi-supervised learning approach utilizes a primitive model trained on a small public breast US dataset with true annotations. This model is then iteratively refined for the domain adaptation task, generating pseudo-masks for our private, unannotated breast US dataset.
arXiv Detail & Related papers (2024-04-18T18:25:00Z)
Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video. We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets. Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries [29.201385352740555]
We propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip given various surgical procedures. Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both image and kinematics. A cross-modal contrastive loss is designed to incorporate robust geometric prior from kinematics to image for tip segmentation.
arXiv Detail & Related papers (2023-09-02T14:52:58Z)
Data Augmentation-Based Unsupervised Domain Adaptation In Medical Imaging [0.709016563801433]
We propose an unsupervised method for robust domain adaptation in brain MRI segmentation by leveraging MRI-specific augmentation techniques. The results show that our proposed approach achieves high accuracy, exhibits broad applicability, and showcases remarkable robustness against domain shift in various tasks.
arXiv Detail & Related papers (2023-08-08T17:00:11Z)
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network. We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module. Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
FUN-SIS: a Fully UNsupervised approach for Surgical Instrument Segmentation [16.881624842773604]
We present FUN-SIS, a Fully UNsupervised approach for binary Surgical Instrument Segmentation. We train a per-frame segmentation model on completely unlabelled endoscopic videos, by relying on implicit motion information and instrument shape-priors. The obtained fully-unsupervised results for surgical instrument segmentation are almost on par with the ones of fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2022-02-16T15:32:02Z)
Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery. We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks. We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation. Our approach allows image segmentation models to be trained without the need to acquire expensive annotations. We test our proposed method on the Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
Progressive Adversarial Semantic Segmentation [11.323677925193438]
Deep convolutional neural networks can perform exceedingly well given full supervision. The success of such fully-supervised models for various image analysis tasks is limited by the availability of massive amounts of labeled data. We propose a novel end-to-end medical image segmentation model, namely Progressive Adversarial Semantic Segmentation (PASS).
arXiv Detail & Related papers (2020-05-08T22:48:00Z)
Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights. It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness. In recent years, there has been a significant effort to automate the diagnosis using deep learning. This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN). Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.