Motion-Boundary-Driven Unsupervised Surgical Instrument Segmentation in Low-Quality Optical Flow
- URL: http://arxiv.org/abs/2403.10039v2
- Date: Tue, 25 Mar 2025 20:18:43 GMT
- Title: Motion-Boundary-Driven Unsupervised Surgical Instrument Segmentation in Low-Quality Optical Flow
- Authors: Yang Liu, Peiran Wu, Jiayu Huo, Gongyu Zhang, Zhen Yuan, Christos Bergeles, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin
- Abstract summary: Unsupervised video-based surgical instrument segmentation has the potential to accelerate the adoption of robot-assisted procedures. The generally low quality of optical flow in endoscopic footage poses a great challenge for unsupervised methods that rely heavily on motion cues. We propose a novel approach that pinpoints motion boundaries, regions with abrupt flow changes, while selectively discarding frames with globally low-quality flow.
- Score: 42.75298102809838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised video-based surgical instrument segmentation has the potential to accelerate the adoption of robot-assisted procedures by reducing the reliance on manual annotations. However, the generally low quality of optical flow in endoscopic footage poses a great challenge for unsupervised methods that rely heavily on motion cues. To overcome this limitation, we propose a novel approach that pinpoints motion boundaries, regions with abrupt flow changes, while selectively discarding frames with globally low-quality flow and adapting to varying motion patterns. Experiments on the EndoVis2017 VOS and EndoVis2017 Challenge datasets show that our method achieves mean Intersection-over-Union (mIoU) scores of 0.75 and 0.72, respectively, effectively alleviating the constraints imposed by suboptimal optical flow. This enables a more scalable and robust surgical instrument segmentation solution in clinical settings. The code will be publicly released.
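As a rough illustration of two notions the abstract relies on, the sketch below computes a motion-boundary map as the spatial gradient magnitude of a dense flow field, and the Intersection-over-Union between two binary masks (averaging such IoU values yields an mIoU). This is not the authors' implementation: the function names, the gradient-based boundary definition, and the toy flow field are assumptions made purely for illustration.

```python
import numpy as np


def motion_boundary_map(flow):
    """Gradient magnitude of a dense flow field.

    flow: array of shape (H, W, 2) holding horizontal (u) and vertical (v)
    displacement per pixel. High values mark abrupt flow changes.
    Hypothetical helper, not taken from the paper.
    """
    du_dy, du_dx = np.gradient(flow[..., 0])
    dv_dy, dv_dx = np.gradient(flow[..., 1])
    return np.sqrt(du_dx**2 + du_dy**2 + dv_dx**2 + dv_dy**2)


def iou(pred, target):
    """Intersection-over-Union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, target).sum() / union)


# Toy example: a 4x4 patch moving 3 px to the right on a static background.
flow = np.zeros((8, 8, 2))
flow[2:6, 2:6, 0] = 3.0
boundary = motion_boundary_map(flow)

# Toy masks: the prediction covers half of the target region.
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[0:2, 0:2] = True
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[0:2, 0:4] = True
score = iou(pred_mask, true_mask)
```

In the toy flow, the map is zero on the static background and inside the uniformly moving patch, and non-zero only along the patch outline, which is the kind of cue a boundary-driven method can exploit even when flow magnitudes themselves are noisy.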
Related papers
- Markerless Tracking-Based Registration for Medical Image Motion Correction [0.4288177321445912]
This study focuses on isolating swallowing dynamics from interfering patient motion in videofluoroscopy.
Optical flow methods fail due to artifacts like flickering and instability, making them unreliable for distinguishing different motion groups.
We introduce a novel motion correction pipeline that effectively removes disruptive motion while preserving swallowing dynamics.
arXiv Detail & Related papers (2025-03-13T11:18:50Z)
- One Patient's Annotation is Another One's Initialization: Towards Zero-Shot Surgical Video Segmentation with Cross-Patient Initialization [1.0536099636804035]
Video object segmentation is an emerging technology that is well-suited for real-time surgical video segmentation.
However, its adoption is limited by the need for manual intervention to select the tracked object.
In this work, we tackle this challenge with an innovative solution: using previously annotated frames from other patients as the tracking frames.
We find that this unconventional approach can match or even surpass the performance of using patients' own tracking frames.
arXiv Detail & Related papers (2025-03-04T03:11:03Z)
- AMNCutter: Affinity-Attention-Guided Multi-View Normalized Cutter for Unsupervised Surgical Instrument Segmentation [7.594796294925481]
We propose a label-free unsupervised model featuring a novel module named Multi-View Normalized Cutter (m-NCutter).
Our model is trained using a graph-cutting loss function that leverages patch affinities for supervision, eliminating the need for pseudo-labels.
We conduct comprehensive experiments across multiple SIS datasets to validate our approach's state-of-the-art (SOTA) performance, robustness, and exceptional potential as a pre-trained model.
arXiv Detail & Related papers (2024-11-06T06:33:55Z)
- Tracking Everything in Robotic-Assisted Surgery [39.62251870446397]
We present an annotated surgical tracking dataset for benchmarking tracking methods for surgical scenarios.
We evaluate state-of-the-art (SOTA) TAP-based algorithms on this dataset and reveal their limitations in challenging surgical scenarios.
We propose a new tracking method, namely SurgMotion, to solve the challenges and further improve the tracking performance.
arXiv Detail & Related papers (2024-09-29T23:06:57Z)
- Revisiting Surgical Instrument Segmentation Without Human Intervention: A Graph Partitioning View [7.594796294925481]
We propose an unsupervised method by reframing the video frame segmentation as a graph partitioning problem.
A self-supervised pre-trained model is first leveraged as a feature extractor to capture high-level semantic features.
On the "deep" eigenvectors, a surgical video frame is meaningfully segmented into different modules like tools and tissues, providing distinguishable semantic information.
arXiv Detail & Related papers (2024-08-27T05:31:30Z)
- SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge [20.63421118951673]
Current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions.
The SegSTRONG-C challenge aims to promote the development of algorithms robust to unforeseen but plausible image corruptions of surgery.
The new benchmark will allow us to carefully study neural network robustness to non-adversarial corruptions of surgery.
arXiv Detail & Related papers (2024-07-16T16:50:43Z)
- Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring [71.60457491155451]
Eliminating image blur produced by various kinds of motion has been a challenging problem.
We propose a novel real-world deblurring filtering model called the Motion-adaptive Separable Collaborative Filter.
Our method provides an effective solution for real-world motion blur removal and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-04-19T19:44:24Z)
- OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation [55.676358801492114]
We propose OCAI, a method that supports robust frame ambiguities by generating intermediate video frames alongside optical flows in between.
Our evaluations demonstrate superior quality and enhanced optical flow accuracy on established benchmarks such as Sintel and KITTI.
arXiv Detail & Related papers (2024-03-26T20:23:48Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Learning Task-Oriented Flows to Mutually Guide Feature Alignment in Synthesized and Real Video Denoising [137.5080784570804]
Video denoising aims at removing noise from videos to recover clean ones.
Some existing works show that optical flow can help the denoising by exploiting the additional spatial-temporal clues from nearby frames.
We propose a new multi-scale refined optical flow-guided video denoising method, which is more robust to different noise levels.
arXiv Detail & Related papers (2022-08-25T00:09:18Z)
- Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations [72.15956198507281]
We propose PGV-CL, a novel pseudo-label guided cross-video contrast learning method to boost scene segmentation.
We extensively evaluate our method on a public robotic surgery dataset EndoVis18 and a public cataract dataset CaDIS.
arXiv Detail & Related papers (2022-07-20T05:42:19Z)
- EM-driven unsupervised learning for efficient motion segmentation [3.5232234532568376]
This paper presents a CNN-based fully unsupervised method for motion segmentation from optical flow.
We use the Expectation-Maximization (EM) framework to leverage the loss function and the training procedure of our motion segmentation neural network.
Our method outperforms comparable unsupervised methods and is very efficient.
arXiv Detail & Related papers (2022-01-06T14:35:45Z)
- Unsupervised Motion Representation Enhanced Network for Action Recognition [4.42249337449125]
Motion representation between consecutive frames has been shown to greatly benefit video understanding.
The TV-L1 method, an effective optical flow solver, is time-consuming and expensive in storage for caching the extracted optical flow.
We propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator.
arXiv Detail & Related papers (2021-03-05T04:14:32Z)
- Weakly-supervised Learning For Catheter Segmentation in 3D Frustum Ultrasound [74.22397862400177]
We propose a novel Frustum ultrasound based catheter segmentation method.
The proposed method achieved state-of-the-art performance with an efficiency of 0.25 second per volume.
arXiv Detail & Related papers (2020-10-19T13:56:22Z)
- Learning Motion Flows for Semi-supervised Instrument Segmentation from Robotic Surgical Video [64.44583693846751]
We study semi-supervised instrument segmentation from robotic surgical videos with sparse annotations.
By exploiting generated data pairs, our framework can recover and even enhance temporal consistency of training sequences.
Results show that our method outperforms the state-of-the-art semi-supervised methods by a large margin.
arXiv Detail & Related papers (2020-07-06T02:39:32Z)
- What Matters in Unsupervised Optical Flow [51.45112526506455]
We compare and analyze a set of key components in unsupervised optical flow.
We construct a number of novel improvements to unsupervised flow models.
We present a new unsupervised flow technique that significantly outperforms the previous state-of-the-art.
arXiv Detail & Related papers (2020-06-08T19:36:26Z)
- Joint Unsupervised Learning of Optical Flow and Egomotion with Bi-Level Optimization [59.9673626329892]
We exploit the global relationship between optical flow and camera motion using epipolar geometry.
We use implicit differentiation to enable back-propagation through the lower-level geometric optimization layer independent of its implementation.
arXiv Detail & Related papers (2020-02-26T22:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.