SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
- URL: http://arxiv.org/abs/2407.11906v1
- Date: Tue, 16 Jul 2024 16:50:43 GMT
- Title: SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
- Authors: Hao Ding, Tuxun Lu, Yuqian Zhang, Ruixing Liang, Hongchao Shu, Lalithkumar Seenivasan, Yonghao Long, Qi Dou, Cong Gao, Mathias Unberath
- Abstract summary: Current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions.
The SegSTRONG-C challenge aims to promote the development of algorithms robust to unforeseen but plausible image corruptions of surgery.
The new benchmark will allow us to carefully study neural network robustness to non-adversarial corruptions of surgery.
- Score: 20.63421118951673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate segmentation of tools in robot-assisted surgery is critical for machine perception, as it facilitates numerous downstream tasks including augmented reality feedback. While current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions, these models have proven susceptible to even minor corruptions, significantly impairing the model's performance. This vulnerability is especially problematic in surgical settings where predictions might be used to inform high-stakes decisions. To better understand model behavior under non-adversarial corruptions, prior work has explored introducing artificial corruptions, like Gaussian noise or contrast perturbation to test set images, to assess model robustness. However, these corruptions are either not photo-realistic or model/task agnostic. Thus, these investigations provide limited insights into model deterioration under realistic surgical corruptions. To address this limitation, we introduce the SegSTRONG-C challenge that aims to promote the development of algorithms robust to unforeseen but plausible image corruptions of surgery, like smoke, bleeding, and low brightness. We collect and release corruption-free mock endoscopic video sequences for the challenge participants to train their algorithms and benchmark them on video sequences with photo-realistic non-adversarial corruptions for a binary robot tool segmentation task. This new benchmark will allow us to carefully study neural network robustness to non-adversarial corruptions of surgery, thus constituting an important first step towards more robust models for surgical computer vision. In this paper, we describe the data collection and annotation protocol, baseline evaluations of established segmentation models, and data augmentation-based techniques to enhance model robustness.
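The kind of robustness test the abstract attributes to prior work (synthetic corruptions such as Gaussian noise or contrast perturbation applied to test images, then re-scoring a binary segmentation) can be sketched as follows. This is a minimal illustration, not the challenge's evaluation code: the function names, the toy image, the threshold "segmenter" standing in for a trained model, and the choice of the Dice score as the metric are all assumptions made for this example.

```python
import numpy as np


def gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the [0, 1] range."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def contrast(image, factor):
    """Scale deviations from the mean intensity; factor < 1 lowers contrast."""
    mean = image.mean()
    return np.clip(mean + factor * (image - mean), 0.0, 1.0)


def brightness(image, factor):
    """Scale all intensities; factor < 1 darkens the image."""
    return np.clip(image * factor, 0.0, 1.0)


def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1.0 means identical masks)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def threshold_segmenter(image, threshold=0.5):
    """Hypothetical stand-in for a trained model: bright pixels count as tool."""
    return image.mean(axis=-1) > threshold


def robustness_report(seed=0):
    """Score the stand-in segmenter on a toy image under each corruption."""
    rng = np.random.default_rng(seed)

    # Toy scene: dark background (0.2) with a bright rectangular "tool" (0.8).
    image = np.full((64, 64, 3), 0.2)
    mask = np.zeros((64, 64), dtype=bool)
    mask[20:40, 10:50] = True
    image[mask] = 0.8

    variants = {
        "clean": image,
        "gaussian_noise": gaussian_noise(image, sigma=0.1, rng=rng),
        "low_contrast": contrast(image, factor=0.5),
        "low_brightness": brightness(image, factor=0.5),
    }
    return {
        name: float(dice_score(threshold_segmenter(img), mask))
        for name, img in variants.items()
    }


if __name__ == "__main__":
    for name, score in robustness_report().items():
        print(f"{name}: Dice = {score:.3f}")
```

In this toy setup the fixed threshold fails completely once brightness is halved, because the tool pixels fall below the cutoff. Simple pixel-level perturbations like these are the ones the abstract describes as either not photo-realistic or model/task agnostic, which is the gap the challenge's smoke, bleeding, and low-brightness sequences are meant to address.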
Related papers
- MAPUNetR: A Hybrid Vision Transformer and U-Net Architecture for Efficient and Interpretable Medical Image Segmentation [0.0]
We introduce MAPUNetR, a novel architecture that synergizes the strengths of transformer models with the proven U-Net framework for medical image segmentation.
Our model addresses the resolution preservation challenge and incorporates attention maps highlighting segmented regions, increasing accuracy and interpretability.
Our experiments show that the model maintains stable performance and potential as a powerful tool for medical image segmentation in clinical practice.
arXiv Detail & Related papers (2024-10-29T16:52:57Z)
- Towards Robust Algorithms for Surgical Phase Recognition via Digital Twin-based Scene Representation [14.108636146958007]
End-to-end trained neural networks that predict surgical phase directly from videos have shown excellent performance on benchmarks.
Our goal is to improve model robustness to variations in the surgical videos by leveraging the digital twin (DT) paradigm.
This approach takes advantage of the recent vision foundation models that ensure reliable low-level scene understanding.
arXiv Detail & Related papers (2024-10-26T00:49:06Z)
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch token they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Uncertainty modeling for fine-tuned implicit functions [10.902709236602536]
Implicit functions have become pivotal in computer vision for reconstructing detailed object shapes from sparse views.
We introduce Dropsembles, a novel method for uncertainty estimation in tuned implicit functions.
Our results show that Dropsembles achieve the accuracy and calibration levels of deep ensembles but with significantly less computational cost.
arXiv Detail & Related papers (2024-06-17T20:46:18Z)
- A Survey on the Robustness of Computer Vision Models against Common Corruptions [3.6486148851646063]
Computer vision models are susceptible to changes in input images caused by sensor errors or extreme imaging environments.
These corruptions can significantly hinder the reliability of these models when deployed in real-world scenarios.
We present a comprehensive overview of methods that improve the robustness of computer vision models against common corruptions.
arXiv Detail & Related papers (2023-05-10T10:19:31Z)
- Towards to Robust and Generalized Medical Image Segmentation Framework [17.24628770042803]
We propose a novel two-stage framework for robust generalized segmentation.
In particular, an unsupervised Tile-wise AutoEncoder (T-AE) pretraining architecture is coined to learn meaningful representation.
Experiments on lung segmentation across multiple chest X-ray datasets are conducted.
arXiv Detail & Related papers (2021-08-09T05:58:49Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks.
We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem.
We show that our method outperforms the state-of-the-art CRF refinement method, improving the Dice score by 1% for the pancreas and 2% for the spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z)
- Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows image segmentation models to be trained without the need to acquire expensive annotations.
We test our proposed method on the EndoVis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.