Related papers: SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

URL: http://arxiv.org/abs/2407.11906v1
Date: Tue, 16 Jul 2024 16:50:43 GMT
Title: SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
Authors: Hao Ding, Tuxun Lu, Yuqian Zhang, Ruixing Liang, Hongchao Shu, Lalithkumar Seenivasan, Yonghao Long, Qi Dou, Cong Gao, Mathias Unberath,
Abstract summary: Current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions. SegSTRONG-C challenge aims to promote the development of algorithms robust to unforeseen but plausible image corruptions of surgery. New benchmark will allow us to carefully study neural network robustness to non-adversarial corruptions of surgery.
Score: 20.63421118951673
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate segmentation of tools in robot-assisted surgery is critical for machine perception, as it facilitates numerous downstream tasks including augmented reality feedback. While current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions, these models have proven susceptible to even minor corruptions, significantly impairing the model's performance. This vulnerability is especially problematic in surgical settings where predictions might be used to inform high-stakes decisions. To better understand model behavior under non-adversarial corruptions, prior work has explored introducing artificial corruptions, like Gaussian noise or contrast perturbation to test set images, to assess model robustness. However, these corruptions are either not photo-realistic or model/task agnostic. Thus, these investigations provide limited insights into model deterioration under realistic surgical corruptions. To address this limitation, we introduce the SegSTRONG-C challenge that aims to promote the development of algorithms robust to unforeseen but plausible image corruptions of surgery, like smoke, bleeding, and low brightness. We collect and release corruption-free mock endoscopic video sequences for the challenge participants to train their algorithms and benchmark them on video sequences with photo-realistic non-adversarial corruptions for a binary robot tool segmentation task. This new benchmark will allow us to carefully study neural network robustness to non-adversarial corruptions of surgery, thus constituting an important first step towards more robust models for surgical computer vision. In this paper, we describe the data collection and annotation protocol, baseline evaluations of established segmentation models, and data augmentation-based techniques to enhance model robustness.

Related papers

Cancer-Net PCa-MultiSeg: Multimodal Enhancement of Prostate Cancer Lesion Segmentation Using Synthetic Correlated Diffusion Imaging [55.62977326180104]
Current deep learning approaches for prostate cancer lesion segmentation achieve limited performance.<n>We investigate synthetic correlated diffusion imaging (CDI$s$) as an enhancement to standard diffusion-based protocols.<n>Our results establish validated integration pathways for CDI$s$ as a practical drop-in enhancement for PCa lesion segmentation tasks.
arXiv Detail & Related papers (2025-11-11T04:16:12Z)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support.<n>We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges.<n>Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
A Study of Data Augmentation Techniques to Overcome Data Scarcity in Wound Classification using Deep Learning [0.0]
We show that data augmentation can improve classification performance, F1 scores, by up to 11% on top of state-of-the-art models. Our experiments with GAN based augmentation prove the viability of using DE-GANs to generate wound images with richer variations.
arXiv Detail & Related papers (2024-11-04T00:24:50Z)
MAPUNetR: A Hybrid Vision Transformer and U-Net Architecture for Efficient and Interpretable Medical Image Segmentation [0.0]
We introduce MAPUNetR, a novel architecture that synergizes the strengths of transformer models with the proven U-Net framework for medical image segmentation. Our model addresses the resolution preservation challenge and incorporates attention maps highlighting segmented regions, increasing accuracy and interpretability. Our experiments show that the model maintains stable performance and potential as a powerful tool for medical image segmentation in clinical practice.
arXiv Detail & Related papers (2024-10-29T16:52:57Z)
Towards Robust Algorithms for Surgical Phase Recognition via Digital Twin-based Scene Representation [14.108636146958007]
End-to-end trained neural networks that predict surgical phase directly from videos have shown excellent performance on benchmarks. Our goal is to improve model robustness to variations in the surgical videos by leveraging the digital twin (DT) paradigm. This approach takes advantage of the recent vision foundation models that ensure reliable low-level scene understanding.
arXiv Detail & Related papers (2024-10-26T00:49:06Z)
Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA) Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%. DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
Weakly Supervised Intracranial Hemorrhage Segmentation with YOLO and an Uncertainty Rectified Segment Anything Model [0.5578116134031106]
Intracranial hemorrhage (ICH) is a life-threatening condition that requires rapid and accurate diagnosis to improve treatment outcomes and patient survival rates. Recent advancements in supervised deep learning have greatly improved the analysis of medical images. To mitigate the need for large amounts of expert-prepared segmentation data, we have developed a novel weakly supervised ICH segmentation method.
arXiv Detail & Related papers (2024-07-29T23:40:13Z)
SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch token they extract. We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness [47.9744734181236]
We explore the concept of Lipschitz continuity to certify the robustness of deep neural networks (DNNs) against adversarial attacks. We propose a novel algorithm that remaps the input domain into a constrained range, reducing the Lipschitz constant and potentially enhancing robustness. Our method achieves the best robust accuracy for CIFAR10, CIFAR100, and ImageNet datasets on the RobustBench leaderboard.
arXiv Detail & Related papers (2024-06-28T03:10:36Z)
Uncertainty modeling for fine-tuned implicit functions [10.902709236602536]
Implicit functions have become pivotal in computer vision for reconstructing detailed object shapes from sparse views. We introduce Dropsembles, a novel method for uncertainty estimation in tuned implicit functions. Our results show that Dropsembles achieve the accuracy and calibration levels of deep ensembles but with significantly less computational cost.
arXiv Detail & Related papers (2024-06-17T20:46:18Z)
Applying Dimensionality Reduction as Precursor to LSTM-CNN Models for Classifying Imagery and Motor Signals in ECoG-Based BCIs [0.0]
This research aims to elevate the field by optimizing motor imagery classification algorithms within Brain-Computer Interfaces (BCIs) We utilize unsupervised techniques for dimensionality reduction, namely Uniform Manifold Approximation and Projection (UMAP) and K-Nearest Neighbors (KNN) We also evaluate the necessity of employing supervised methods such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs) for classification tasks.
arXiv Detail & Related papers (2023-11-22T16:34:06Z)
GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration [19.51978172091416]
Intracerebral Hemorrhage (ICH) is a severe condition resulting from damaged brain blood vessel ruptures. This paper introduces a novel deep learning algorithm, GCS-ICHNet, which integrates multimodal brain CT data and the Glasgow Coma Scale score to improve prognosis.
arXiv Detail & Related papers (2023-11-08T15:51:12Z)
Assessing and Enhancing Robustness of Deep Learning Models with Corruption Emulation in Digital Pathology [9.850335454350367]
We analyze the physical causes of the full-stack corruptions throughout the pathological life-cycle. We construct three OmniCE-corrupted benchmark datasets at both patch level and slide level. We explore to use the OmniCE-corrupted datasets as augmentation data for training and experiments to verify that the generalization ability of the models has been significantly enhanced.
arXiv Detail & Related papers (2023-10-31T12:59:36Z)
Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint. We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
A Survey on the Robustness of Computer Vision Models against Common Corruptions [3.6486148851646063]
Computer vision models are susceptible to changes in input images caused by sensor errors or extreme imaging environments. These corruptions can significantly hinder the reliability of these models when deployed in real-world scenarios. We present a comprehensive overview of methods that improve the robustness of computer vision models against common corruptions.
arXiv Detail & Related papers (2023-05-10T10:19:31Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can be rather from some biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
Towards to Robust and Generalized Medical Image Segmentation Framework [17.24628770042803]
We propose a novel two-stage framework for robust generalized segmentation. In particular, an unsupervised Tile-wise AutoEncoder (T-AE) pretraining architecture is coined to learn meaningful representation. Experiments of lung segmentation on multi chest X-ray datasets are conducted.
arXiv Detail & Related papers (2021-08-09T05:58:49Z)
On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading. We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions. Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adrial robustness has become an emerging challenge for neural network owing to its over-sensitivity to small input perturbations. We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $textitslow start, fast decay$ learning rate scheduling strategy. Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents. One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
An Uncertainty-Driven GCN Refinement Strategy for Organ Segmentation [53.425900196763756]
We propose a segmentation refinement method based on uncertainty analysis and graph convolutional networks. We employ the uncertainty levels of the convolutional network in a particular input volume to formulate a semi-supervised graph learning problem. We show that our method outperforms the state-of-the-art CRF refinement method by improving the dice score by 1% for the pancreas and 2% for spleen.
arXiv Detail & Related papers (2020-12-06T18:55:07Z)
Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation. Our approach allows to train image segmentation models without the need to acquire expensive annotations. We test our proposed method on Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
Towards Transferable Adversarial Attack against Deep Face Recognition [58.07786010689529]
Deep convolutional neural networks (DCNNs) have been found to be vulnerable to adversarial examples. transferable adversarial examples can severely hinder the robustness of DCNNs. We propose DFANet, a dropout-based method used in convolutional layers, which can increase the diversity of surrogate models. We generate a new set of adversarial face pairs that can successfully attack four commercial APIs without any queries.
arXiv Detail & Related papers (2020-04-13T06:44:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.