Related papers: RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation

RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation

URL: http://arxiv.org/abs/2512.02188v1
Date: Mon, 01 Dec 2025 20:31:05 GMT
Title: RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation
Authors: Mansoor Ali, Maksim Richards, Gilberto Ochoa-Ruiz, Sharib Ali,
Abstract summary: Current literature for tackling generalisation on out-of-distribution data and domain gaps due to modality changes has been widely researched but mostly for natural scene data.<n>Inspired by these works in natural scenes to push generalisability on OOD data, we hypothesise that exploiting the style and content information in the surgical scenes could minimise the appearances.
Score: 7.9220056699521875
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While recent advances in deep learning for surgical scene segmentation have demonstrated promising results on single-centre and single-imaging modality data, these methods usually do not generalise to unseen distribution (i.e., from other centres) and unseen modalities. Current literature for tackling generalisation on out-of-distribution data and domain gaps due to modality changes has been widely researched but mostly for natural scene data. However, these methods cannot be directly applied to the surgical scenes due to limited visual cues and often extremely diverse scenarios compared to the natural scene data. Inspired by these works in natural scenes to push generalisability on OOD data, we hypothesise that exploiting the style and content information in the surgical scenes could minimise the appearances, making it less variable to sudden changes such as blood or imaging artefacts. This can be achieved by performing instance normalisation and feature covariance mapping techniques for robust and generalisable feature representations. Further, to eliminate the risk of removing salient feature representation associated with the objects of interest, we introduce a restitution module within the feature learning ResNet backbone that can enable the retention of useful task-relevant features. To tackle the lack of multiclass and multicentre data for surgical scene segmentation, we also provide a newly curated dataset that can be vital for addressing generalisability in this domain. Our proposed RobustSurg obtained nearly 23% improvement on the baseline DeepLabv3+ and from 10-32% improvement on the SOTA in terms of mean IoU score on an unseen centre HeiCholSeg dataset when trained on CholecSeg8K. Similarly, RobustSurg also obtained nearly 22% improvement over the baseline and nearly 11% improvement on a recent SOTA method for the target set of the EndoUDA polyp dataset.

Related papers

A Dataset for Semantic Segmentation in the Presence of Unknowns [49.795683850385956]
Existing datasets allow evaluation of only knowns or unknowns - but not both.<n>We propose a novel anomaly segmentation dataset, ISSU, that features a diverse set of anomaly inputs from cluttered real-world environments.<n>The dataset is twice larger than existing anomaly segmentation datasets.
arXiv Detail & Related papers (2025-03-28T10:31:01Z)
Tackling domain generalization for out-of-distribution endoscopic imaging [1.6377635288143584]
We exploit both style and content information in images to preserve robust and generalizable feature representations. Our proposed method shows a 13.7% improvement over the baseline DeepLabv3+ and nearly an 8% improvement over recent state-of-the-art (SOTA) methods for the target (different modality) set of the EndoUDA polyp dataset.
arXiv Detail & Related papers (2024-10-18T18:45:13Z)
Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present first analysis of state-of-the-art semantic segmentation models when faced with geometric out-of-distribution data. We propose an augmentation technique called "Organ Transplantation" to enhance generalizability. Our augmentation technique improves SOA model performance by up to 67 % for RGB data and 90 % for HSI data, achieving performance at the level of in-distribution performance on real OOD test data.
arXiv Detail & Related papers (2024-08-27T19:13:15Z)
Dual-Reference Source-Free Active Domain Adaptation for Nasopharyngeal Carcinoma Tumor Segmentation across Multiple Hospitals [9.845637899896365]
Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant malignancy that predominantly impacts the head and neck area. We propose a novel Sourece-Free Active Domain Adaptation (SFADA) framework to facilitate domain adaptation for the Gross Tumor Volume (GTV) segmentation task. We collect a large-scale clinical dataset comprising 1057 NPC patients from five hospitals to validate our approach.
arXiv Detail & Related papers (2023-09-23T15:26:27Z)
Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos [54.1843419649895]
State-of-the-art models, which perform this task well on a particular dataset, perform poorly when tested on another dataset. We propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method called the Barlow Adaptor. In addition, we introduce a novel loss called the Barlow Feature Alignment Loss (BFAL) which aligns features across different domains.
arXiv Detail & Related papers (2023-07-31T18:14:18Z)
Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
Domain Generalization with Adversarial Intensity Attack for Medical Image Segmentation [27.49427483473792]
In real-world scenarios, it is common for models to encounter data from new and different domains to which they were not exposed to during training. domain generalization (DG) is a promising direction as it enables models to handle data from previously unseen domains. We introduce a novel DG method called Adversarial Intensity Attack (AdverIN), which leverages adversarial training to generate training data with an infinite number of styles.
arXiv Detail & Related papers (2023-04-05T19:40:51Z)
Semantic segmentation of surgical hyperspectral images under geometric domain shifts [69.91792194237212]
We present the first analysis of state-of-the-art semantic segmentation networks in the presence of geometric out-of-distribution (OOD) data. We also address generalizability with a dedicated augmentation technique termed "Organ Transplantation" Our scheme improves on the SOA DSC by up to 67 % (RGB) and 90 % (HSI) and renders performance on par with in-distribution performance on real OOD test data.
arXiv Detail & Related papers (2023-03-20T09:50:07Z)
Multi-scale Feature Alignment for Continual Learning of Unlabeled Domains [3.9498537297431167]
generative feature-driven image replay in conjunction with a dual-purpose discriminator enables the generation of images with realistic features for replay. We present detailed ablation experiments studying our proposed method components and demonstrate a possible use-case of our continual UDA method for an unsupervised patch-based segmentation task.
arXiv Detail & Related papers (2023-02-02T18:19:01Z)
AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation [1.0452185327816181]
We propose a data manipulation based domain generalization method, called Automated Augmentation for Domain Generalization (AADG) Our AADG framework can effectively sample data augmentation policies that generate novel domains. Our proposed AADG exhibits state-of-the-art generalization performance and outperforms existing approaches.
arXiv Detail & Related papers (2022-07-27T02:26:01Z)
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community. The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.