Related papers: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation

URL: http://arxiv.org/abs/2512.02497v1
Date: Tue, 02 Dec 2025 07:40:42 GMT
Title: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation
Authors: Wenjing Yu, Shuo Jiang, Yifei Chen, Shuo Chang, Yuanhan Wang, Beining Wu, Jie Dong, Mingxuan Liu, Shenghao Zhu, Feiwei Qin, Changmiao Wang, Qiyuan Tian
Abstract summary: Test time adaptation is a promising approach for mitigating domain shift in medical image segmentation. We present MedSeg-TTA, a comprehensive benchmark that examines twenty representative adaptation methods across seven imaging modalities.
Score: 18.147151439410383
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Test time adaptation is a promising approach for mitigating domain shift in medical image segmentation; however, current evaluations remain limited in terms of modality coverage, task diversity, and methodological consistency. We present MedSeg-TTA, a comprehensive benchmark that examines twenty representative adaptation methods across seven imaging modalities, including MRI, CT, ultrasound, pathology, dermoscopy, OCT, and chest X-ray, under fully unified data preprocessing, backbone configuration, and test time protocols. The benchmark encompasses four significant adaptation paradigms: Input-level Transformation, Feature-level Alignment, Output-level Regularization, and Prior Estimation, enabling the first systematic cross-modality comparison of their reliability and applicability. The results show that no single paradigm performs best in all conditions. Input-level methods are more stable under mild appearance shifts. Feature-level and Output-level methods offer greater advantages in boundary-related metrics, whereas prior-based methods exhibit strong modality dependence. Several methods degrade significantly under large inter-center and inter-device shifts, which highlights the importance of principled method selection for clinical deployment. MedSeg-TTA provides standardized datasets, validated implementations, and a public leaderboard, establishing a rigorous foundation for future research on robust, clinically reliable test-time adaptation. All source code and open-source datasets are available at https://github.com/wenjing-gg/MedSeg-TTA.

Related papers

MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis. It adapts pre-trained models to new tasks while refining their representational capacity. It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z)
Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation [17.49123106322442]
Test-time adaptation (TTA) adjusts a learned model using unlabeled test data. We incorporate morphological information and propose a framework based on multi-graph matching. Our method outperforms other state-of-the-art approaches on two medical image segmentation benchmarks.
arXiv Detail & Related papers (2025-03-17T10:11:11Z)
Test-Time Modality Generalization for Medical Image Segmentation [0.9092907230570326]
Generalizable medical image segmentation is essential for ensuring consistent performance across diverse unseen clinical settings. We introduce a novel Test-Time Modality Generalization (TTMG) framework, which comprises two core components: Modality-Aware Style Projection (MASP) and Modality-Sensitive Instance Whitening (MSIW). MASP estimates the likelihood of a test instance belonging to each seen modality and maps it onto a distribution using modality-specific style bases, guiding its projection effectively. MSIW is applied during training to selectively suppress modality-sensitive information while retaining modality-invariant features.
arXiv Detail & Related papers (2025-02-27T01:32:13Z)
PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation. Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process. Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation On Diverse Modalities [50.6382396309597]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. We present a complete and fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z)
Advancing UWF-SLO Vessel Segmentation with Source-Free Active Domain Adaptation and a Novel Multi-Center Dataset [11.494899967255142]
Accurate vessel segmentation in UWF-SLO images is crucial for diagnosing retinal diseases. Manually labeling high-resolution UWF-SLO images is an extremely challenging, time-consuming and expensive task. This study introduces a pioneering framework that leverages a patch-based active domain adaptation approach.
arXiv Detail & Related papers (2024-06-19T15:49:06Z)
Multi Task Consistency Guided Source-Free Test-Time Domain Adaptation Medical Image Segmentation [8.591386126583748]
Source-free test-time adaptation for medical image segmentation aims to enhance the adaptability of segmentation models to diverse test sets of the target domain. Ensuring consistency between target edges and paired inputs is crucial for test-time adaptation. We propose a multi task consistency guided source-free test-time domain adaptation medical image segmentation method.
arXiv Detail & Related papers (2023-10-18T07:49:24Z)
On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts. We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation. We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks. We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z)
Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency [24.78258331561847]
This paper presents a novel scheme to learn the mutual benefits of different modalities to achieve better segmentation results for unpaired medical images. We leverage a carefully designed External Attention Module (EAM) to align semantic class representations and their correlations of different modalities. We have demonstrated the effectiveness of the proposed method on two medical image segmentation scenarios.
arXiv Detail & Related papers (2022-06-21T17:50:29Z)
DLTTA: Dynamic Learning Rate for Test-time Adaptation on Cross-domain Medical Images [56.72015587067494]
We propose a novel dynamic learning rate adjustment method for test-time adaptation, called DLTTA. Our method achieves effective and fast test-time adaptation with consistent performance improvement over current state-of-the-art test-time adaptation methods.
arXiv Detail & Related papers (2022-05-27T02:34:32Z)
Cross-Domain Segmentation with Adversarial Loss and Covariate Shift for Biomedical Imaging [2.1204495827342438]
This manuscript aims to implement a novel model that can learn robust representations from cross-domain data by encapsulating distinct and shared patterns from different modalities. The tests on CT and MRI liver data acquired in routine clinical trials show that the proposed model outperforms all other baselines by a large margin.
arXiv Detail & Related papers (2020-06-08T07:35:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.