AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models
- URL: http://arxiv.org/abs/2505.17931v1
- Date: Fri, 23 May 2025 14:07:21 GMT
- Title: AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models
- Authors: Xingjian Li, Qifeng Wu, Colleen Que, Yiran Ding, Adithya S. Ubaradka, Jianhua Xing, Tianyang Wang, Min Xu,
- Abstract summary: This paper introduces a zero-shot and automatic segmentation pipeline that combines vision-language and segmentation foundation models.<n>By proper decomposition and test-time adaptation, our fully automatic pipeline performs competitively with weakly-prompted interactive foundation models.
- Score: 7.382887784956608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image segmentation is vital for clinical diagnosis, yet current deep learning methods often demand extensive expert effort, i.e., either through annotating large training datasets or providing prompts at inference time for each new case. This paper introduces a zero-shot and automatic segmentation pipeline that combines off-the-shelf vision-language and segmentation foundation models. Given a medical image and a task definition (e.g., "segment the optic disc in an eye fundus image"), our method uses a grounding model to generate an initial bounding box, followed by a visual prompt boosting module that enhance the prompts, which are then processed by a promptable segmentation model to produce the final mask. To address the challenges of domain gap and result verification, we introduce a test-time adaptation framework featuring a set of learnable adaptors that align the medical inputs with foundation model representations. Its hyperparameters are optimized via Bayesian Optimization, guided by a proxy validation model without requiring ground-truth labels. Our pipeline offers an annotation-efficient and scalable solution for zero-shot medical image segmentation across diverse tasks. Our pipeline is evaluated on seven diverse medical imaging datasets and shows promising results. By proper decomposition and test-time adaptation, our fully automatic pipeline performs competitively with weakly-prompted interactive foundation models.
Related papers
- PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained featured extractors.<n>Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images.<n>Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - Towards Universal Text-driven CT Image Segmentation [4.76971404389011]
We propose OpenVocabCT, a vision-language model pretrained on large-scale 3D CT images for universal text-driven segmentation.<n>We decompose the diagnostic reports into fine-grained, organ-level descriptions using large language models for multi-granular contrastive learning.
arXiv Detail & Related papers (2025-03-08T03:02:57Z) - Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation [30.524999223901645]
We propose an enhanced Segment Anything Model (SAM) framework that utilizes annotation-efficient prompts generated in a fully unsupervised fashion.<n>We adopt the direct preference optimization technique to design an optimal policy that enables the model to generate high-fidelity segmentations.<n>State-of-the-art performance of our framework in tasks such as lung segmentation, breast tumor segmentation, and organ segmentation across various modalities, including X-ray, ultrasound, and abdominal CT, justifies its effectiveness in low-annotation data scenarios.
arXiv Detail & Related papers (2025-03-06T17:28:48Z) - MedicoSAM: Towards foundation models for medical image segmentation [2.6579756198224347]
We show how to improve Segment Anything for medical images by comparing different finetuning strategies on a large and diverse dataset.<n>We find that the performance can be clearly improved for interactive segmentation.<n>Our best model, MedicoSAM, is publicly available at https://github.com/computational-cell-analytics/medico-sam.
arXiv Detail & Related papers (2025-01-20T20:40:28Z) - Prompting Segment Anything Model with Domain-Adaptive Prototype for Generalizable Medical Image Segmentation [49.5901368256326]
We propose a novel Domain-Adaptive Prompt framework for fine-tuning the Segment Anything Model (termed as DAPSAM) in segmenting medical images.
Our DAPSAM achieves state-of-the-art performance on two medical image segmentation tasks with different modalities.
arXiv Detail & Related papers (2024-09-19T07:28:33Z) - A Simple and Robust Framework for Cross-Modality Medical Image
Segmentation applied to Vision Transformers [0.0]
We propose a simple framework to achieve fair image segmentation of multiple modalities using a single conditional model.
We show that our framework outperforms other cross-modality segmentation methods on the Multi-Modality Whole Heart Conditional Challenge.
arXiv Detail & Related papers (2023-10-09T09:51:44Z) - Self-Supervised and Semi-Supervised Polyp Segmentation using Synthetic
Data [16.356954231068077]
Early detection of colorectal polyps is of utmost importance for their treatment and for colorectal cancer prevention.
Computer vision techniques have the potential to aid professionals in the diagnosis stage, where colonoscopies are manually carried out to examine the entirety of the patient's colon.
The main challenge in medical imaging is the lack of data, and a further challenge specific to polyp segmentation approaches is the difficulty of manually labeling the available data.
We propose an end-to-end model for polyp segmentation that integrates real and synthetic data to artificially increase the size of the datasets and aid the training when unlabeled samples are available.
arXiv Detail & Related papers (2023-07-22T09:57:58Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Generalize Ultrasound Image Segmentation via Instant and Plug & Play
Style Transfer [65.71330448991166]
Deep segmentation models generalize to images with unknown appearance.
Retraining models leads to high latency and complex pipelines.
We propose a novel method for robust segmentation under unknown appearance shifts.
arXiv Detail & Related papers (2021-01-11T05:45:30Z) - Few-shot Medical Image Segmentation using a Global Correlation Network
with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z) - Towards Unsupervised Learning for Instrument Segmentation in Robotic
Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows to train image segmentation models without the need to acquire expensive annotations.
We test our proposed method on Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.