SegICL: A Multimodal In-context Learning Framework for Enhanced Segmentation in Medical Imaging
- URL: http://arxiv.org/abs/2403.16578v4
- Date: Thu, 30 May 2024 03:35:06 GMT
- Title: SegICL: A Multimodal In-context Learning Framework for Enhanced Segmentation in Medical Imaging
- Authors: Lingdong Shen, Fangxin Shang, Xiaoshuang Huang, Yehui Yang, Haifeng Huang, Shiming Xiang,
- Abstract summary: We introduce SegICL, a novel approach leveraging In-Context Learning (ICL) for image segmentation.
SegICL has the capability to employ text-guided segmentation and conduct in-context learning with a small set of image-mask pairs.
The performance of segmentation when provided thre-shots is approximately 1.5 times better than the performance in a zero-shot setting.
- Score: 24.32438479339158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of medical image segmentation, tackling Out-of-Distribution (OOD) segmentation tasks in a cost-effective manner remains a significant challenge. Universal segmentation models is a solution, which aim to generalize across the diverse modality of medical images, yet their effectiveness often diminishes when applied to OOD data modalities and tasks, requiring intricate fine-tuning of model for optimal performance. Few-shot learning segmentation methods are typically designed for specific modalities of data and cannot be directly transferred for use with another modality. Therefore, we introduce SegICL, a novel approach leveraging In-Context Learning (ICL) for image segmentation. Unlike existing methods, SegICL has the capability to employ text-guided segmentation and conduct in-context learning with a small set of image-mask pairs, eliminating the need for training the model from scratch or fine-tuning for OOD tasks (including OOD modality and dataset). Extensive experimental demonstrates a positive correlation between the number of shots and segmentation performance on OOD tasks. The performance of segmentation when provided thre-shots is approximately 1.5 times better than the performance in a zero-shot setting. This indicates that SegICL effectively address new segmentation tasks based on contextual information. Additionally, SegICL also exhibits comparable performance to mainstream models on OOD and in-distribution tasks. Our code will be released after paper review.
Related papers
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z) - Cross Prompting Consistency with Segment Anything Model for Semi-supervised Medical Image Segmentation [44.54301473673582]
Semi-supervised learning (SSL) has achieved notable progress in medical image segmentation.
Recent developments in visual foundation models, such as the Segment Anything Model (SAM), have demonstrated remarkable adaptability.
We propose a cross-prompting consistency method with segment anything model (CPC-SAM) for semi-supervised medical image segmentation.
arXiv Detail & Related papers (2024-07-07T15:43:20Z) - Explore In-Context Segmentation via Latent Diffusion Models [132.26274147026854]
latent diffusion model (LDM) is an effective minimalist for in-context segmentation.
We build a new and fair in-context segmentation benchmark that includes both image and video datasets.
arXiv Detail & Related papers (2024-03-14T17:52:31Z) - SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation [53.010417880335424]
Semi-supervised temporal action segmentation (SS-TA) aims to perform frame-wise classification in long untrimmed videos.
Recent studies have shown the potential of contrastive learning in unsupervised representation learning using unlabelled data.
We propose a novel Semantic-guided Multi-level Contrast scheme with a Neighbourhood-Consistency-Aware unit (SMC-NCA) to extract strong frame-wise representations.
arXiv Detail & Related papers (2023-12-19T17:26:44Z) - Interpretable Small Training Set Image Segmentation Network Originated
from Multi-Grid Variational Model [5.283735137946097]
Deep learning (DL) methods have been proposed and widely used for image segmentation.
DL methods usually require a large amount of manually segmented data as training data and suffer from poor interpretability.
In this paper, we replace the hand-crafted regularity term in the MS model with a data adaptive generalized learnable regularity term.
arXiv Detail & Related papers (2023-06-25T02:34:34Z) - Exploring Open-Vocabulary Semantic Segmentation without Human Labels [76.15862573035565]
We present ZeroSeg, a novel method that leverages the existing pretrained vision-language model (VL) to train semantic segmentation models.
ZeroSeg overcomes this by distilling the visual concepts learned by VL models into a set of segment tokens, each summarizing a localized region of the target image.
Our approach achieves state-of-the-art performance when compared to other zero-shot segmentation methods under the same training data.
arXiv Detail & Related papers (2023-06-01T08:47:06Z) - IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised
Medical Image Segmentation [3.6748639131154315]
We extend the concept of metric learning to the segmentation task.
We propose a simple convolutional projection head for obtaining dense pixel-level features.
A bidirectional regularization mechanism involving two-stream regularization training is devised for the downstream task.
arXiv Detail & Related papers (2022-10-26T23:11:02Z) - Data-Limited Tissue Segmentation using Inpainting-Based Self-Supervised
Learning [3.7931881761831328]
Self-supervised learning (SSL) methods involving pretext tasks have shown promise in overcoming this requirement by first pretraining models using unlabeled data.
We evaluate the efficacy of two SSL methods (inpainting-based pretext tasks of context prediction and context restoration) for CT and MRI image segmentation in label-limited scenarios.
We demonstrate that optimally trained and easy-to-implement SSL segmentation models can outperform classically supervised methods for MRI and CT tissue segmentation in label-limited scenarios.
arXiv Detail & Related papers (2022-10-14T16:34:05Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet associated with a dedicated training scheme based on Local Adversarial Attacks (LAA)
We show it obtains top performances both in speed and accuracy when compared to ten recent methods of the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - Towards Robust Partially Supervised Multi-Structure Medical Image
Segmentation on Small-Scale Data [123.03252888189546]
We propose Vicinal Labels Under Uncertainty (VLUU) to bridge the methodological gaps in partially supervised learning (PSL) under data scarcity.
Motivated by multi-task learning and vicinal risk minimization, VLUU transforms the partially supervised problem into a fully supervised problem by generating vicinal labels.
Our research suggests a new research direction in label-efficient deep learning with partial supervision.
arXiv Detail & Related papers (2020-11-28T16:31:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.