Show and Segment: Universal Medical Image Segmentation via In-Context Learning
- URL: http://arxiv.org/abs/2503.19359v1
- Date: Tue, 25 Mar 2025 05:26:10 GMT
- Title: Show and Segment: Universal Medical Image Segmentation via In-Context Learning
- Authors: Yunhe Gao, Di Liu, Zhuowei Li, Yunsheng Li, Dongdong Chen, Mu Zhou, Dimitris N. Metaxas
- Abstract summary: We present Iris, a novel In-context Reference Image guided Segmentation framework for medical image segmentation. At its core, Iris features a lightweight context task encoding module that distills task-specific information from reference context image-label pairs. By decoupling task encoding from inference, Iris supports diverse strategies from one-shot inference and context example ensemble to object-level context example retrieval and in-context tuning.
- Score: 43.494896215216684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image segmentation remains challenging due to the vast diversity of anatomical structures, imaging modalities, and segmentation tasks. While deep learning has made significant advances, current approaches struggle to generalize as they require task-specific training or fine-tuning on unseen classes. We present Iris, a novel In-context Reference Image guided Segmentation framework that enables flexible adaptation to novel tasks through the use of reference examples without fine-tuning. At its core, Iris features a lightweight context task encoding module that distills task-specific information from reference context image-label pairs. This rich context embedding information is used to guide the segmentation of target objects. By decoupling task encoding from inference, Iris supports diverse strategies from one-shot inference and context example ensemble to object-level context example retrieval and in-context tuning. Through comprehensive evaluation across twelve datasets, we demonstrate that Iris performs strongly compared to task-specific models on in-distribution tasks. On seven held-out datasets, Iris shows superior generalization to out-of-distribution data and unseen classes. Further, Iris's task encoding module can automatically discover anatomical relationships across datasets and modalities, offering insights into medical objects without explicit anatomical supervision.
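The abstract describes the core mechanism at a high level: a task encoding module distills an embedding from reference image-label pairs, and that embedding guides segmentation of the target image. The paper does not give implementation details here, but the general in-context pattern can be sketched with a simple prototype-based stand-in (masked average pooling over reference features, cosine similarity to label query pixels). All function names, the feature shapes, and the pooling/similarity choices below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def encode_task(ref_feats, ref_mask):
    """Distill a task embedding from one reference feature map (H, W, C)
    and its binary mask (H, W) via masked average pooling -- a toy
    stand-in for Iris's learned context task encoding module."""
    fg = ref_feats[ref_mask.astype(bool)]  # (N_fg, C) foreground features
    return fg.mean(axis=0)                 # (C,) task embedding

def ensemble_task(ref_pairs):
    """Context example ensemble (assumed here to be a simple average
    of per-reference embeddings over several image-label pairs)."""
    return np.mean([encode_task(f, m) for f, m in ref_pairs], axis=0)

def segment_with_context(query_feats, task_emb, threshold=0.5):
    """Label each query pixel by cosine similarity to the task embedding."""
    q = query_feats / np.linalg.norm(query_feats, axis=-1, keepdims=True)
    t = task_emb / np.linalg.norm(task_emb)
    sim = q @ t                            # (H, W) similarity map
    return (sim > threshold).astype(np.uint8)

# Toy example: 4x4 "feature maps" with 2 channels.
rng = np.random.default_rng(0)
ref_feats = rng.normal(size=(4, 4, 2))
ref_mask = np.zeros((4, 4), dtype=np.uint8)
ref_mask[:2, :2] = 1
task_emb = ensemble_task([(ref_feats, ref_mask)])
pred = segment_with_context(ref_feats, task_emb)
```

Because the task embedding is computed independently of the query, the same pre-computed embedding can be reused across many target images, which is what makes the ensemble and retrieval strategies mentioned in the abstract cheap at inference time.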
Related papers
- A Simple Image Segmentation Framework via In-Context Examples [59.319920526160466]
We present SINE, a simple image segmentation framework utilizing in-context examples.
We introduce an In-context Interaction module to complement in-context information and produce correlations between the target image and the in-context example.
Experiments on various segmentation tasks show the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-10-07T08:59:05Z)
- Visual Prompt Selection for In-Context Learning Segmentation [77.15684360470152]
In this paper, we focus on rethinking and improving the example selection strategy.
We first demonstrate that ICL-based segmentation models are sensitive to different contexts.
Furthermore, empirical evidence indicates that the diversity of contextual prompts plays a crucial role in guiding segmentation.
arXiv Detail & Related papers (2024-07-14T15:02:54Z)
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
- Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation [19.987706084203523]
We propose Panoptic Perception, a novel task and a new fine-grained dataset (FineGrip) to achieve a more thorough and universal interpretation for RSIs.
The new task integrates pixel-level, instance-level, and image-level information for universal image perception.
FineGrip dataset includes 2,649 remote sensing images, 12,054 fine-grained instance segmentation masks belonging to 20 foreground things categories, 7,599 background semantic masks for 5 stuff classes and 13,245 captioning sentences.
arXiv Detail & Related papers (2024-04-06T12:27:21Z)
- Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning [65.54680361074882]
Eye-gaze Guided Multi-modal Alignment (EGMA) framework harnesses eye-gaze data for better alignment of medical visual and textual features.
We conduct downstream tasks of image classification and image-text retrieval on four medical datasets.
arXiv Detail & Related papers (2024-03-19T03:59:14Z)
- Kartezio: Evolutionary Design of Explainable Pipelines for Biomedical Image Analysis [0.0]
We introduce Kartezio, a computational strategy that generates transparent and easily interpretable image processing pipelines.
The pipelines thus generated exhibit comparable precision to state-of-the-art Deep Learning approaches on instance segmentation tasks.
We also deployed Kartezio to solve semantic and instance segmentation problems in four real-world Use Cases.
arXiv Detail & Related papers (2023-02-28T17:02:35Z)
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.