PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation
- URL: http://arxiv.org/abs/2308.13746v1
- Date: Sat, 26 Aug 2023 03:11:48 GMT
- Title: PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation
- Authors: Ao Chang, Xing Tao, Xin Yang, Yuhao Huang, Xinrui Zhou, Jiajun Zeng,
Ruobing Huang, Dong Ni
- Abstract summary: We introduce a novel framework equipped with prompt enhancement, called PE-MED, for interactive medical image segmentation.
First, we introduce a Self-Loop strategy to generate warm initial segmentation results based on the first prompt.
Second, we propose a novel Prompt Attention Learning Module (PALM) to mine useful prompt information in one interaction.
- Score: 9.744164910887223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactive medical image segmentation refers to the accurate segmentation of
the target of interest through interaction (e.g., click) between the user and
the image. It has been widely studied in recent years as it is less dependent
on abundant annotated data and more flexible than fully automated segmentation.
However, current studies have not fully explored user-provided prompt
information (e.g., points), including the knowledge mined in one interaction,
and the relationship between multiple interactions. Thus, in this paper, we
introduce a novel framework equipped with prompt enhancement, called PE-MED,
for interactive medical image segmentation. First, we introduce a Self-Loop
strategy to generate warm initial segmentation results based on the first
prompt. It can prevent the highly unfavorable scenarios, such as encountering a
blank mask as the initial input after the first interaction. Second, we propose
a novel Prompt Attention Learning Module (PALM) to mine useful prompt
information in one interaction, enhancing the responsiveness of the network to
user clicks. Finally, we build a Time Series Information Propagation (TSIP)
mechanism to extract the temporal relationships between multiple interactions
and improve model stability. Comparative experiments with other
state-of-the-art (SOTA) medical image segmentation algorithms show that our
method exhibits better segmentation accuracy and stability.
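The abstract names three components but gives no implementation details. Below is a minimal, hypothetical PyTorch sketch of the interaction loop they imply: a prompt-attention step standing in for PALM, a recurrent state standing in for TSIP, and a Self-Loop warm start that feeds the first prediction back in place of a blank mask. Every module name, shape, and wiring choice here is an assumption for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class PromptAttention(nn.Module):
    """Hypothetical stand-in for PALM: image tokens attend to a click-prompt token."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, img_tokens, prompt_tokens):
        # Regions correlated with the user's clicks are amplified.
        enhanced, _ = self.attn(img_tokens, prompt_tokens, prompt_tokens)
        return img_tokens + enhanced  # residual keeps the original features

class InteractiveSegmenter(nn.Module):
    """Toy network: image + previous mask + click map in, mask logits out."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.encode = nn.Conv2d(3, dim, 3, padding=1)  # 3 ch: image, prev mask, clicks
        self.palm = PromptAttention(dim)               # assumed PALM analogue
        self.gru = nn.GRUCell(dim, dim)                # assumed TSIP analogue
        self.head = nn.Conv2d(dim, 1, 1)

    def forward(self, image, prev_mask, clicks, state):
        x = self.encode(torch.cat([image, prev_mask, clicks], dim=1))
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, HW, C)
        # Pool features at clicked locations into a single prompt token.
        w_click = clicks.flatten(2).transpose(1, 2)    # (B, HW, 1)
        prompt = (tokens * w_click).sum(1, keepdim=True) / (w_click.sum(1, keepdim=True) + 1e-6)
        tokens = self.palm(tokens, prompt)
        # Carry a compact state across interactions (temporal propagation).
        state = self.gru(tokens.mean(1), state)
        tokens = tokens + state.unsqueeze(1)
        x = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.head(x), state

net = InteractiveSegmenter()
image = torch.rand(1, 1, 64, 64)
clicks = torch.zeros(1, 1, 64, 64)
clicks[0, 0, 32, 32] = 1.0                             # one simulated user click
state = torch.zeros(1, 32)

# Self-Loop warm start: run once from a blank mask, then immediately feed the
# prediction back in as the "previous mask" before showing it to the user.
logits, state = net(image, torch.zeros_like(image), clicks, state)
logits, state = net(image, torch.sigmoid(logits), clicks, state)
```

Note how the last two lines realize the Self-Loop idea: the network never has to work from a blank previous mask, because its own first prediction is recycled as a warm initial result.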
Related papers
- TP-UNet: Temporal Prompt Guided UNet for Medical Image Segmentation [11.207258450032205]
Current UNet-based medical image segmentation approaches disregard the order of organs in scanned images.
We propose TP-UNet that utilizes temporal prompts, encompassing organ-construction relationships, to guide the segmentation UNet model.
Our framework uses cross-attention, with semantic alignment based on unsupervised contrastive learning, to combine temporal prompts and image features effectively.
arXiv Detail & Related papers (2024-11-18T06:01:00Z)
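TP-UNet's summary names two concrete mechanisms: cross-attention to fuse temporal prompts with image features, and semantic alignment via unsupervised contrastive learning. Below is a minimal sketch of both under assumed names (PromptFusion, alignment_loss) and shapes; the paper's actual design may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptFusion(nn.Module):
    """Cross-attention: image tokens query the temporal-prompt embeddings."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, img_tokens, prompt_emb):
        fused, _ = self.attn(img_tokens, prompt_emb, prompt_emb)
        return img_tokens + fused

def alignment_loss(img_vec, prompt_vec, temperature: float = 0.07):
    """InfoNCE-style loss aligning pooled image and prompt representations."""
    img_vec = F.normalize(img_vec, dim=-1)
    prompt_vec = F.normalize(prompt_vec, dim=-1)
    logits = img_vec @ prompt_vec.t() / temperature   # (B, B); matches on diagonal
    targets = torch.arange(len(img_vec))
    return F.cross_entropy(logits, targets)

fusion = PromptFusion(32)
img_tokens = torch.randn(8, 196, 32)    # 8 images, 196 patch tokens each
prompt_emb = torch.randn(8, 4, 32)      # 4 temporal-prompt tokens per image
fused = fusion(img_tokens, prompt_emb)
loss = alignment_loss(fused.mean(1), prompt_emb.mean(1))
```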
- ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image [4.076537350106898]
We present ScribblePrompt, a flexible neural-network-based interactive segmentation tool for biomedical imaging.
In a user study with domain experts, ScribblePrompt reduced annotation time by 28% while improving Dice by 15% compared to the next best method.
We showcase ScribblePrompt in an interactive demo, provide code, and release a dataset of scribble annotations at https://scribbleprompt.csail.mit.edu.
arXiv Detail & Related papers (2023-12-12T15:57:03Z)
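The ScribblePrompt entry quantifies its gains in Dice; for reference, this is the standard soft Dice coefficient such results are measured with (the standard formula, not code from the paper):

```python
import torch

def dice_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """Soft Dice: 2|P∩T| / (|P| + |T|), from 0 (disjoint) to 1 (identical)."""
    pred, target = pred.flatten(), target.flatten()
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

mask = torch.ones(16, 16)
print(dice_score(mask, mask))                  # tensor(1.0000)
print(dice_score(mask, torch.zeros(16, 16)))   # ~0: no overlap
```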
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- InterFormer: Real-time Interactive Image Segmentation [80.45763765116175]
Interactive image segmentation enables annotators to efficiently perform pixel-level annotation for segmentation tasks.
Existing interactive segmentation pipelines suffer from inefficient computation in their interactive models.
We propose a method named InterFormer that follows a new pipeline to address these issues.
arXiv Detail & Related papers (2023-04-06T08:57:00Z)
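The summary says only that InterFormer "follows a new pipeline" to avoid inefficient computation. One common way to reach real-time interactivity, sketched below as an assumption rather than InterFormer's actual design, is to run the heavy image encoder once per image and re-run only a lightweight head per click:

```python
import torch
import torch.nn as nn

heavy_encoder = nn.Sequential(                 # expensive: run once per image
    nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
)
light_head = nn.Conv2d(64 + 1, 1, 1)           # cheap: run once per click

image = torch.rand(1, 1, 128, 128)
with torch.no_grad():
    feats = heavy_encoder(image)               # cached across interactions

for y, x in [(40, 40), (80, 90)]:              # simulated user clicks
    clicks = torch.zeros(1, 1, 128, 128)
    clicks[0, 0, y, x] = 1.0
    with torch.no_grad():
        logits = light_head(torch.cat([feats, clicks], dim=1))  # fast update
```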
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
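The correction-learning entry describes its architecture concretely: a shared encoder feeding two independent decoders, one for segmentation and one for inpainting the lesion region. A minimal sketch with assumed layer sizes:

```python
import torch
import torch.nn as nn

class DualTaskNet(nn.Module):
    """Shared encoder, two independent decoders, as the entry describes."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        self.seg_decoder = nn.Conv2d(ch, 1, 1)      # predicts the lesion mask
        self.inpaint_decoder = nn.Conv2d(ch, 1, 1)  # reconstructs the masked region

    def forward(self, x):
        feats = self.encoder(x)                     # one feature map, two heads
        return self.seg_decoder(feats), self.inpaint_decoder(feats)

net = DualTaskNet()
seg_logits, recon = net(torch.rand(2, 1, 64, 64))
```

The inpainting head can be trained on unlabeled scans, which is presumably how the paradigm supplies a self-supervised signal in the semi-supervised setting.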
- Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
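BioViL-T's summary specifies a CNN-Transformer hybrid multi-image encoder. One plausible reading, sketched below with assumed sizes (the jointly trained text model is omitted), is a shared CNN per scan whose tokens a transformer then mixes across time points:

```python
import torch
import torch.nn as nn

class MultiImageEncoder(nn.Module):
    """CNN per image, then a transformer mixes tokens across time points."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(1, dim, 7, stride=4), nn.ReLU())
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, current, prior):
        tokens = []
        for img in (current, prior):            # shared CNN for both scans
            f = self.cnn(img)                   # (B, C, H', W')
            tokens.append(f.flatten(2).transpose(1, 2))
        x = torch.cat(tokens, dim=1)            # concatenate tokens across time
        return self.temporal(x)                 # temporally-aware features

enc = MultiImageEncoder()
feats = enc(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```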
- Interactive Segmentation for COVID-19 Infection Quantification on Longitudinal CT scans [40.721386089781895]
Consistent segmentation of COVID-19 patient's CT scans across multiple time points is essential to assess disease progression and response to therapy accurately.
Existing automatic and interactive segmentation models for medical images use data from only a single time point (static).
We propose a new single network model for interactive segmentation that fully utilizes all available past information to refine the segmentation of follow-up scans.
arXiv Detail & Related papers (2021-10-03T08:06:38Z)
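The longitudinal model "fully utilizes all available past information", but the summary does not say how. One simple, assumed realization is to stack the prior scan and its segmentation as extra input channels:

```python
import torch
import torch.nn as nn

# Hypothetical input construction: the network sees the current scan plus the
# previous time point's scan and mask, stacked as channels. The Conv2d is a
# toy stand-in for the full segmentation model.
net = nn.Conv2d(3, 1, 3, padding=1)

current_scan = torch.rand(1, 1, 96, 96)
prev_scan = torch.rand(1, 1, 96, 96)
prev_mask = (torch.rand(1, 1, 96, 96) > 0.5).float()

x = torch.cat([current_scan, prev_scan, prev_mask], dim=1)  # (1, 3, 96, 96)
logits = net(x)   # refined follow-up segmentation
```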
- MIDeepSeg: Minimally Interactive Segmentation of Unseen Objects from Medical Images Using Deep Learning [15.01235930304888]
We propose a novel deep learning-based interactive segmentation method that is highly efficient, requiring only clicks as user input.
Our proposed framework achieves accurate results with fewer user interactions and less time compared with state-of-the-art interactive frameworks.
arXiv Detail & Related papers (2021-04-25T14:15:17Z)
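MIDeepSeg takes only clicks as user input. The summary does not say how clicks are encoded, so the sketch below uses one common convention, a Gaussian rendered around each click and concatenated to the image, purely for illustration:

```python
import torch

def clicks_to_map(clicks, h, w, sigma: float = 5.0):
    """Render user clicks as a Gaussian guidance map (one common encoding)."""
    ys = torch.arange(h).view(h, 1).float()
    xs = torch.arange(w).view(1, w).float()
    gmap = torch.zeros(h, w)
    for cy, cx in clicks:
        d2 = (ys - cy) ** 2 + (xs - cx) ** 2
        gmap = torch.maximum(gmap, torch.exp(-d2 / (2 * sigma ** 2)))
    return gmap                                   # peaks at 1.0 on each click

guidance = clicks_to_map([(20, 30), (50, 10)], 64, 64)
image = torch.rand(1, 1, 64, 64)
net_input = torch.cat([image, guidance.view(1, 1, 64, 64)], dim=1)
```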
- Multi-Stage Fusion for One-Click Segmentation [20.00726292545008]
We propose a new multi-stage guidance framework for interactive segmentation.
Our proposed framework has a negligible increase in parameter count compared to early-fusion frameworks.
arXiv Detail & Related papers (2020-10-19T17:07:40Z)
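The one-click entry contrasts its multi-stage guidance with early fusion, where guidance enters only at the input. Below is a hypothetical sketch of injecting a resized guidance map at every stage; each injection costs only a handful of extra convolution weights, consistent with the "negligible increase in parameter count":

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiStageGuidanceNet(nn.Module):
    """Inject the (resized) guidance map at each stage, not just the input."""
    def __init__(self, ch: int = 16):
        super().__init__()
        self.stage1 = nn.Conv2d(1 + 1, ch, 3, stride=2, padding=1)
        self.stage2 = nn.Conv2d(ch + 1, ch, 3, stride=2, padding=1)
        self.head = nn.Conv2d(ch + 1, 1, 1)

    def forward(self, image, guidance):
        x = torch.relu(self.stage1(torch.cat([image, guidance], dim=1)))
        g = F.interpolate(guidance, size=x.shape[-2:])   # match stage resolution
        x = torch.relu(self.stage2(torch.cat([x, g], dim=1)))
        g = F.interpolate(guidance, size=x.shape[-2:])
        return self.head(torch.cat([x, g], dim=1))

net = MultiStageGuidanceNet()
out = net(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```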
- Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation [71.89867233426597]
In this paper, we aim to exploit the prior knowledge learned from one modality to improve the segmentation performance on another modality.
We propose a novel Mutual Knowledge Distillation scheme to thoroughly exploit the modality-shared knowledge.
Experimental results on the public multi-class cardiac segmentation data, i.e., MMWHS 2017, show that our method achieves large improvements on CT segmentation.
arXiv Detail & Related papers (2020-10-04T10:25:13Z)
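Mutual knowledge distillation conventionally has two networks learn from each other's softened predictions. The symmetric KL loss below follows that standard formulation with one network per modality; it is an assumption about the scheme, not the paper's code:

```python
import torch
import torch.nn.functional as F

def mutual_kd_loss(logits_ct, logits_mr, T: float = 2.0):
    """Symmetric KL between the two modality networks' softened outputs."""
    log_p_ct = F.log_softmax(logits_ct / T, dim=1)
    log_p_mr = F.log_softmax(logits_mr / T, dim=1)
    kl_ct = F.kl_div(log_p_ct, log_p_mr.exp(), reduction="batchmean")  # CT <- MR
    kl_mr = F.kl_div(log_p_mr, log_p_ct.exp(), reduction="batchmean")  # MR <- CT
    return (kl_ct + kl_mr) * T * T   # T^2 rescales gradients, as in Hinton et al.

# Toy class logits from the two networks: 4 voxels, 5 cardiac classes each.
loss = mutual_kd_loss(torch.randn(4, 5), torch.randn(4, 5))
```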
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)