TP-UNet: Temporal Prompt Guided UNet for Medical Image Segmentation
- URL: http://arxiv.org/abs/2411.11305v2
- Date: Wed, 20 Nov 2024 02:24:26 GMT
- Title: TP-UNet: Temporal Prompt Guided UNet for Medical Image Segmentation
- Authors: Ranmin Wang, Limin Zhuang, Hongkun Chen, Boyan Xu, Ruichu Cai
- Abstract summary: Current UNet-based medical image segmentation approaches disregard the order of organs in scanned images.
We propose TP-UNet, which utilizes temporal prompts encompassing organ-construction relationships to guide the segmentation UNet model.
Our framework features cross-attention and semantic alignment based on unsupervised contrastive learning to combine temporal prompts and image features effectively.
- Score: 11.207258450032205
- License:
- Abstract: The advancement of medical image segmentation has been propelled by the adoption of deep learning techniques, particularly UNet-based approaches, which exploit semantic information to improve segmentation accuracy. However, current UNet-based medical image segmentation approaches disregard the order of organs in scanned images, and the inherent network structure of UNet provides no direct mechanism for integrating temporal information. To integrate temporal information efficiently, we propose TP-UNet, which utilizes temporal prompts encompassing organ-construction relationships to guide the segmentation UNet model. Specifically, our framework features cross-attention and semantic alignment based on unsupervised contrastive learning to combine temporal prompts and image features effectively. Extensive evaluations on two medical image segmentation datasets demonstrate the state-of-the-art performance of TP-UNet. Our implementation will be open-sourced after acceptance.
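The abstract describes two mechanisms: cross-attention that injects temporal prompt embeddings into UNet features, and an unsupervised contrastive alignment between the prompt and image representations. Below is a minimal PyTorch sketch of how such a fusion module and alignment loss could look; the module names, dimensions, and the use of MultiheadAttention are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch: cross-attention fusion of temporal prompt embeddings with UNet
# bottleneck features, plus a symmetric InfoNCE alignment loss. All names and
# shapes are assumptions for illustration, not TP-UNet's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptFusion(nn.Module):
    def __init__(self, feat_dim=256, prompt_dim=256, num_heads=8):
        super().__init__()
        self.proj = nn.Linear(prompt_dim, feat_dim)
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)

    def forward(self, img_feat, prompt_emb):
        # img_feat: (B, C, H, W) UNet features; prompt_emb: (B, T, prompt_dim) prompt tokens
        B, C, H, W = img_feat.shape
        q = img_feat.flatten(2).transpose(1, 2)        # (B, H*W, C) spatial queries
        kv = self.proj(prompt_emb)                     # (B, T, C) prompt keys/values
        fused, _ = self.attn(q, kv, kv)                # image queries attend to prompts
        return (q + fused).transpose(1, 2).reshape(B, C, H, W)

def alignment_loss(img_emb, prompt_emb, temperature=0.07):
    # Symmetric InfoNCE: matched image/prompt pairs in a batch are pulled together.
    img_emb = F.normalize(img_emb, dim=-1)
    prompt_emb = F.normalize(prompt_emb, dim=-1)
    logits = img_emb @ prompt_emb.t() / temperature
    labels = torch.arange(img_emb.size(0), device=img_emb.device)
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))
```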
Related papers
- A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation [12.948027961485536]
We propose a novel Weakly Supervised Semantic Segmentation (WSSS) approach that integrates structural guidance with text-driven strategies to generate high-quality pseudo labels.
Our method achieves state-of-the-art performance, highlighting its potential to improve diagnostic accuracy and efficiency in medical imaging.
arXiv Detail & Related papers (2024-11-19T16:20:27Z) - Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle-consistency objective via forward mapping classification (FMC) and reverse mapping regression (RMR).
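The MoVE description suggests view-specific experts whose features are merged. A hedged sketch of one generic way to combine frontal and lateral encoders with a learned gate follows; this is an assumption-driven mixture-of-experts illustration, not the Med-ST architecture.

```python
# Hedged sketch: gated fusion of frontal and lateral view encoders, a generic
# mixture-of-view-experts interpretation (not the Med-ST implementation).
import torch
import torch.nn as nn

class MixtureOfViewExperts(nn.Module):
    def __init__(self, frontal_encoder, lateral_encoder, dim):
        super().__init__()
        self.frontal_encoder = frontal_encoder
        self.lateral_encoder = lateral_encoder
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))

    def forward(self, frontal_img, lateral_img):
        f_front = self.frontal_encoder(frontal_img)          # (B, dim) frontal features
        f_lat = self.lateral_encoder(lateral_img)            # (B, dim) lateral features
        w = self.gate(torch.cat([f_front, f_lat], dim=-1))   # (B, 2) learned view weights
        return w[:, :1] * f_front + w[:, 1:] * f_lat
```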
arXiv Detail & Related papers (2024-05-30T03:15:09Z) - Real-time guidewire tracking and segmentation in intraoperative x-ray [52.51797358201872]
We propose a two-stage deep learning framework for real-time guidewire segmentation and tracking.
In the first stage, a YOLOv5 detector is trained, using the original X-ray images as well as synthetic ones, to output the bounding boxes of possible target guidewires.
In the second stage, a novel and efficient network is proposed to segment the guidewire in each detected bounding box.
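A hedged sketch of the generic detect-then-segment pattern described above follows: a detector proposes boxes, and a lightweight network segments each crop. The detector and segmentation interfaces are assumptions; the paper's own models and training details are not reproduced here.

```python
# Hedged sketch of a two-stage pipeline: stage 1 detects candidate guidewire
# boxes, stage 2 segments each cropped box. Interfaces are illustrative.
import torch
import torch.nn.functional as F

def detect_then_segment(frame, detector, seg_net, crop_size=(128, 128)):
    # frame: (1, 3, H, W) X-ray tensor; detector(frame) -> (N, 4) xyxy boxes;
    # seg_net(crop) -> (1, 1, h, w) binary-segmentation logits.
    boxes = detector(frame)                                        # stage 1
    full_mask = torch.zeros(frame.shape[-2:], device=frame.device)
    for x1, y1, x2, y2 in boxes.round().long().tolist():
        crop = F.interpolate(frame[..., y1:y2, x1:x2], size=crop_size,
                             mode="bilinear", align_corners=False)
        logits = seg_net(crop)                                     # stage 2
        mask = F.interpolate(logits, size=(y2 - y1, x2 - x1), mode="bilinear",
                             align_corners=False).sigmoid()[0, 0] > 0.5
        full_mask[y1:y2, x1:x2] = torch.maximum(full_mask[y1:y2, x1:x2], mask.float())
    return full_mask
```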
arXiv Detail & Related papers (2024-04-12T20:39:19Z) - PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation [9.744164910887223]
We introduce a novel framework equipped with prompt enhancement, called PE-MED, for interactive medical image segmentation.
First, we introduce a Self-Loop strategy to generate warm initial segmentation results based on the first prompt.
Second, we propose a novel Prompt Attention Learning Module (PALM) to mine useful prompt information in one interaction.
arXiv Detail & Related papers (2023-08-26T03:11:48Z) - Structure-aware registration network for liver DCE-CT images [50.28546654316009]
We propose a novel structure-aware registration method by incorporating structural information of related organs with a segmentation-guided deep registration network.
Our proposed method can achieve higher registration accuracy and preserve anatomical structure more effectively than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-08T14:08:56Z) - Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z) - Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis [7.568469725821069]
This paper presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images.
During fine-tuning, it introduces a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation.
These improvements are demonstrated across both longitudinal neurodegenerative adult and developing infant brain MRI and yield both higher performance and longitudinal consistency.
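One plausible form of the segmentation consistency regularization mentioned above is a symmetric divergence between predictions on two spatially aligned scans of the same subject. The sketch below is an interpretation under that assumption, not the paper's exact loss.

```python
# Hedged sketch: intra-subject consistency term between segmentation predictions
# at two timepoints of the same subject (assumed pre-aligned).
import torch.nn.functional as F

def intrasubject_consistency(logits_t1, logits_t2):
    logp1 = F.log_softmax(logits_t1, dim=1)
    logp2 = F.log_softmax(logits_t2, dim=1)
    p1, p2 = logp1.exp(), logp2.exp()
    # symmetric KL divergence between the two timepoints' soft predictions
    return 0.5 * (F.kl_div(logp1, p2, reduction="batchmean")
                  + F.kl_div(logp2, p1, reduction="batchmean"))
```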
arXiv Detail & Related papers (2022-06-09T05:17:00Z) - Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Segmentation [46.678279106837294]
We propose a cross-level contrastive learning scheme to enhance the representation capacity of local features in semi-supervised medical image segmentation.
With the help of the cross-level contrastive learning and consistency constraint, the unlabelled data can be effectively explored to improve segmentation performance.
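The combination described above, contrastive learning on local features plus a consistency constraint on unlabelled data, could be sketched as follows. The patch-level InfoNCE and the mean-squared consistency term are generic stand-ins, not the paper's exact formulation.

```python
# Hedged sketch: patch-level InfoNCE on two augmented views plus a prediction
# consistency term for unlabelled images. Generic stand-ins for illustration.
import torch
import torch.nn.functional as F

def local_infonce(feat_a, feat_b, temperature=0.1):
    # feat_a, feat_b: (N, D) embeddings of the same N local patches under two views
    a, b = F.normalize(feat_a, dim=-1), F.normalize(feat_b, dim=-1)
    logits = a @ b.t() / temperature
    labels = torch.arange(a.size(0), device=a.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

def unlabeled_consistency(logits_view1, logits_view2):
    # encourage the two augmented views of an unlabelled image to predict the same mask
    return F.mse_loss(F.softmax(logits_view1, dim=1), F.softmax(logits_view2, dim=1))
```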
arXiv Detail & Related papers (2022-02-08T15:12:11Z) - Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance the discriminability of the deep embedding to encourage clustering of features of the same class.
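A common way to realise episodic few-shot segmentation with a discriminative embedding is prototype matching plus an intra-class compactness term; the sketch below uses that generic recipe as an assumption, not the paper's global correlation network.

```python
# Hedged sketch: prototype-based few-shot segmentation with a simple intra-class
# compactness term. A generic recipe, not the paper's global correlation network.
import torch
import torch.nn.functional as F

def prototype_similarity(support_feat, support_mask, query_feat):
    # support_feat, query_feat: (C, H, W); support_mask: (H, W) binary foreground mask
    proto = (support_feat * support_mask).sum(dim=(1, 2)) / (support_mask.sum() + 1e-6)
    return F.cosine_similarity(query_feat, proto[:, None, None], dim=0)  # (H, W) score map

def compactness_loss(feat, mask):
    # pull same-class embeddings toward their class prototype to sharpen clustering
    proto = (feat * mask).sum(dim=(1, 2)) / (mask.sum() + 1e-6)
    diff = (feat - proto[:, None, None]) * mask
    return (diff ** 2).sum() / (mask.sum() + 1e-6)
```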
arXiv Detail & Related papers (2020-12-10T04:01:07Z)