Few-Shot Medical Image Segmentation with High-Fidelity Prototypes
- URL: http://arxiv.org/abs/2406.18074v1
- Date: Wed, 26 Jun 2024 05:06:14 GMT
- Title: Few-Shot Medical Image Segmentation with High-Fidelity Prototypes
- Authors: Song Tang, Shaxu Yan, Xiaozhi Qi, Jianxin Gao, Mao Ye, Jianwei Zhang, Xiatian Zhu
- Abstract summary: We propose a novel Detail Self-refined Prototype Network (DSPNet) to construct high-fidelity prototypes representing the object foreground and the background more comprehensively.
To construct global semantics while maintaining the captured detail semantics, we learn the foreground prototypes by modelling the multi-modal structures with clustering and then fusing each in a channel-wise manner.
- Score: 38.073371773707514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot Semantic Segmentation (FSS) aims to adapt a pretrained model to new classes with as few as a single labelled training sample per class. Although prototype-based approaches have achieved substantial success, existing models are limited to imaging scenarios with considerably distinct objects and not highly complex backgrounds, e.g., natural images. This makes such models suboptimal for medical imaging, where neither condition holds. To address this problem, we propose a novel Detail Self-refined Prototype Network (DSPNet) to construct high-fidelity prototypes representing the object foreground and the background more comprehensively. Specifically, to construct global semantics while maintaining the captured detail semantics, we learn the foreground prototypes by modelling the multi-modal structures with clustering and then fusing each in a channel-wise manner. Considering that the background often has no apparent semantic relation in the spatial dimensions, we integrate channel-specific structural information under sparse channel-aware regulation. Extensive experiments on three challenging medical image benchmarks show the superiority of DSPNet over previous state-of-the-art methods.
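For readers unfamiliar with the prototype-based approach the abstract builds on, the following is a minimal NumPy sketch of the common baseline: one foreground and one background prototype obtained by masked average pooling over support features, then per-pixel cosine matching on the query. It is not DSPNet; the clustering, channel-wise fusion, and sparse channel-aware regulation described above are not reproduced. The function names (`masked_average_pooling`, `cosine_similarity_map`, `segment_query`) and the `(C, H, W)` feature layout are assumptions made for illustration.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the (C, H, W) features over pixels where the (H, W) mask is 1."""
    mask = mask.astype(features.dtype)
    total = (features * mask[None]).sum(axis=(1, 2))
    # Small epsilon avoids division by zero for an empty mask.
    return total / (mask.sum() + 1e-8)


def cosine_similarity_map(features, prototype):
    """Cosine similarity between every pixel feature and a (C,) prototype."""
    feature_norm = np.linalg.norm(features, axis=0) + 1e-8
    prototype_norm = np.linalg.norm(prototype) + 1e-8
    dot = np.einsum("chw,c->hw", features, prototype)
    return dot / (feature_norm * prototype_norm)


def segment_query(support_feat, support_mask, query_feat):
    """Label each query pixel by its closer prototype (1 = foreground)."""
    fg_proto = masked_average_pooling(support_feat, support_mask)
    bg_proto = masked_average_pooling(support_feat, 1 - support_mask)
    fg_sim = cosine_similarity_map(query_feat, fg_proto)
    bg_sim = cosine_similarity_map(query_feat, bg_proto)
    return (fg_sim > bg_sim).astype(np.uint8)


if __name__ == "__main__":
    # Toy features: channel 0 fires on foreground, channel 1 on background.
    support_mask = np.zeros((4, 4), dtype=np.uint8)
    support_mask[:, :2] = 1
    support_feat = np.stack([support_mask, 1 - support_mask]).astype(np.float32)

    query_mask = np.zeros((4, 4), dtype=np.uint8)
    query_mask[:2, :] = 1
    query_feat = np.stack([query_mask, 1 - query_mask]).astype(np.float32)

    prediction = segment_query(support_feat, support_mask, query_feat)
    print(np.array_equal(prediction, query_mask))
```

A single averaged prototype per class discards within-class detail, which is the limitation the abstract attributes to existing models when objects are not clearly distinct and the background is complex.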
Related papers
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z)
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
Transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- Few-shot Image Generation via Information Transfer from the Built Geodesic Surface [2.617962830559083]
We propose a method called Information Transfer from the Built Geodesic Surface (ITBGS).
With the FAGS module, a pseudo-source domain is created by projecting image features from the training dataset into the Pre-Shape Space.
We demonstrate that the proposed method consistently achieves optimal or comparable results across a diverse range of semantically distinct datasets.
arXiv Detail & Related papers (2024-01-03T13:57:09Z)
- PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning [35.39822183728463]
We present a novel Prompt-IML framework for detecting tampered images.
Humans tend to discern authenticity of an image based on semantic and high-frequency information.
Our model can achieve better performance on eight typical fake image datasets.
arXiv Detail & Related papers (2024-01-01T03:45:07Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- ProMISe: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models [13.08275555017179]
We propose ProMISe, a prompt-driven 3D medical image segmentation model using only a single point prompt.
We evaluate our model on two public datasets for colon and pancreas tumor segmentations.
arXiv Detail & Related papers (2023-10-30T16:49:03Z)
- MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset [14.823114726604853]
We propose a novel self-supervised learning strategy named Volume Fusion (VF) for pretraining 3D segmentation models.
VF forces the model to predict the fusion coefficient of each voxel, which is formulated as a self-supervised segmentation task without manual annotations.
Experiments with different downstream segmentation targets, including head and neck organs and thoracic/abdominal organs, showed that our pretrained model largely outperformed training from scratch.
arXiv Detail & Related papers (2023-06-29T13:22:13Z)
- Interpretable Small Training Set Image Segmentation Network Originated from Multi-Grid Variational Model [5.283735137946097]
Deep learning (DL) methods have been proposed and widely used for image segmentation.
DL methods usually require a large amount of manually segmented data as training data and suffer from poor interpretability.
In this paper, we replace the hand-crafted regularity term in the MS model with a data adaptive generalized learnable regularity term.
arXiv Detail & Related papers (2023-06-25T02:34:34Z)
- SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification due to incomplete construction of sample pairs with the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.