DIAMANT: Dual Image-Attention Map Encoders For Medical Image
Segmentation
- URL: http://arxiv.org/abs/2304.14571v1
- Date: Fri, 28 Apr 2023 00:11:18 GMT
- Title: DIAMANT: Dual Image-Attention Map Encoders For Medical Image
Segmentation
- Authors: Yousef Yeganeh, Azade Farshad, Peter Weinberger, Seyed-Ahmad Ahmadi,
Ehsan Adeli, Nassir Navab
- Abstract summary: We show that by taking advantage of the attention map visualizations obtained from a self-supervised pretrained vision transformer network (e.g., DINO), one can outperform complex transformer-based networks at a much lower computational cost.
The results of our experiments on two publicly available medical imaging datasets show that the proposed pipeline outperforms U-Net and the state-of-the-art medical image segmentation models.
- Score: 46.19060502876747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although purely transformer-based architectures have shown promising performance
in many computer vision tasks, many hybrid models consisting of CNN and
transformer blocks have been introduced to fit more specialized tasks. Nevertheless,
despite the performance gain of both pure and hybrid transformer-based
architectures compared to CNNs in medical imaging segmentation, their high
training cost and complexity make it challenging to use them in real scenarios.
In this work, we propose simple architectures based on purely convolutional
layers, and show that by just taking advantage of the attention map
visualizations obtained from a self-supervised pretrained vision transformer
network (e.g., DINO), one can outperform complex transformer-based networks at a
much lower computational cost. The proposed architecture is composed of two
encoder branches with the original image as input in one branch and the
attention map visualizations of the same image from multiple self-attention
heads from a pre-trained DINO model (as multiple channels) in the other branch.
The results of our experiments on two publicly available medical imaging
datasets show that the proposed pipeline outperforms U-Net and the
state-of-the-art medical image segmentation models.
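As a rough illustration of the second encoder branch's input described in the abstract, the sketch below turns per-head attention weights into image-sized maps and stacks them as channels. It is not the authors' code: it assumes the maps come from the CLS token's attention over the patch tokens, that they are upsampled by nearest-neighbour repetition, and that the image size is a multiple of the patch size. All function names are hypothetical, and plain nested lists stand in for tensors.

```python
def cls_attention_to_maps(cls_attention, grid_h, grid_w):
    """Reshape each head's CLS-to-patch weights into a grid_h x grid_w map.

    cls_attention: one list per head, of length 1 + grid_h * grid_w,
    where index 0 is the CLS-to-CLS weight (assumed layout).
    """
    maps = []
    for head_weights in cls_attention:
        patch_weights = head_weights[1:]  # drop the CLS-to-CLS entry
        if len(patch_weights) != grid_h * grid_w:
            raise ValueError("attention length does not match the patch grid")
        maps.append([patch_weights[r * grid_w:(r + 1) * grid_w]
                     for r in range(grid_h)])
    return maps


def upsample_nearest(grid, factor):
    """Repeat every cell factor x factor times (nearest-neighbour upsampling)."""
    rows, cols = len(grid) * factor, len(grid[0]) * factor
    return [[grid[r // factor][c // factor] for c in range(cols)]
            for r in range(rows)]


def build_branch_inputs(image, cls_attention, patch_size):
    """Return (image_branch_input, attention_branch_input).

    image: channels x H x W nested lists; H and W are assumed to be
    multiples of patch_size. The attention input has one channel per head.
    """
    height, width = len(image[0]), len(image[0][0])
    grid_h, grid_w = height // patch_size, width // patch_size
    maps = cls_attention_to_maps(cls_attention, grid_h, grid_w)
    attention_input = [upsample_nearest(m, patch_size) for m in maps]
    return image, attention_input


# Toy example: a 3-channel 4x4 image, patch size 2, two attention heads.
image = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
cls_attention = [[0.2, 0.1, 0.3, 0.2, 0.2],
                 [0.6, 0.1, 0.1, 0.1, 0.1]]
image_in, attn_in = build_branch_inputs(image, cls_attention, patch_size=2)
print(len(image_in), len(attn_in), len(attn_in[0]), len(attn_in[0][0]))
```

Each branch input would then be passed to its own convolutional encoder. How the two feature streams are fused is not specified in the abstract, so the sketch leaves that step out.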
Related papers
- Rethinking Attention Gated with Hybrid Dual Pyramid Transformer-CNN for Generalized Segmentation in Medical Imaging [17.07490339960335]
We introduce a novel hybrid CNN-Transformer segmentation architecture (PAG-TransYnet) designed for efficiently building a strong CNN-Transformer encoder.
Our approach exploits attention gates within a Dual Pyramid hybrid encoder.
arXiv Detail & Related papers (2024-04-28T14:37:10Z)
- SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but is also capable of capturing long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z)
- 3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers [40.21263511313524]
Medical image segmentation plays a crucial role in advancing healthcare systems for disease diagnosis and treatment planning.
The u-shaped architecture, popularly known as U-Net, has proven highly successful for various medical image segmentation tasks.
To address U-Net's limited ability to model long-range dependencies, researchers have turned to Transformers, renowned for their global self-attention mechanisms.
arXiv Detail & Related papers (2023-10-11T18:07:19Z)
- MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- MISSFormer: An Effective Medical Image Segmentation Transformer [3.441872541209065]
CNN-based methods have achieved impressive results in medical image segmentation.
Transformer-based methods have recently become popular in vision tasks because of their capacity to model long-range dependencies.
We present MISSFormer, an effective and powerful Medical Image Segmentation tranSFormer.
arXiv Detail & Related papers (2021-09-15T08:56:00Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
The tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)