TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism
for a Deep Segmentation Model
- URL: http://arxiv.org/abs/2207.13415v1
- Date: Wed, 27 Jul 2022 09:54:10 GMT
- Title: TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism
for a Deep Segmentation Model
- Authors: Reza Azad, Mohammad T. AL-Antary, Moein Heidari, Dorit Merhof
- Abstract summary: Convolutional neural networks (CNNs) have been the prevailing technique in medical image processing.
We propose TransNorm, a novel deep segmentation framework that integrates a Transformer module into both the encoder and the skip-connections of the standard U-Net.
- Score: 4.320393382724066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past few years, convolutional neural networks (CNNs), particularly
U-Net, have been the prevailing technique in medical image processing.
Specifically, the seminal U-Net, as well as its alternatives, have successfully
managed to address a wide variety of medical image segmentation tasks. However,
these architectures are intrinsically limited, as they fail to capture
long-range interactions and spatial dependencies, leading to a severe
performance drop in the segmentation of medical images with variable shapes and
structures. Transformers, originally proposed for sequence-to-sequence
prediction, have emerged as alternative architectures that precisely model global
information with the help of the self-attention mechanism. Despite being feasibly
designed, utilizing a pure Transformer for image segmentation purposes can
result in limited localization capacity stemming from inadequate low-level
features. Thus, a line of research strives to design robust variants of
Transformer-based U-Net. In this paper, we propose TransNorm, a novel deep
segmentation framework which integrates a Transformer module
into both the encoder and the skip-connections of the standard U-Net. We argue that the
expedient design of skip-connections can be crucial for accurate segmentation
as it can assist in feature fusion between the expanding and contracting paths.
In this respect, we derive a Spatial Normalization mechanism from the
Transformer module to adaptively recalibrate the skip connection path.
Extensive experiments across three typical tasks for medical image segmentation
demonstrate the effectiveness of TransNorm. The code and trained models are
publicly available at https://github.com/rezazad68/transnorm.
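The abstract describes deriving a spatial recalibration signal from a Transformer module and applying it to the skip-connection path, but it does not give the formulation. The following is a minimal sketch of that general idea only, not the authors' implementation. Everything in it is an assumption made for illustration: single-head attention with identity projections, a sigmoid over the channel mean as the per-position gate, and the hypothetical helper names `self_attention`, `spatial_gate`, and `recalibrate_skip`.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(dim)]
        )
    return attended


def spatial_gate(tokens):
    """One weight in (0, 1) per spatial position.

    Assumption: the gate is the sigmoid of the channel mean of each
    attended token; the paper may define its mechanism differently.
    """
    attended = self_attention(tokens)
    return [1.0 / (1.0 + math.exp(-sum(t) / len(t))) for t in attended]


def recalibrate_skip(skip_tokens, transformer_tokens):
    """Scale each skip-connection feature vector by its spatial gate."""
    gate = spatial_gate(transformer_tokens)
    return [[g * x for x in feat] for g, feat in zip(gate, skip_tokens)]


# Toy input: a 2x2 feature map flattened to 4 positions with 3 channels each.
skip = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [2.0, 0.0, -1.0], [1.0, 1.0, 1.0]]
trans = [[0.2, -0.1, 0.4], [1.0, 0.9, 1.1], [-0.5, -0.4, -0.6], [0.0, 0.1, -0.1]]
recalibrated = recalibrate_skip(skip, trans)
```

In this sketch the gate depends on all positions through the attention weights, so each skip feature is rescaled using global context rather than only its local neighborhood. For the actual architecture, refer to the code repository linked in the abstract.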
Related papers
- TransUKAN: Computing-Efficient Hybrid KAN-Transformer for Enhanced Medical Image Segmentation [5.280523424712006]
U-Net is currently the most widely used architecture for medical image segmentation.
We have improved the KAN to reduce memory usage and computational load.
This approach enhances the model's capability to capture nonlinear relationships.
arXiv Detail & Related papers (2024-09-23T02:52:49Z)
- SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but also captures long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation [18.755217252996754]
We propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet).
Unlike many prior Transformer-based solutions, the proposed DS-TransUNet first adopts dual-scale encoder subnetworks based on Swin Transformer to extract the coarse and fine-grained feature representations of different semantic scales.
As the core component for our DS-TransUNet, a well-designed Transformer Interactive Fusion (TIF) module is proposed to effectively establish global dependencies between features of different scales through the self-attention mechanism.
arXiv Detail & Related papers (2021-06-12T08:37:17Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
The tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.