MISSFormer: An Effective Medical Image Segmentation Transformer
- URL: http://arxiv.org/abs/2109.07162v1
- Date: Wed, 15 Sep 2021 08:56:00 GMT
- Title: MISSFormer: An Effective Medical Image Segmentation Transformer
- Authors: Xiaohong Huang, Zhifang Deng, Dandan Li, Xueguang Yuan
- Abstract summary: CNN-based methods have achieved impressive results in medical image segmentation.
Transformer-based methods have recently become popular in vision tasks because of their capacity to model long-range dependencies.
We present MISSFormer, an effective and powerful Medical Image Segmentation tranSFormer.
- Score: 3.441872541209065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: CNN-based methods have achieved impressive results in medical image
segmentation, but they fail to capture long-range dependencies because of the
inherent locality of the convolution operation. Transformer-based methods have
recently become popular in vision tasks because of their capacity to model
long-range dependencies, and they achieve promising performance. However, they
lack the ability to model local context; although some works have attempted to
embed convolutional layers to overcome this problem and achieved some
improvement, doing so makes the features inconsistent and fails to exploit the
natural multi-scale features of hierarchical transformers, which limits model
performance. In this paper, taking medical image segmentation as an example, we
present MISSFormer, an effective and powerful Medical Image Segmentation
tranSFormer. MISSFormer is a hierarchical encoder-decoder network with two
appealing designs: 1) The feed-forward network is redesigned in the proposed
Enhanced Transformer Block, which aligns features adaptively and enhances both
long-range dependencies and local context. 2) We propose the Enhanced
Transformer Context Bridge, a context bridge built from the enhanced
transformer block to model the long-range dependencies and local context of the
multi-scale features generated by our hierarchical transformer encoder. Driven
by these two designs, MISSFormer shows a strong capacity to capture valuable
dependencies and context in medical image segmentation. Experiments on
multi-organ and cardiac segmentation tasks demonstrate the superiority,
effectiveness and robustness of MISSFormer: trained from scratch, it even
outperforms state-of-the-art methods pretrained on ImageNet, and its core
designs generalize to other visual segmentation tasks. The code will be
released on GitHub.
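Since the code had not yet been released at the time of this abstract, the block below is a minimal PyTorch sketch of the two ideas described above: a transformer block whose feed-forward network mixes in a depth-wise convolution for local context, and a context bridge that runs such blocks over the concatenated multi-scale tokens of a hierarchical encoder. All module names, dimensions, and layer choices are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch only: an "enhanced" transformer block whose feed-forward network
# adds a depth-wise convolution for local context, plus a context bridge that
# applies such blocks to concatenated multi-scale tokens. Names, dimensions and
# layer choices are assumptions for illustration, not the authors' code.
import torch
import torch.nn as nn


class ConvEnhancedFFN(nn.Module):
    """Two linear layers with a depth-wise 3x3 convolution in between, so each
    token is also mixed with its spatial neighbours (assumed design)."""

    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.dwconv = nn.Conv2d(hidden_dim, hidden_dim, 3, padding=1, groups=hidden_dim)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_dim, dim)

    def forward(self, x, H, W):
        # x: (B, N, C) tokens of an H x W feature map, N = H * W
        x = self.fc1(x)
        B, N, C = x.shape
        x = x.transpose(1, 2).reshape(B, C, H, W)   # tokens -> 2D map
        x = self.dwconv(x)                          # local (depth-wise) mixing
        x = x.reshape(B, C, N).transpose(1, 2)      # 2D map -> tokens
        return self.fc2(self.act(x))


class EnhancedBlock(nn.Module):
    """Pre-norm block: multi-head self-attention followed by the conv-enhanced FFN."""

    def __init__(self, dim, heads=4, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = ConvEnhancedFFN(dim, dim * mlp_ratio)

    def forward(self, x, H, W):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.ffn(self.norm2(x), H, W)


class ContextBridge(nn.Module):
    """Project every encoder stage to a shared width, attend over the joint
    token sequence, then split the result back per stage."""

    def __init__(self, stage_dims, dim=256, depth=2, heads=4):
        super().__init__()
        self.proj_in = nn.ModuleList([nn.Linear(d, dim) for d in stage_dims])
        self.blocks = nn.ModuleList([EnhancedBlock(dim, heads) for _ in range(depth)])
        self.proj_out = nn.ModuleList([nn.Linear(dim, d) for d in stage_dims])

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) maps from the hierarchical encoder
        tokens, shapes = [], []
        for f, proj in zip(feats, self.proj_in):
            B, C, H, W = f.shape
            shapes.append((H, W, C))
            tokens.append(proj(f.flatten(2).transpose(1, 2)))   # (B, H_i*W_i, dim)
        x = torch.cat(tokens, dim=1)
        for blk in self.blocks:
            # Simplification: treat the joint sequence as a 1 x N "map" so the
            # depth-wise convolution has a single spatial layout to work on.
            x = blk(x, 1, x.shape[1])
        outs, start = [], 0
        for (H, W, C), proj in zip(shapes, self.proj_out):
            n = H * W
            seg = proj(x[:, start:start + n])                    # (B, n, C_i)
            outs.append(seg.transpose(1, 2).reshape(seg.size(0), C, H, W))
            start += n
        return outs
```

For instance, `ContextBridge([64, 128, 320, 512])` would fuse the token sequences of a four-stage hierarchical encoder with those (assumed) channel widths.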
Related papers
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical
Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but also captures long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z) - Enhancing Medical Image Segmentation with TransCeption: A Multi-Scale
Feature Fusion Approach [3.9548535445908928]
CNN-based methods have been the cornerstone of medical image segmentation due to their promising performance and robustness.
Transformer-based approaches currently prevail since they enlarge the receptive field to model global contextual correlation.
We propose TransCeption for medical image segmentation, a pure transformer-based U-shaped network that incorporates an inception-like module into the encoder.
arXiv Detail & Related papers (2023-01-25T22:09:07Z) - TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism
for a Deep Segmentation Model [4.320393382724066]
Convolutional neural networks (CNNs) have been the prevailing technique in medical image processing.
We propose Trans-Norm, a novel deep segmentation framework which consolidates a Transformer module into both encoder and skip-connections of the standard U-Net.
arXiv Detail & Related papers (2022-07-27T09:54:10Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z) - DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation [18.755217252996754]
We propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet).
Unlike many prior Transformer-based solutions, the proposed DS-TransUNet first adopts dual-scale encoder subnetworks based on Swin Transformer to extract coarse- and fine-grained feature representations at different semantic scales.
As the core component of DS-TransUNet, a well-designed Transformer Interactive Fusion (TIF) module is proposed to effectively establish global dependencies between features of different scales through the self-attention mechanism (a simplified sketch of this kind of cross-scale fusion appears after this list).
arXiv Detail & Related papers (2021-06-12T08:37:17Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional Neural Network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z) - Medical Transformer: Gated Axial-Attention for Medical Image
Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance (a simplified sketch of gated axial attention appears after this list).
arXiv Detail & Related papers (2021-02-21T18:35:14Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
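As noted in the DS-TransUNet entry above, its Transformer Interactive Fusion (TIF) module establishes dependencies between features of different scales through self-attention. The sketch below approximates that idea by letting two token sequences attend to each other jointly in a single attention layer; the class name, single-layer design, and shapes are assumptions for illustration, not the paper's module.

```python
# Hedged sketch of cross-scale token fusion in the spirit of DS-TransUNet's TIF
# module: tokens from a coarse and a fine branch attend to each other through
# one self-attention layer over the concatenated sequence.
import torch
import torch.nn as nn


class CrossScaleFusion(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, coarse_tokens, fine_tokens):
        # coarse_tokens: (B, N_c, C), fine_tokens: (B, N_f, C)
        x = torch.cat([coarse_tokens, fine_tokens], dim=1)
        h = self.norm(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]   # joint attention
        n_c = coarse_tokens.shape[1]
        return x[:, :n_c], x[:, n_c:]                        # split back per scale


# Example: fuse 14x14 coarse tokens with 28x28 fine tokens of width 256.
fuse = CrossScaleFusion(dim=256)
coarse, fine = torch.randn(2, 14 * 14, 256), torch.randn(2, 28 * 28, 256)
c_out, f_out = fuse(coarse, fine)
```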
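For the Medical Transformer entry above, the sketch below gives a simplified reading of gated axial attention: self-attention applied along one spatial axis at a time, with learnable gates scaling how much each axial pass contributes. The original mechanism gates relative-position terms inside the attention itself; this stand-in, with its assumed shapes and zero-initialized gates, is only a rough approximation.

```python
# Hedged sketch: axis-wise self-attention with learnable output gates, a
# simplification of the gated axial-attention idea described in the abstract.
import torch
import torch.nn as nn


class GatedAxialAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn_h = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_w = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate_h = nn.Parameter(torch.zeros(1))  # start closed, learn to open
        self.gate_w = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # x: (B, C, H, W) feature map
        B, C, H, W = x.shape
        # Attend along the height axis: each column is an independent sequence.
        cols = x.permute(0, 3, 2, 1).reshape(B * W, H, C)
        cols = cols + self.gate_h * self.attn_h(cols, cols, cols, need_weights=False)[0]
        x = cols.reshape(B, W, H, C).permute(0, 3, 2, 1)
        # Attend along the width axis: each row is an independent sequence.
        rows = x.permute(0, 2, 3, 1).reshape(B * H, W, C)
        rows = rows + self.gate_w * self.attn_w(rows, rows, rows, need_weights=False)[0]
        return rows.reshape(B, H, W, C).permute(0, 3, 1, 2)


# Example: a batch of two 64-channel 32x32 feature maps.
out = GatedAxialAttention(dim=64)(torch.randn(2, 64, 32, 32))
```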