Medical Image Segmentation using Squeeze-and-Expansion Transformers
- URL: http://arxiv.org/abs/2105.09511v2
- Date: Sun, 23 May 2021 12:11:20 GMT
- Title: Medical Image Segmentation using Squeeze-and-Expansion Transformers
- Authors: Shaohua Li, Xiuchao Sui, Xiangde Luo, Xinxing Xu, Yong Liu, Rick Siow
Mong Goh
- Abstract summary: Segtran is an alternative segmentation framework based on transformers.
Segtran consistently achieved the highest segmentation accuracy, and exhibited good cross-domain generalization capabilities.
- Score: 12.793250990122557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image segmentation is important for computer-aided diagnosis. Good
segmentation demands the model to see the big picture and fine details
simultaneously, i.e., to learn image features that incorporate large context
while keeping high spatial resolution. To approach this goal, the most widely
used methods -- U-Net and its variants -- extract and fuse multi-scale features.
However, the fused features still have small "effective receptive fields" with
a focus on local image cues, limiting their performance. In this work, we
propose Segtran, an alternative segmentation framework based on transformers,
which have unlimited "effective receptive fields" even at high feature
resolutions. The core of Segtran is a novel Squeeze-and-Expansion transformer:
a squeezed attention block regularizes the self-attention of transformers, and
an expansion block learns diversified representations. Additionally, we propose
a new positional encoding scheme for transformers, imposing a continuity
inductive bias for images. Experiments were performed on 2D and 3D medical
image segmentation tasks: optic disc/cup segmentation in fundus images
(REFUGE'20 challenge), polyp segmentation in colonoscopy images, and brain
tumor segmentation in MRI scans (BraTS'19 challenge). Compared with
representative existing methods, Segtran consistently achieved the highest
segmentation accuracy, and exhibited good cross-domain generalization
capabilities. The source code of Segtran is released at
https://github.com/askerlee/segtran.
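To make the core idea of the abstract concrete, below is a minimal PyTorch sketch of a Squeeze-and-Expansion layer. It assumes the squeezed attention block routes attention through a small set of learned codes (so the full token-to-token attention map is replaced by two much thinner maps) and that the expansion block mixes several parallel feed-forward "modes" per token to diversify representations. All module names, sizes, and the mixing scheme are illustrative assumptions rather than the released implementation (see the linked repository for that), and the continuity-biased positional encoding is omitted.

```python
import torch
import torch.nn as nn

class SqueezedAttention(nn.Module):
    """Attention routed through a small set of learned codes, so the full
    N x N token-to-token attention is replaced by N x k and k x N maps (k << N).
    Illustrative stand-in for the paper's squeezed attention block."""
    def __init__(self, dim, num_heads=8, num_codes=64):
        super().__init__()
        self.codes = nn.Parameter(torch.randn(1, num_codes, dim))
        self.squeeze = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.readback = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                          # x: (B, N, dim) visual tokens
        codes = self.codes.expand(x.size(0), -1, -1)
        squeezed, _ = self.squeeze(codes, x, x)    # codes attend to all tokens
        out, _ = self.readback(x, squeezed, squeezed)  # tokens read the codes back
        return out

class ExpansionBlock(nn.Module):
    """Several parallel feed-forward 'modes' mixed per token -- a guess at how
    an expansion block could diversify the learned representations."""
    def __init__(self, dim, num_modes=4, hidden=1024):
        super().__init__()
        self.modes = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_modes)])
        self.gate = nn.Linear(dim, num_modes)

    def forward(self, x):                          # x: (B, N, dim)
        w = self.gate(x).softmax(dim=-1)           # per-token mixture weights
        y = torch.stack([m(x) for m in self.modes], dim=-1)  # (B, N, dim, M)
        return (y * w.unsqueeze(2)).sum(dim=-1)

class SqueezeExpansionLayer(nn.Module):
    """Pre-norm transformer layer built from the two blocks above."""
    def __init__(self, dim):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = SqueezedAttention(dim)
        self.ffn = ExpansionBlock(dim)

    def forward(self, x):
        x = x + self.attn(self.norm1(x))
        x = x + self.ffn(self.norm2(x))
        return x

# Toy usage: a 48x48 CNN feature map flattened into 2304 tokens of width 256.
feat = torch.randn(2, 48 * 48, 256)
print(SqueezeExpansionLayer(256)(feat).shape)      # torch.Size([2, 2304, 256])
```

Routing attention through a small number of learned codes keeps memory roughly linear in the number of tokens, which is what allows a transformer to operate at the high feature resolutions the abstract emphasizes.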
Related papers
- TransResNet: Integrating the Strengths of ViTs and CNNs for High Resolution Medical Image Segmentation via Feature Grafting [6.987177704136503]
High-resolution images are preferable in the medical imaging domain, as they significantly improve the diagnostic capability of the underlying method.
Most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images.
We propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently.
arXiv Detail & Related papers (2024-10-01T18:22:34Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z)
- HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation [3.478921293603811]
HiFormer is a novel method that efficiently bridges a CNN and a transformer for medical image segmentation.
To secure a fine fusion of global and local features, we propose a Double-Level Fusion (DLF) module in the skip connection of the encoder-decoder structure.
arXiv Detail & Related papers (2022-07-18T11:30:06Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images [7.334185314342017]
We propose a novel segmentation model termed Swin UNEt TRansformers (Swin UNETR).
The model extracts features at five different resolutions by utilizing shifted windows for computing self-attention.
We have participated in BraTS 2021 segmentation challenge, and our proposed model ranks among the top-performing approaches in the validation phase.
arXiv Detail & Related papers (2022-01-04T18:01:34Z)
- Transformer-Unet: Raw Image Processing with Unet [4.7944896477309555]
We propose Transformer-Unet, which adds transformer modules on raw images instead of on the feature maps of Unet.
We form an end-to-end network and obtain better segmentation results than many previous Unet-based algorithms in our experiments.
arXiv Detail & Related papers (2021-09-17T09:03:10Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on par with it on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework, CoTr, that efficiently bridges a Convolutional neural network and a Transformer for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation (a generic sketch of this hybrid CNN-transformer pattern is given after this list).
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
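Several entries above (CoTr, TransUNet, MISSU, and Segtran itself) share a common hybrid recipe: a CNN backbone extracts a feature map, a transformer adds global context over the flattened tokens, and a decoder restores per-pixel resolution. The sketch below is a generic, heavily simplified illustration of that pattern under assumed module sizes; it is not the architecture of any specific paper, and positional encodings and U-Net-style skip connections are omitted for brevity.

```python
import torch
import torch.nn as nn

class HybridCNNTransformerSeg(nn.Module):
    """Generic hybrid pattern: a CNN produces a low-resolution feature map,
    a transformer encoder adds global context over the flattened tokens, and
    a light head upsamples back to a per-pixel segmentation map."""
    def __init__(self, in_ch=3, num_classes=2, dim=256, depth=4, heads=8):
        super().__init__()
        self.backbone = nn.Sequential(                 # stand-in CNN, total stride 16
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, dim, 3, stride=4, padding=1), nn.ReLU())
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, depth)
        self.head = nn.Conv2d(dim, num_classes, 1)

    def forward(self, x):
        f = self.backbone(x)                           # (B, dim, H/16, W/16)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)          # (B, H*W/256, dim)
        tokens = self.transformer(tokens)              # global context over tokens
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        logits = self.head(f)
        return nn.functional.interpolate(              # back to input resolution
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False)

# Toy usage on a 256x256 image; output is a 2-class logit map at full resolution.
img = torch.randn(1, 3, 256, 256)
print(HybridCNNTransformerSeg()(img).shape)            # torch.Size([1, 2, 256, 256])
```

The papers listed differ mainly in where the transformer sits (on raw images, on backbone features, or in parallel branches) and in how multi-scale features are fused back into the decoder.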
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.