MS-Twins: Multi-Scale Deep Self-Attention Networks for Medical Image
Segmentation
- URL: http://arxiv.org/abs/2312.07128v1
- Date: Tue, 12 Dec 2023 10:04:11 GMT
- Title: MS-Twins: Multi-Scale Deep Self-Attention Networks for Medical Image
Segmentation
- Authors: Jing Xu
- Abstract summary: The article proposes MS-Twins, a powerful segmentation model built on the combination of self-attention and convolution.
MS-Twins improves significantly on previous transformer-based methods on two commonly used datasets, Synapse and ACDC.
- Score: 6.6467547151592505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the transformer is the preferred architecture in natural language
processing, few studies have applied it to medical imaging. Thanks to its
long-range dependency modeling, the transformer is expected to help
conventional convolutional neural networks overcome their inherent spatial
inductive bias. Recently proposed transformer-based segmentation methods only
use the transformer as an auxiliary module to help encode the global context
into a convolutional representation. There is hardly any study of how to
optimally combine self-attention (the kernel of transformers) with convolution.
To solve this problem, this article proposes MS-Twins (Multi-Scale Twins), a
powerful segmentation model built on the combination of self-attention and
convolution. MS-Twins can better capture semantic and fine-grained information
by combining different scales and cascading features. Compared with existing
network structures, MS-Twins improves significantly on previous
transformer-based methods on two commonly used datasets, Synapse and ACDC. In
particular, MS-Twins outperforms SwinUNet by 8% on Synapse. Even compared with
nnUNet, the best fully convolutional medical image segmentation network,
MS-Twins still holds a slight advantage on Synapse and ACDC.
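The core idea the abstract describes, applying self-attention at different spatial scales and cascading the resulting features, can be illustrated with a minimal dependency-free sketch. This is not the MS-Twins architecture itself: the identity Q/K/V projections, the pooling factor, and the repeat-based upsampling are simplifying assumptions standing in for learned layers, chosen only to show how a fine-scale and a coarse-scale attention pass are fused.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head scaled dot-product self-attention over tokens of shape (N, d).
    Identity projections keep the sketch dependency-free; a real model
    would use learned W_q, W_k, W_v matrices."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

def downsample(x, factor=2):
    """Average-pool tokens in groups of `factor` to get a coarser scale."""
    n, d = x.shape
    return x[: n - n % factor].reshape(-1, factor, d).mean(axis=1)

def multi_scale_attention(x):
    """Attend at fine and coarse scales, then cascade the features."""
    fine = self_attention(x)                 # fine-grained detail
    coarse = self_attention(downsample(x))   # global semantics on fewer tokens
    # bring the coarse output back to the fine resolution by repetition
    coarse_up = np.repeat(coarse, 2, axis=0)[: x.shape[0]]
    return np.concatenate([fine, coarse_up], axis=-1)

tokens = np.random.default_rng(0).normal(size=(16, 8))  # 16 tokens, dim 8
fused = multi_scale_attention(tokens)
print(fused.shape)  # (16, 16): fine and coarse features cascaded
```

Because the coarse branch attends over pooled tokens, its receptive field covers the whole input cheaply, while the fine branch preserves local detail, which is the trade-off the multi-scale cascade is meant to balance.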
Related papers
- CiT-Net: Convolutional Neural Networks Hand in Hand with Vision
Transformers for Medical Image Segmentation [10.20771849219059]
We propose a novel hybrid architecture of convolutional neural networks (CNNs) and vision Transformers (CiT-Net) for medical image segmentation.
Our CiT-Net provides better medical image segmentation results than popular SOTA methods.
arXiv Detail & Related papers (2023-06-06T03:22:22Z) - Multi-scale Transformer Network with Edge-aware Pre-training for
Cross-Modality MR Image Synthesis [52.41439725865149]
Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from given ones.
Existing (supervised learning) methods often require a large number of paired multi-modal data to train an effective synthesis model.
We propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis.
arXiv Detail & Related papers (2022-12-02T11:40:40Z) - Optimizing Vision Transformers for Medical Image Segmentation and
Few-Shot Domain Adaptation [11.690799827071606]
We propose Convolutional Swin-Unet (CS-Unet) transformer blocks and optimise their settings with relation to patch embedding, projection, the feed-forward network, up sampling and skip connections.
CS-Unet can be trained from scratch and inherits the superiority of convolutions in each feature process phase.
Experiments show that CS-Unet without pre-training surpasses other state-of-the-art counterparts by large margins on two medical CT and MRI datasets with fewer parameters.
arXiv Detail & Related papers (2022-10-14T19:18:52Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - The Fully Convolutional Transformer for Medical Image Segmentation [2.87898780282409]
We propose a novel transformer model, capable of segmenting medical images of varying modalities.
The Fully Convolutional Transformer (FCT) is the first fully convolutional Transformer model in medical imaging literature.
arXiv Detail & Related papers (2022-06-01T15:22:41Z) - nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z) - DS-TransUNet:Dual Swin Transformer U-Net for Medical Image Segmentation [18.755217252996754]
We propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet)
Unlike many prior Transformer-based solutions, the proposed DS-TransUNet first adopts dual-scale encoder subnetworks based on Swin Transformer to extract the coarse and fine-grained feature representations of different semantic scales.
As the core component for our DS-TransUNet, a well-designed Transformer Interactive Fusion (TIF) module is proposed to effectively establish global dependencies between features of different scales through the self-attention mechanism.
arXiv Detail & Related papers (2021-06-12T08:37:17Z) - Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is an Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z) - Transformers Solve the Limited Receptive Field for Monocular Depth
Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.