TransAttUnet: Multi-level Attention-guided U-Net with Transformer for
Medical Image Segmentation
- URL: http://arxiv.org/abs/2107.05274v1
- Date: Mon, 12 Jul 2021 09:17:06 GMT
- Title: TransAttUnet: Multi-level Attention-guided U-Net with Transformer for
Medical Image Segmentation
- Authors: Bingzhi Chen, Yishu Liu, Zheng Zhang, Guangming Lu, David Zhang
- Abstract summary: This paper proposes a novel Transformer-based medical image semantic segmentation framework called TransAttUnet.
In particular, we establish additional multi-scale skip connections between decoder blocks to aggregate upsampled features at different semantic scales.
Our method consistently outperforms the state-of-the-art baselines.
- Score: 33.45471457058221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of deep encoder-decoder architectures and large-scale
annotated medical datasets, great progress has been achieved in automatic medical image
segmentation. However, due to the stacking of convolution layers and the consecutive
sampling operations, existing standard models inevitably suffer from the information
recession problem, in which feature representations fail to fully capture global
contextual dependencies. To overcome these challenges, this paper proposes a novel
Transformer-based medical image semantic segmentation framework called TransAttUnet,
in which multi-level guided attention and multi-scale skip connections are jointly
designed to effectively enhance the functionality and flexibility of the traditional
U-shaped architecture. Inspired by the Transformer, a
novel self-aware attention (SAA) module with both Transformer Self Attention
(TSA) and Global Spatial Attention (GSA) is incorporated into TransAttUnet to
effectively learn the non-local interactions between encoder features. In
particular, we also establish additional multi-scale skip connections between
decoder blocks to aggregate upsampled features at different semantic scales.
In this way, the representation ability of multi-scale context information is
strengthened to generate discriminative features. Benefiting from these
complementary components, the proposed TransAttUnet can effectively alleviate
the loss of fine details caused by the information recession problem, improving
the diagnostic sensitivity and segmentation quality of medical image analysis.
Extensive experiments on multiple medical image segmentation datasets of
different imaging modalities demonstrate that our method consistently outperforms the
state-of-the-art baselines.
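To make the architectural idea concrete, the following is a minimal PyTorch sketch of what a self-aware attention (SAA) block combining a Transformer Self Attention (TSA) branch and a Global Spatial Attention (GSA) branch might look like. Only the component names come from the abstract; the tensor shapes, the DANet-style spatial-affinity form of GSA, and the additive fusion of the two branches are assumptions made for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class SelfAwareAttention(nn.Module):
    """Illustrative SAA block: a Transformer self-attention (TSA) branch and a
    global spatial attention (GSA) branch applied to the same encoder feature map,
    fused by summation. Shapes and fusion details are assumptions, not the paper's code."""

    def __init__(self, channels: int, num_heads: int = 8):
        super().__init__()
        # TSA branch: multi-head self-attention over flattened spatial tokens.
        self.norm = nn.LayerNorm(channels)
        self.tsa = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # GSA branch: position-attention-style spatial affinity (an assumption here).
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)

        # TSA: non-local token interactions via multi-head self-attention.
        t = self.norm(tokens)
        tsa_out, _ = self.tsa(t, t, t)
        tsa_out = tsa_out.transpose(1, 2).reshape(b, c, h, w)

        # GSA: global spatial affinity map re-weighting the value features.
        q = self.query(x).flatten(2).transpose(1, 2)      # (B, H*W, C//8)
        k = self.key(x).flatten(2)                        # (B, C//8, H*W)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)     # (B, H*W, H*W)
        v = self.value(x).flatten(2)                      # (B, C, H*W)
        gsa_out = torch.bmm(v, attn.transpose(1, 2)).reshape(b, c, h, w)

        # Residual fusion of both branches (the exact fusion rule is an assumption).
        return x + tsa_out + self.gamma * gsa_out


# Example: refine a 16x16 encoder feature map with 256 channels.
# feats = torch.randn(2, 256, 16, 16); out = SelfAwareAttention(256)(feats)
```

The multi-scale skip connections described in the abstract would, in the same spirit, upsample the outputs of different decoder blocks to a common resolution and aggregate them (for instance by concatenation) before the final prediction head.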
Related papers
- TransResNet: Integrating the Strengths of ViTs and CNNs for High Resolution Medical Image Segmentation via Feature Grafting [6.987177704136503]
High-resolution images are preferable in the medical imaging domain, as they significantly improve the diagnostic capability of the underlying method.
Most of the existing deep learning-based techniques for medical image segmentation are optimized for input images having small spatial dimensions and perform poorly on high-resolution images.
We propose a parallel-in-branch architecture called TransResNet, which incorporates Transformer and CNN in a parallel manner to extract features from multi-resolution images independently.
arXiv Detail & Related papers (2024-10-01T18:22:34Z)
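The parallel-in-branch idea described in the TransResNet entry above can be sketched generically: a CNN branch and a Transformer branch process the input independently, and their feature maps are then "grafted" together. The backbone choices, feature dimensions, and the simple concatenation-based grafting below are assumptions for illustration, not TransResNet's actual design.

```python
import torch
import torch.nn as nn


class ParallelBranchFusion(nn.Module):
    """Generic parallel CNN + Transformer feature extraction with a simple
    concatenation-based 'grafting' step (illustrative, not TransResNet itself)."""

    def __init__(self, in_ch: int = 3, dim: int = 64):
        super().__init__()
        # CNN branch: two strided conv stages (stands in for a ResNet-style encoder).
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Transformer branch: patch embedding followed by a small Transformer encoder.
        self.patch_embed = nn.Conv2d(in_ch, dim, kernel_size=4, stride=4)
        encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Grafting: fuse the two feature maps after concatenation.
        self.graft = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_cnn = self.cnn(x)                                     # (B, dim, H/4, W/4)
        p = self.patch_embed(x)                                 # (B, dim, H/4, W/4)
        b, c, h, w = p.shape
        f_tr = self.transformer(p.flatten(2).transpose(1, 2))   # (B, h*w, dim)
        f_tr = f_tr.transpose(1, 2).reshape(b, c, h, w)
        return self.graft(torch.cat([f_cnn, f_tr], dim=1))      # fused multi-branch features
```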
- BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation [0.0]
This paper proposes an innovative U-shaped network called BEFUnet, which enhances the fusion of body and edge information for precise medical image segmentation.
The BEFUnet comprises three main modules: a novel Local Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF) module, and a dual-branch encoder.
The LCAF module efficiently fuses edge and body features by selectively performing local cross-attention on features that are spatially close between the two modalities.
arXiv Detail & Related papers (2024-02-13T21:03:36Z)
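The LCAF module in the BEFUnet entry above fuses edge and body features via local cross-attention. The sketch below shows plain cross-attention between two feature maps as a stand-in; the spatial locality constraint and all layer sizes are omitted or assumed, so this is a generic illustration rather than BEFUnet's module.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Illustrative cross-attention fusion of two feature maps: 'body' queries attend
    to 'edge' keys/values. The locality restriction described for LCAF is omitted here
    for brevity; this is a generic sketch, not BEFUnet's code."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, body: torch.Tensor, edge: torch.Tensor) -> torch.Tensor:
        b, c, h, w = body.shape
        q = body.flatten(2).transpose(1, 2)     # (B, H*W, C) queries from the body branch
        kv = edge.flatten(2).transpose(1, 2)    # (B, H*W, C) keys/values from the edge branch
        fused, _ = self.attn(self.norm(q), self.norm(kv), self.norm(kv))
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return body + fused                     # residual fusion of the two feature streams
```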
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
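The DEC-Seg entry above relies on consistency learning for semi-supervised segmentation. The loss below is a generic scale-consistency regularizer on unlabeled images, written only to illustrate the idea; the downscaling factor, the MSE choice, and the teacher-style target are assumptions and not the paper's cross-generative formulation.

```python
import torch
import torch.nn.functional as F


def scale_consistency_loss(model, unlabeled: torch.Tensor, scale: float = 0.5) -> torch.Tensor:
    """Generic consistency regularizer: predictions on a downscaled view should agree
    with downscaled predictions on the full-resolution view. This is an assumption-level
    sketch, not DEC-Seg's exact objective."""
    with torch.no_grad():
        full_pred = torch.softmax(model(unlabeled), dim=1)  # teacher-style target
    small = F.interpolate(unlabeled, scale_factor=scale, mode="bilinear", align_corners=False)
    small_pred = torch.softmax(model(small), dim=1)
    target = F.interpolate(full_pred, size=small_pred.shape[-2:],
                           mode="bilinear", align_corners=False)
    return F.mse_loss(small_pred, target)
```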
- DA-TransUNet: Integrating Spatial and Channel Dual Attention with Transformer U-Net for Medical Image Segmentation [5.5582646801199225]
This study proposes a novel deep medical image segmentation framework, called DA-TransUNet.
It aims to integrate the Transformer and a dual attention block (DA-Block) into the traditional U-shaped architecture.
Unlike earlier transformer-based U-net models, DA-TransUNet utilizes Transformers and DA-Block to integrate not only global and local features, but also image-specific positional and channel features.
arXiv Detail & Related papers (2023-10-19T08:25:03Z)
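The DA-Block mentioned in the DA-TransUNet entry above combines spatial and channel attention. Below is a minimal dual-attention block in the spirit of squeeze-and-excitation channel gating followed by CBAM-style spatial gating; the pooling choices, reduction ratio, and kernel size are assumptions for illustration, not the released DA-Block.

```python
import torch
import torch.nn as nn


class DualAttentionBlock(nn.Module):
    """Illustrative dual attention: channel gating (squeeze-and-excitation style)
    followed by spatial gating. Not DA-TransUNet's released DA-Block."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel gate: global pooling + bottleneck MLP, output in (0, 1) per channel.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # Spatial gate: 7x7 conv over channel-wise average and max maps.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)                  # re-weight channels
        avg_map = x.mean(dim=1, keepdim=True)         # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)         # (B, 1, H, W)
        x = x * self.spatial_gate(torch.cat([avg_map, max_map], dim=1))  # re-weight positions
        return x
```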
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^2$SNet) to handle diverse segmentation tasks on medical images.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
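M$^{2}$SNet, summarized above, is built around subtraction between features from neighboring encoder levels. The helper below sketches a basic subtraction unit; the bilinear upsampling and absolute-difference form are assumptions, and the real network applies such units across many scale pairs.

```python
import torch
import torch.nn.functional as F


def subtraction_unit(f_low: torch.Tensor, f_high: torch.Tensor) -> torch.Tensor:
    """Basic subtraction unit: upsample the coarser feature map to the finer one's
    resolution and take the element-wise absolute difference, highlighting the
    complementary detail between levels (illustrative of the M$^2$SNet idea).
    Assumes both feature maps already share the same channel width."""
    f_high_up = F.interpolate(f_high, size=f_low.shape[-2:],
                              mode="bilinear", align_corners=False)
    return torch.abs(f_low - f_high_up)
```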
- Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers [8.139069987207494]
We present TransFusion, a Transformer-based architecture to merge divergent multi-view imaging information using convolutional layers and powerful attention mechanisms.
In particular, the Divergent Fusion Attention (DiFA) module is proposed for rich cross-view context modeling and semantic dependency mining.
arXiv Detail & Related papers (2022-03-21T04:02:54Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
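The Medical Transformer entry above introduces gated axial attention, which factorizes self-attention along image axes and adds learnable gates. The sketch below applies gated multi-head attention along the width axis only (the height axis would be handled analogously) and, as a simplification, gates the attention output rather than the relative positional terms of the original formulation.

```python
import torch
import torch.nn as nn


class GatedAxialAttention1D(nn.Module):
    """Simplified gated axial attention along the width axis: each row is treated as
    a token sequence, and a learnable gate scales the attention output. The original
    Medical Transformer gates relative positional terms instead; this is a reduced
    illustration, not the paper's module."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))  # starts closed, learned during training

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)  # one token sequence per row
        out, _ = self.attn(rows, rows, rows)
        out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return x + self.gate * out                          # gated residual update
```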
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
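Finally, the TransUNet entry above pairs a CNN feature extractor with a Transformer encoder ahead of a U-Net-style decoder. The sketch below covers only the hybrid encoder path (CNN features, tokenization, Transformer layers, reshape back to a 2D map); the toy convolutional stem and layer sizes are assumptions, whereas the actual model uses a ResNet backbone, positional embeddings, and a cascaded upsampling decoder.

```python
import torch
import torch.nn as nn


class HybridTransformerEncoder(nn.Module):
    """Illustrative TransUNet-style hybrid encoder: a small CNN stem produces a
    feature map, which is tokenized and refined by Transformer layers, then reshaped
    back to 2D for a U-Net-style decoder (decoder omitted)."""

    def __init__(self, in_ch: int = 3, dim: int = 128, depth: int = 4, num_heads: int = 8):
        super().__init__()
        # Toy CNN stem standing in for a ResNet backbone (downsamples by 8x).
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, dim // 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim // 2, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.stem(x)                                           # (B, dim, H/8, W/8)
        b, c, h, w = f.shape
        tokens = self.transformer(f.flatten(2).transpose(1, 2))    # global context over tokens
        return tokens.transpose(1, 2).reshape(b, c, h, w)          # back to a 2D feature map
```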