Contextual Attention Network: Transformer Meets U-Net
- URL: http://arxiv.org/abs/2203.01932v1
- Date: Wed, 2 Mar 2022 21:10:24 GMT
- Title: Contextual Attention Network: Transformer Meets U-Net
- Authors: Reza Azad, Moein Heidari, Yuli Wu, Dorit Merhof
- Abstract summary: Convolutional neural networks (CNNs) have become the de facto standard and attained immense success in medical image segmentation.
However, CNN-based methods fail to build long-range dependencies and global context connections.
Recent articles have exploited Transformer variants for medical image segmentation tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Currently, convolutional neural networks (CNN) (e.g., U-Net) have become the
de facto standard and attained immense success in medical image segmentation.
However, as a downside, CNN-based methods fail
to build long-range dependencies and global context connections due to the
limited receptive field that stems from the intrinsic characteristics of the
convolution operation. Hence, recent articles have exploited Transformer
variants for medical image segmentation tasks which open up great opportunities
due to their innate capability of capturing long-range correlations through the
attention mechanism. Although feasibly designed, most of these studies perform
poorly at capturing local information, which results in less precise delineation
of boundary areas. In this paper, we propose a
contextual attention network to tackle the aforementioned limitations. The
proposed method uses the strength of the Transformer module to model the
long-range contextual dependency. Simultaneously, it utilizes the CNN encoder
to capture local semantic information. In addition, an object-level
representation is included to model the regional interaction map. The extracted
hierarchical features are then fed to the contextual attention module, which
adaptively recalibrates the representation space using the local information and
emphasizes the informative regions while taking into account the long-range
contextual dependency derived by the Transformer module. We validate
our method on several large-scale public medical image segmentation datasets
and achieve state-of-the-art performance. The implementation code is available
at https://github.com/rezazad68/TMUnet.
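To make the described design concrete, the following is a minimal, hedged sketch of the overall idea: a CNN branch captures local semantics, a Transformer branch models long-range context over patch tokens, and a contextual attention step recalibrates the local features with that global context. The module names, sizes, and the channel-wise gating are illustrative assumptions, not the authors' implementation (see the linked repository for the official code).

```python
# Minimal sketch of the hybrid CNN + Transformer idea from the abstract:
# a CNN branch for local semantics, a Transformer branch for long-range
# context, and an attention gate that recalibrates the CNN features with
# the global context. Names and sizes are illustrative assumptions.
import torch
import torch.nn as nn


class ContextualAttentionSketch(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64, embed_dim=64, num_heads=4, patch=16):
        super().__init__()
        # CNN encoder: captures local semantic information.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Patch embedding + Transformer encoder: captures long-range dependency.
        self.patch_embed = nn.Conv2d(in_ch, embed_dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        # Contextual attention (sketched as channel-wise gating): the global
        # context re-weights, i.e. recalibrates, the local CNN features.
        self.recalibrate = nn.Sequential(nn.Linear(embed_dim, feat_ch), nn.Sigmoid())
        self.head = nn.Conv2d(feat_ch, 1, 1)  # e.g. binary segmentation logits

    def forward(self, x):
        local = self.cnn(x)                                      # B, C, H, W
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # B, N, D
        context = self.transformer(tokens).mean(dim=1)           # B, D global summary
        gate = self.recalibrate(context)[:, :, None, None]       # B, C, 1, 1
        fused = local * gate                 # emphasize informative channels/regions
        return self.head(fused)


if __name__ == "__main__":
    net = ContextualAttentionSketch()
    out = net(torch.randn(2, 3, 224, 224))
    print(out.shape)  # torch.Size([2, 1, 224, 224])
```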
Related papers
- STA-Unet: Rethink the semantic redundant for Medical Imaging Segmentation [1.9526521731584066]
Super Token Attention (STA) mechanism adapts the concept of superpixels from pixel space to token space, using super tokens as compact visual representations.
In this work, we introduce the STA module in the UNet architecture (STA-UNet), to limit redundancy without losing rich information.
Experimental results on four publicly available datasets demonstrate the superiority of STA-UNet over existing state-of-the-art architectures.
arXiv Detail & Related papers (2024-10-13T07:19:46Z) - BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image
Segmentation [0.0]
This paper proposes an innovative U-shaped network called BEFUnet, which enhances the fusion of body and edge information for precise medical image segmentation.
The BEFUnet comprises three main modules, including a novel Local Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF) module, and dual-branch encoder.
The LCAF module efficiently fuses edge and body features by selectively performing local cross-attention on features that are spatially close between the two modalities.
arXiv Detail & Related papers (2024-02-13T21:03:36Z) - ParaTransCNN: Parallelized TransCNN Encoder for Medical Image
Segmentation [7.955518153976858]
We propose an advanced 2D feature extraction method by combining the convolutional neural network and Transformer architectures.
Our method is shown with better segmentation accuracy, especially on small organs.
arXiv Detail & Related papers (2024-01-27T05:58:36Z) - ConvFormer: Combining CNN and Transformer for Medical Image Segmentation [17.88894109620463]
We propose a hierarchical CNN and Transformer hybrid architecture, called ConvFormer, for medical image segmentation.
Our ConvFormer, trained from scratch, outperforms various CNN- or Transformer-based architectures, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-11-15T23:11:22Z) - LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context
Propagation in Transformers [60.51925353387151]
We propose a novel module named Local Context Propagation (LCP) to exploit the message passing between neighboring local regions.
We use the overlap points of adjacent local regions as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers.
The proposed method is applicable to different tasks and outperforms various transformer-based methods in benchmarks including 3D shape classification and dense prediction tasks.
arXiv Detail & Related papers (2022-10-23T15:43:01Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to build new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z) - Transformers Solve the Limited Receptive Field for Monocular Depth
Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for nowadays 3D medical image segmentation.
We propose a novel framework that efficiently bridges a bf Convolutional neural network and a bf Transformer bf (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)