Contextual Attention Network: Transformer Meets U-Net
- URL: http://arxiv.org/abs/2203.01932v1
- Date: Wed, 2 Mar 2022 21:10:24 GMT
- Title: Contextual Attention Network: Transformer Meets U-Net
- Authors: Reza Azad, Moein Heidari, Yuli Wu, Dorit Merhof
- Abstract summary: Convolutional neural networks (CNNs) have become the de facto standard and attained immense success in medical image segmentation.
However, CNN-based methods fail to build long-range dependencies and global context connections.
Recent articles have exploited Transformer variants for medical image segmentation tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Currently, convolutional neural networks (CNN) (e.g., U-Net) have become the
de facto standard and attained immense success in medical image segmentation.
However, as a downside, CNN-based methods fail
to build long-range dependencies and global context connections due to the
limited receptive field that stems from the intrinsic characteristics of the
convolution operation. Hence, recent articles have exploited Transformer
variants for medical image segmentation tasks which open up great opportunities
due to their innate capability of capturing long-range correlations through the
attention mechanism. Although feasibly designed, most of these studies perform
poorly at capturing local information, which results in less precise delineation
of boundary areas. In this paper, we propose a
contextual attention network to tackle the aforementioned limitations. The
proposed method uses the strength of the Transformer module to model the
long-range contextual dependency. Simultaneously, it utilizes the CNN encoder
to capture local semantic information. In addition, an object-level
representation is included to model the regional interaction map. The extracted
hierarchical features are then fed to the contextual attention module, which
adaptively recalibrates the representation space using the local information and
emphasizes the informative regions while taking into account the long-range
contextual dependency derived by the Transformer module. We validate
our method on several large-scale public medical image segmentation datasets
and achieve state-of-the-art performance. The implementation code is available
at https://github.com/rezazad68/TMUnet.
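To make the described design concrete, the following is a minimal, hedged sketch of the overall idea: a CNN branch captures local semantics, a Transformer branch models long-range context over patch tokens, and a contextual attention step recalibrates the local features with that global context. The module names, sizes, and the channel-wise gating are illustrative assumptions, not the authors' implementation (see the linked repository for the official code).

```python
# Minimal sketch of the hybrid CNN + Transformer idea from the abstract:
# a CNN branch for local semantics, a Transformer branch for long-range
# context, and an attention gate that recalibrates the CNN features with
# the global context. Names and sizes are illustrative assumptions.
import torch
import torch.nn as nn


class ContextualAttentionSketch(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64, embed_dim=64, num_heads=4, patch=16):
        super().__init__()
        # CNN encoder: captures local semantic information.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Patch embedding + Transformer encoder: captures long-range dependency.
        self.patch_embed = nn.Conv2d(in_ch, embed_dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        # Contextual attention (sketched as channel-wise gating): the global
        # context re-weights, i.e. recalibrates, the local CNN features.
        self.recalibrate = nn.Sequential(nn.Linear(embed_dim, feat_ch), nn.Sigmoid())
        self.head = nn.Conv2d(feat_ch, 1, 1)  # e.g. binary segmentation logits

    def forward(self, x):
        local = self.cnn(x)                                      # B, C, H, W
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # B, N, D
        context = self.transformer(tokens).mean(dim=1)           # B, D global summary
        gate = self.recalibrate(context)[:, :, None, None]       # B, C, 1, 1
        fused = local * gate                 # emphasize informative channels/regions
        return self.head(fused)


if __name__ == "__main__":
    net = ContextualAttentionSketch()
    out = net(torch.randn(2, 3, 224, 224))
    print(out.shape)  # torch.Size([2, 1, 224, 224])
```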
Related papers
- STA-Unet: Rethink the semantic redundant for Medical Imaging Segmentation [1.9526521731584066]
Super Token Attention (STA) mechanism adapts the concept of superpixels from pixel space to token space, using super tokens as compact visual representations.
In this work, we introduce the STA module in the UNet architecture (STA-UNet), to limit redundancy without losing rich information.
Experimental results on four publicly available datasets demonstrate the superiority of STA-UNet over existing state-of-the-art architectures.
arXiv Detail & Related papers (2024-10-13T07:19:46Z) - BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image
Segmentation [0.0]
This paper proposes an innovative U-shaped network called BEFUnet, which enhances the fusion of body and edge information for precise medical image segmentation.
The BEFUnet comprises three main modules, including a novel Local Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF) module, and dual-branch encoder.
The LCAF module efficiently fuses edge and body features by selectively performing local cross-attention on features that are spatially close between the two modalities.
arXiv Detail & Related papers (2024-02-13T21:03:36Z) - ParaTransCNN: Parallelized TransCNN Encoder for Medical Image
Segmentation [7.955518153976858]
We propose an advanced 2D feature extraction method by combining the convolutional neural network and Transformer architectures.
Our method is shown with better segmentation accuracy, especially on small organs.
arXiv Detail & Related papers (2024-01-27T05:58:36Z) - ConvFormer: Combining CNN and Transformer for Medical Image Segmentation [17.88894109620463]
We propose a hierarchical CNN and Transformer hybrid architecture, called ConvFormer, for medical image segmentation.
Our ConvFormer, trained from scratch, outperforms various CNN- or Transformer-based architectures, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-11-15T23:11:22Z) - LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context
Propagation in Transformers [60.51925353387151]
We propose a novel module named Local Context Propagation (LCP) to exploit the message passing between neighboring local regions.
We use the overlap points of adjacent local regions as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers.
The proposed method is applicable to different tasks and outperforms various transformer-based methods in benchmarks including 3D shape classification and dense prediction tasks.
arXiv Detail & Related papers (2022-10-23T15:43:01Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to build new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z) - Transformers Solve the Limited Receptive Field for Monocular Depth
Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for nowadays 3D medical image segmentation.
We propose a novel framework that efficiently bridges a bf Convolutional neural network and a bf Transformer bf (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)