BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image
Segmentation
- URL: http://arxiv.org/abs/2402.08793v1
- Date: Tue, 13 Feb 2024 21:03:36 GMT
- Title: BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image
Segmentation
- Authors: Omid Nejati Manzari, Javad Mirzapour Kaleybar, Hooman Saadat, Shahin
Maleki
- Abstract summary: This paper proposes an innovative U-shaped network called BEFUnet, which enhances the fusion of body and edge information for precise medical image segmentation.
The BEFUnet comprises three main modules, including a novel Local Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF) module, and a dual-branch encoder.
The LCAF module efficiently fuses edge and body features by selectively performing local cross-attention on features that are spatially close between the two modalities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The accurate segmentation of medical images is critical for various
healthcare applications. Convolutional neural networks (CNNs), especially Fully
Convolutional Networks (FCNs) like U-Net, have shown remarkable success in
medical image segmentation tasks. However, they have limitations in capturing
global context and long-range relations, especially for objects with
significant variations in shape, scale, and texture. While transformers have
achieved state-of-the-art results in natural language processing and image
recognition, they face challenges in medical image segmentation due to image
locality and translational invariance issues. To address these challenges, this
paper proposes an innovative U-shaped network called BEFUnet, which enhances
the fusion of body and edge information for precise medical image segmentation.
The BEFUnet comprises three main modules, including a novel Local
Cross-Attention Feature (LCAF) fusion module, a novel Double-Level Fusion (DLF)
module, and a dual-branch encoder. The dual-branch encoder consists of an edge
encoder and a body encoder. The edge encoder employs PDC blocks for effective
edge information extraction, while the body encoder uses the Swin Transformer
to capture semantic information with global attention. The LCAF module
efficiently fuses edge and body features by selectively performing local
cross-attention on features that are spatially close between the two
modalities. This local approach significantly reduces computational complexity
compared to global cross-attention while ensuring accurate feature matching.
BEFUnet demonstrates superior performance over existing methods across various
evaluation metrics on medical image segmentation datasets.
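
The locality idea behind the LCAF module can be illustrated with a small sketch. This is not the authors' implementation: the function name `local_cross_attention`, the `radius` parameter, the square neighbourhood, and the use of edge features as both keys and values are all assumptions made for illustration. Each body feature attends only to edge features at nearby positions, and the sketch counts how many query-key pairs are scored so the cost can be compared with global cross-attention.

```python
import math

def local_cross_attention(body, edge, radius=1):
    """Hypothetical sketch: each body position attends only to nearby edge positions.

    body, edge: H x W grids of equal-length feature vectors (lists of floats).
    Returns (fused H x W grid, number of query-key pairs scored).
    """
    h, w = len(body), len(body[0])
    dim = len(body[0][0])
    scale = 1.0 / math.sqrt(dim)
    fused, pairs = [], 0
    for i in range(h):
        row = []
        for j in range(w):
            q = body[i][j]
            # Edge features inside the local neighbourhood, clipped at the borders.
            keys = [edge[y][x]
                    for y in range(max(0, i - radius), min(h, i + radius + 1))
                    for x in range(max(0, j - radius), min(w, j + radius + 1))]
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            pairs += len(scores)
            # Numerically stable softmax over the neighbourhood only.
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            weights = [e / z for e in exps]
            # Weighted sum of edge features (assumption: keys double as values).
            row.append([sum(wt * k[d] for wt, k in zip(weights, keys))
                        for d in range(dim)])
        fused.append(row)
    return fused, pairs

# Toy 8 x 8 feature maps with 2-dimensional features.
h = w = 8
body = [[[1.0, 0.0] for _ in range(w)] for _ in range(h)]
edge = [[[0.5, 2.0] for _ in range(w)] for _ in range(h)]
fused, local_pairs = local_cross_attention(body, edge, radius=1)
global_pairs = (h * w) ** 2  # every body token scored against every edge token
print(local_pairs, global_pairs)  # 484 4096
```

On this toy grid the local variant scores 484 pairs against 4096 for global cross-attention, and the gap widens as the feature map grows, because the local cost scales with the neighbourhood size rather than with the total number of tokens.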
Related papers
- ParaTransCNN: Parallelized TransCNN Encoder for Medical Image
Segmentation [7.955518153976858]
We propose an advanced 2D feature extraction method by combining the convolutional neural network and Transformer architectures.
Our method achieves better segmentation accuracy, especially on small organs.
arXiv Detail & Related papers (2024-01-27T05:58:36Z)
- BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image
Segmentation [11.986549780782724]
We propose a hybrid yet effective CNN-Transformer network, named BRAU-Net++, for an accurate medical image segmentation task.
Specifically, BRAU-Net++ uses bi-level routing attention as the core building block to design our u-shaped encoder-decoder structure.
Our proposed approach surpasses other state-of-the-art methods including its baseline: BRAU-Net.
arXiv Detail & Related papers (2024-01-01T10:49:09Z)
- MCPA: Multi-scale Cross Perceptron Attention Network for 2D Medical
Image Segmentation [7.720152925974362]
We propose a 2D medical image segmentation model called Multi-scale Cross Perceptron Attention Network (MCPA).
The MCPA consists of three main components: an encoder, a decoder, and a Cross Perceptron.
We evaluate our proposed MCPA model on several publicly available medical image datasets from different tasks and devices.
arXiv Detail & Related papers (2023-07-27T02:18:12Z)
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical
Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to perform diverse segmentation tasks on medical images.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image
Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
- Boundary-aware Context Neural Network for Medical Image Segmentation [15.585851505721433]
Medical image segmentation can provide a reliable basis for further clinical analysis and disease diagnosis.
Most existing CNN-based methods produce unsatisfactory segmentation masks without accurate object boundaries.
In this paper, we formulate a boundary-aware context neural network (BA-Net) for 2D medical image segmentation.
arXiv Detail & Related papers (2020-05-03T02:35:49Z)