TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for
Medical Image Segmentation
- URL: http://arxiv.org/abs/2306.04086v3
- Date: Wed, 20 Dec 2023 02:34:49 GMT
- Title: TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for
Medical Image Segmentation
- Authors: Rui Sun, Tao Lei, Weichuan Zhang, Yong Wan, Yong Xia, Asoke K. Nandi
- Abstract summary: We propose vision Transformer embrace convolutional neural networks for medical image segmentation (TEC-Net).
Our network has two advantages. First, dynamic deformable convolution (DDConv) is designed in the CNN branch, which not only overcomes the difficulty of adaptive feature extraction using fixed-size convolution kernels, but also solves the defect that different inputs share the same convolution kernel parameters.
Experimental results show that the proposed TEC-Net provides better medical image segmentation results than SOTA methods including CNN and Transformer networks.
- Score: 20.976167468217387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The hybrid architecture of convolutional neural networks (CNNs) and Transformer
has been the most popular method for medical image segmentation. However, the
existing networks based on the hybrid architecture suffer from two problems.
First, although the CNN branch can capture image local features by using
convolution operation, the vanilla convolution is unable to achieve adaptive
extraction of image features. Second, although the Transformer branch can model
the global information of images, the conventional self-attention only focuses
on the spatial self-attention of images and ignores the channel and
cross-dimensional self-attention, leading to low segmentation accuracy for
medical images with complex backgrounds. To solve these problems, we propose
vision Transformer embrace convolutional neural networks for medical image
segmentation (TEC-Net). Our network has two advantages. First, dynamic
deformable convolution (DDConv) is designed in the CNN branch, which not only
overcomes the difficulty of adaptive feature extraction using fixed-size
convolution kernels, but also solves the defect that different inputs share the
same convolution kernel parameters, effectively improving the feature
expression ability of the CNN branch. Second, in the Transformer branch, a
(shifted)-window adaptive complementary attention module ((S)W-ACAM) and
compact convolutional projection are designed to enable the network to fully
learn the cross-dimensional long-range dependency of medical images with few
parameters and calculations. Experimental results show that the proposed
TEC-Net provides better medical image segmentation results than SOTA methods
including CNN and Transformer networks. In addition, our TEC-Net requires fewer
parameters and lower computational cost and does not rely on pre-training. The code
is publicly available at https://github.com/SR0920/TEC-Net.
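The abstract's point that vanilla convolution shares the same kernel parameters across all inputs can be illustrated with a minimal sketch of input-conditioned ("dynamic") kernel mixing. This is not the authors' DDConv: it covers only the dynamic-kernel idea, leaves out the deformable sampling offsets, and works on 1D signals in pure Python. The names `KERNEL_BANK`, `ROUTING`, `dynamic_kernel`, and `dynamic_conv1d` are hypothetical, and the kernel and routing values are hand-picked for illustration rather than learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation with a fixed kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical bank of candidate kernels (assumed values, not from the paper).
KERNEL_BANK = [
    [1 / 3, 1 / 3, 1 / 3],  # smoothing kernel
    [-1.0, 0.0, 1.0],       # edge-like kernel
]

# Hypothetical routing parameters: one (weight, bias) pair per candidate kernel.
ROUTING = [(-1.0, 0.0), (1.0, 0.0)]


def dynamic_kernel(signal):
    """Build a kernel whose parameters depend on the input signal.

    The signal is summarized by global average pooling, the summary is turned
    into softmax gates, and the candidate kernels are mixed with those gates.
    """
    pooled = sum(signal) / len(signal)
    gates = softmax([w * pooled + b for w, b in ROUTING])
    size = len(KERNEL_BANK[0])
    return [
        sum(g * kern[j] for g, kern in zip(gates, KERNEL_BANK))
        for j in range(size)
    ]


def dynamic_conv1d(signal):
    """Convolve the signal with a kernel generated from the signal itself."""
    return conv1d(signal, dynamic_kernel(signal))


if __name__ == "__main__":
    for sig in ([5.0, 5.0, 5.0, 5.0], [-5.0, -5.0, -5.0, -5.0]):
        print(sig, "->", dynamic_kernel(sig))
```

With a fixed kernel, `conv1d` applies identical parameters to every input; in `dynamic_conv1d` two inputs with different statistics are filtered by different kernels, which is the property the abstract attributes to the dynamic part of DDConv.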
Related papers
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation [10.20771849219059]
We propose a novel hybrid architecture of convolutional neural networks (CNNs) and vision Transformers (CiT-Net) for medical image segmentation.
Our CiT-Net provides better medical image segmentation results than popular SOTA methods.
arXiv Detail & Related papers (2023-06-06T03:22:22Z)
- Optimizing Vision Transformers for Medical Image Segmentation and Few-Shot Domain Adaptation [11.690799827071606]
We propose Convolutional Swin-Unet (CS-Unet) transformer blocks and optimise their settings with relation to patch embedding, projection, the feed-forward network, up sampling and skip connections.
CS-Unet can be trained from scratch and inherits the superiority of convolutions in each feature process phase.
Experiments show that CS-Unet without pre-training surpasses other state-of-the-art counterparts by large margins on two medical CT and MRI datasets with fewer parameters.
arXiv Detail & Related papers (2022-10-14T19:18:52Z)
- ConvTransSeg: A Multi-resolution Convolution-Transformer Network for Medical Image Segmentation [14.485482467748113]
We propose a hybrid encoder-decoder segmentation model (ConvTransSeg).
It consists of a multi-layer CNN as the encoder for feature learning and the corresponding multi-level Transformer as the decoder for segmentation prediction.
Our method achieves the best performance in terms of Dice coefficient and average symmetric surface distance measures with low model complexity and memory consumption.
arXiv Detail & Related papers (2022-10-13T14:59:23Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
Transformers that need to incorporate contextual information to extract features dynamically are neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is a Unet-like pure Transformer for medical image segmentation.
The tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.