Semi-Supervised Medical Image Segmentation via Cross Teaching between
CNN and Transformer
- URL: http://arxiv.org/abs/2112.04894v1
- Date: Thu, 9 Dec 2021 13:22:38 GMT
- Title: Semi-Supervised Medical Image Segmentation via Cross Teaching between
CNN and Transformer
- Authors: Xiangde Luo, Minhao Hu, Tao Song, Guotai Wang, Shaoting Zhang
- Abstract summary: We present a framework for semi-supervised medical image segmentation by introducing the cross teaching between CNN and Transformer.
Notably, this work may be the first attempt to combine CNN and transformer for semi-supervised medical image segmentation and achieve promising results on a public benchmark.
- Score: 11.381487613753004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, deep learning with Convolutional Neural Networks (CNNs) and
Transformers has shown encouraging results in fully supervised medical image
segmentation. However, it is still challenging for them to achieve good
performance with limited annotations for training. In this work, we present a
very simple yet efficient framework for semi-supervised medical image
segmentation by introducing the cross teaching between CNN and Transformer.
Specifically, we simplify the classical deep co-training from consistency
regularization to cross teaching, where the prediction of one network is used as
the pseudo label to supervise the other network directly, end to end.
Considering the difference in learning paradigm between CNN and Transformer, we
introduce the Cross Teaching between CNN and Transformer rather than just using
CNNs. Experiments on a public benchmark show that our method outperforms eight
existing semi-supervised learning methods with just a simpler framework.
Notably, this work may be the first attempt to combine CNN and transformer for
semi-supervised medical image segmentation and achieve promising results on a
public benchmark. The code will be released at:
https://github.com/HiLab-git/SSL4MIS.
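The cross-teaching idea in the abstract can be sketched in a few lines: on unlabeled data, each network's hard pseudo label (the argmax of its softmax output) supervises the other network via a cross-entropy term. The following is a minimal NumPy sketch for illustration only, not the authors' released implementation (see the GitHub link above); the random logits stand in for real CNN and Transformer outputs, and all function names are ours.

```python
import numpy as np

def softmax(logits, axis=-1):
    # numerically stable softmax over the class axis
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_entropy(probs, labels):
    # mean negative log-likelihood of the (pseudo) labels
    n = labels.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-8).mean()

def cross_teaching_loss(cnn_logits, trans_logits):
    """Each network is supervised by the other's hard pseudo label."""
    p_cnn = softmax(cnn_logits)
    p_trans = softmax(trans_logits)
    pseudo_cnn = p_cnn.argmax(axis=-1)      # pseudo label from the CNN
    pseudo_trans = p_trans.argmax(axis=-1)  # pseudo label from the Transformer
    # CNN learns from the Transformer's pseudo label, and vice versa
    return cross_entropy(p_cnn, pseudo_trans) + cross_entropy(p_trans, pseudo_cnn)

# toy example: 4 pixels, 3 classes, random logits standing in for network outputs
rng = np.random.default_rng(0)
cnn_logits = rng.normal(size=(4, 3))
trans_logits = rng.normal(size=(4, 3))
loss = cross_teaching_loss(cnn_logits, trans_logits)
```

In practice the pseudo-label terms are applied only to unlabeled images and combined with an ordinary supervised loss on the labeled subset; no gradient flows through the argmax, so each pseudo label acts as a fixed target for the other network.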
Related papers
- CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation [60.08541107831459]
This paper proposes a CNN-Transformer rectified collaborative learning framework to learn stronger CNN-based and Transformer-based models for medical image segmentation.
Specifically, we propose a rectified logit-wise collaborative learning (RLCL) strategy which introduces the ground truth to adaptively select and rectify the wrong regions in student soft labels.
We also propose a class-aware feature-wise collaborative learning (CFCL) strategy to achieve effective knowledge transfer between CNN-based and Transformer-based models in the feature space.
arXiv Detail & Related papers (2024-08-25T01:27:35Z)
- ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation [43.24187067938417]
This paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer.
The proposed ScribFormer model has a triple-branch structure, i.e., the hybrid of a CNN branch, a Transformer branch, and an attention-guided class activation map (ACAM) branch.
arXiv Detail & Related papers (2024-02-03T04:55:22Z)
- The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need [103.31261028244782]
Vision Transformers have been rising rapidly in computer vision thanks to their outstanding scaling trends, gradually replacing convolutional neural networks (CNNs).
Recent works on self-supervised learning (SSL) introduce siamese pre-training tasks.
People come to believe that Transformers or self-attention modules are inherently more suitable than CNNs in the context of SSL.
arXiv Detail & Related papers (2023-12-09T22:23:57Z)
- CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation [10.20771849219059]
We propose a novel hybrid architecture of convolutional neural networks (CNNs) and vision Transformers (CiT-Net) for medical image segmentation.
Our CiT-Net provides better medical image segmentation results than popular SOTA methods.
arXiv Detail & Related papers (2023-06-06T03:22:22Z)
- ConvTransSeg: A Multi-resolution Convolution-Transformer Network for Medical Image Segmentation [14.485482467748113]
We propose a hybrid encoder-decoder segmentation model (ConvTransSeg)
It consists of a multi-layer CNN as the encoder for feature learning and the corresponding multi-level Transformer as the decoder for segmentation prediction.
Our method achieves the best performance in terms of Dice coefficient and average symmetric surface distance measures with low model complexity and memory consumption.
arXiv Detail & Related papers (2022-10-13T14:59:23Z)
- SegTransVAE: Hybrid CNN-Transformer with Regularization for medical image segmentation [0.0]
A novel network named SegTransVAE is proposed in this paper.
SegTransVAE is built upon an encoder-decoder architecture, adding a Transformer together with a variational autoencoder (VAE) branch to the network.
Evaluation on various recently introduced datasets shows that SegTransVAE outperforms previous methods in Dice Score and 95% Hausdorff Distance.
arXiv Detail & Related papers (2022-01-21T08:02:55Z)
- Semi-Supervised Vision Transformers [76.83020291497895]
We study the training of Vision Transformers for semi-supervised image classification.
We find that Vision Transformers perform poorly in a semi-supervised ImageNet setting.
CNNs achieve superior results in the small labeled-data regime.
arXiv Detail & Related papers (2021-11-22T09:28:13Z)
- Container: Context Aggregation Network [83.12004501984043]
Recent findings show that a simple solution without any traditional convolutional or Transformer components can produce effective visual representations.
We present Container (CONText AggregatIon NEtwoRk), a general-purpose building block for multi-head context aggregation.
In contrast to Transformer-based methods that do not scale well to downstream tasks relying on larger input image resolutions, our efficient network, named Container-Light, can be employed in object detection and instance segmentation networks.
arXiv Detail & Related papers (2021-06-02T18:09:11Z)
- Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation [63.46694853953092]
Swin-Unet is an Unet-like pure Transformer for medical image segmentation.
Tokenized image patches are fed into the Transformer-based U-shaped Encoder-Decoder architecture.
arXiv Detail & Related papers (2021-05-12T09:30:26Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Test-Time Adaptable Neural Networks for Robust Medical Image Segmentation [9.372152932156293]
Convolutional Neural Networks (CNNs) work very well for supervised learning problems.
In medical image segmentation, this premise is violated when there is a mismatch between training and test images in terms of their acquisition details.
We design the segmentation CNN as a concatenation of two sub-networks: a relatively shallow image normalization CNN, followed by a deep CNN that segments the normalized image.
arXiv Detail & Related papers (2020-04-09T16:57:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.