A Recent Survey of Vision Transformers for Medical Image Segmentation
- URL: http://arxiv.org/abs/2312.00634v2
- Date: Tue, 19 Dec 2023 03:49:48 GMT
- Title: A Recent Survey of Vision Transformers for Medical Image Segmentation
- Authors: Asifullah Khan, Zunaira Rauf, Abdul Rehman Khan, Saima Rathore, Saddam
Hussain Khan, Najmus Saher Shah, Umair Farooq, Hifsa Asif, Aqsa Asif, Umme
Zahoora, Rafi Ullah Khalil, Suleman Qamar, Umme Hani Asif, Faiza Babar Khan,
Abdul Majid and Jeonghwan Gwak
- Abstract summary: Vision Transformers (ViTs) have emerged as a promising technique for addressing the challenges in medical image segmentation.
Their multi-scale attention mechanism enables effective modeling of long-range dependencies between distant structures.
Recently, researchers have come up with various ViT-based approaches that incorporate CNNs in their architectures, known as Hybrid Vision Transformers (HVTs).
- Score: 2.4895533667182703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image segmentation plays a crucial role in various healthcare
applications, enabling accurate diagnosis, treatment planning, and disease
monitoring. Traditionally, convolutional neural networks (CNNs) dominated this
domain, excelling at local feature extraction. However, their limitations in
capturing long-range dependencies across image regions pose challenges for
segmenting complex, interconnected structures often encountered in medical
data. In recent years, Vision Transformers (ViTs) have emerged as a promising
technique for addressing the challenges in medical image segmentation. Their
multi-scale attention mechanism enables effective modeling of long-range
dependencies between distant structures, crucial for segmenting organs or
lesions spanning the image. Additionally, ViTs' ability to discern subtle
pattern heterogeneity allows for the precise delineation of intricate
boundaries and edges, a critical aspect of accurate medical image segmentation.
However, they do lack image-related inductive bias and translational
invariance, potentially impacting their performance. Recently, researchers have
come up with various ViT-based approaches that incorporate CNNs in their
architectures, known as Hybrid Vision Transformers (HVTs), to capture local
correlation in addition to the global information in the images. This survey
paper provides a detailed review of the recent advancements in ViTs and HVTs
for medical image segmentation. Along with the categorization of ViT and
HVT-based medical image segmentation approaches, we also present a detailed
overview of their real-time applications in several medical image modalities.
This survey may serve as a valuable resource for researchers, healthcare
practitioners, and students in understanding the state-of-the-art approaches
for ViT-based medical image segmentation.
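
As a rough illustration of the long-range modeling the abstract describes, the sketch below splits a tiny image into patches and applies single-head scaled dot-product self-attention. It is not taken from the survey: the helper names (`split_into_patches`, `self_attention`), the toy 4x4 image, and the use of identity query/key/value projections are all assumptions made for brevity, and the multi-scale aspect mentioned in the abstract is not modeled here.

```python
import math


def split_into_patches(image, patch):
    """Split a 2D list (H x W) into flattened, non-overlapping patch vectors."""
    height, width = len(image), len(image[0])
    patches = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            patches.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return patches


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real ViT learns these projections.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Every patch is scored against every other patch, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * tokens[j][i] for j in range(len(tokens))) for i in range(dim)]
        )
    return outputs, weights


# Toy 4x4 "image" (hypothetical data) split into four 2x2 patches.
image = [[float((r * 4 + c) % 3) for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2)
outputs, weights = self_attention(patches)
```

Each row of `weights` is a distribution over all patches, so the top-left patch receives a nonzero contribution from the bottom-right one in a single layer. A plain convolution with a small kernel would need several stacked layers to connect the same two regions, which is the locality that hybrid designs aim to combine with this global view.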
Related papers
- Med-TTT: Vision Test-Time Training model for Medical Image Segmentation [5.318153305245246]
We propose Med-TTT, a visual backbone network integrated with Test-Time Training layers.
The model achieves leading performance in terms of accuracy, sensitivity, and Dice coefficient.
arXiv Detail & Related papers (2024-10-03T14:29:46Z)
- Scribble-Based Interactive Segmentation of Medical Hyperspectral Images [4.675955891956077]
This work introduces a scribble-based interactive segmentation framework for medical hyperspectral images.
The proposed method utilizes deep learning for feature extraction and a geodesic distance map generated from user-provided scribbles.
arXiv Detail & Related papers (2024-08-05T12:33:07Z)
- Advancing Medical Image Segmentation: Morphology-Driven Learning with Diffusion Transformer [4.672688418357066]
We propose a novel Transformer Diffusion (DTS) model for robust segmentation in the presence of noise.
Our model, which analyzes the morphological representation of images, shows better results than the previous models in various medical imaging modalities.
arXiv Detail & Related papers (2024-08-01T07:35:54Z)
- From CNN to Transformer: A Review of Medical Image Segmentation Models [7.3150850275578145]
Deep learning for medical image segmentation has become a prevalent trend.
In this paper, we conduct a survey of the most representative four medical image segmentation models in recent years.
We theoretically analyze the characteristics of these models and quantitatively evaluate their performance on two benchmark datasets.
arXiv Detail & Related papers (2023-08-10T02:48:57Z)
- A hybrid approach for improving U-Net variants in medical image segmentation [0.0]
The technique of splitting a medical image into various segments or regions of interest is known as medical image segmentation.
The segmented images that are produced can be used for many different things, including diagnosis, surgery planning, and therapy evaluation.
This research aims to reduce the network parameter requirements using depthwise separable convolutions.
arXiv Detail & Related papers (2023-07-31T07:43:45Z)
- Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z)
- AlignTransformer: Hierarchical Alignment of Visual Regions and Disease Tags for Medical Report Generation [50.21065317817769]
We propose an AlignTransformer framework, which includes the Align Hierarchical Attention (AHA) and the Multi-Grained Transformer (MGT) modules.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that the AlignTransformer can achieve results competitive with state-of-the-art methods on the two datasets.
arXiv Detail & Related papers (2022-03-18T13:43:53Z)
- Transformers in Medical Imaging: A Survey [88.03790310594533]
Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results.
Medical imaging has also witnessed growing interest for Transformers that can capture global context compared to CNNs with local receptive fields.
We provide a review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues.
arXiv Detail & Related papers (2022-01-24T18:50:18Z)
- Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.