Multi-Feature Vision Transformer via Self-Supervised Representation
Learning for Improvement of COVID-19 Diagnosis
- URL: http://arxiv.org/abs/2208.01843v1
- Date: Wed, 3 Aug 2022 05:02:47 GMT
- Authors: Xiao Qi, David J. Foran, John L. Nosher, Ilker Hacihaliloglu
- Abstract summary: We study the effectiveness of self-supervised learning in the context of diagnosing COVID-19 disease from CXR images.
We deploy a cross-attention mechanism to learn information from both original CXR images and corresponding enhanced local phase CXR images.
We demonstrate the performance of the baseline self-supervised learning models can be further improved by leveraging the local phase-based enhanced CXR images.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The role of chest X-ray (CXR) imaging, due to being more cost-effective,
widely available, and having a faster acquisition time compared to CT, has
evolved during the COVID-19 pandemic. To improve the diagnostic performance of
CXR imaging a growing number of studies have investigated whether supervised
deep learning methods can provide additional support. However, supervised
methods rely on a large number of labeled radiology images, which is a
time-consuming and complex procedure requiring expert clinician input. Due to
the relative scarcity of COVID-19 patient data and the costly labeling process,
self-supervised learning methods have gained momentum and have been proposed,
achieving results comparable to fully supervised learning approaches. In this
work, we study the effectiveness of self-supervised learning in the context of
diagnosing COVID-19 disease from CXR images. We propose a multi-feature Vision
Transformer (ViT) guided architecture where we deploy a cross-attention
mechanism to learn information from both original CXR images and corresponding
enhanced local phase CXR images. We demonstrate the performance of the baseline
self-supervised learning models can be further improved by leveraging the local
phase-based enhanced CXR images. Using 10% labeled CXR scans, the proposed
model achieves 91.10% and 96.21% overall accuracy on a total of 35,483 CXR
images of healthy (8,851), regular pneumonia (6,045), and COVID-19 (18,159)
scans, showing a significant improvement over state-of-the-art techniques. Code
is available at https://github.com/endiqq/Multi-Feature-ViT
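The abstract describes fusing two input streams (original CXR patches and enhanced local-phase CXR patches) with a cross-attention mechanism. The sketch below is a minimal, illustrative single-head cross-attention in NumPy, not the authors' implementation: queries come from one stream, keys and values from the other, and the projection matrices are random stand-ins for weights a ViT would learn during training. All function and variable names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(tokens_a, tokens_b, seed=0):
    """Attend from tokens_a (queries) to tokens_b (keys/values).

    tokens_a: (n_a, d) patch embeddings, e.g. from the original CXR image
    tokens_b: (n_b, d) patch embeddings, e.g. from the enhanced local-phase image
    Returns:  (n_a, d) fused representation of stream A informed by stream B.
    """
    d = tokens_a.shape[-1]
    rng = np.random.default_rng(seed)
    # Random projections for illustration; a trained model learns these.
    W_q = rng.standard_normal((d, d)) / np.sqrt(d)
    W_k = rng.standard_normal((d, d)) / np.sqrt(d)
    W_v = rng.standard_normal((d, d)) / np.sqrt(d)
    Q, K, V = tokens_a @ W_q, tokens_b @ W_k, tokens_b @ W_v
    scores = Q @ K.T / np.sqrt(d)        # (n_a, n_b) scaled dot-product scores
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V                   # mix values from the other stream

# Toy example: 4 patch tokens per stream, embedding dimension 8.
orig = np.random.default_rng(1).standard_normal((4, 8))
enh = np.random.default_rng(2).standard_normal((4, 8))
fused = cross_attention(orig, enh)
```

Swapping which stream supplies the queries gives the symmetric direction (enhanced attending to original); a bidirectional design would compute both and combine them.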