Lung Disease Detection with Vision Transformers: A Comparative Study of Machine Learning Methods
- URL: http://arxiv.org/abs/2411.11376v1
- Date: Mon, 18 Nov 2024 08:40:25 GMT
- Title: Lung Disease Detection with Vision Transformers: A Comparative Study of Machine Learning Methods
- Authors: Baljinnyam Dayan,
- Abstract summary: This study explores the application of Vision Transformers (ViT), a state-of-the-art architecture in machine learning, to chest X-ray analysis.
I present a comparative analysis of two ViT-based approaches: one utilizing full chest X-ray images and another focusing on segmented lung regions.
- Score: 0.0
- License:
- Abstract: Recent advancements in medical image analysis have predominantly relied on Convolutional Neural Networks (CNNs), achieving impressive performance in chest X-ray classification tasks, such as the 92% AUC reported by AutoThorax-Net and the 88% AUC achieved by ChexNet in classifcation tasks. However, in the medical field, even small improvements in accuracy can have significant clinical implications. This study explores the application of Vision Transformers (ViT), a state-of-the-art architecture in machine learning, to chest X-ray analysis, aiming to push the boundaries of diagnostic accuracy. I present a comparative analysis of two ViT-based approaches: one utilizing full chest X-ray images and another focusing on segmented lung regions. Experiments demonstrate that both methods surpass the performance of traditional CNN-based models, with the full-image ViT achieving up to 97.83% accuracy and the lung-segmented ViT reaching 96.58% accuracy in classifcation of diseases on three label and AUC of 94.54% when label numbers are increased to eight. Notably, the full-image approach showed superior performance across all metrics, including precision, recall, F1 score, and AUC-ROC. These findings suggest that Vision Transformers can effectively capture relevant features from chest X-rays without the need for explicit lung segmentation, potentially simplifying the preprocessing pipeline while maintaining high accuracy. This research contributes to the growing body of evidence supporting the efficacy of transformer-based architectures in medical image analysis and highlights their potential to enhance diagnostic precision in clinical settings.
Related papers
- InfLocNet: Enhanced Lung Infection Localization and Disease Detection from Chest X-Ray Images Using Lightweight Deep Learning [0.5242869847419834]
This paper presents a novel, lightweight deep learning based segmentation-classification network.
It is designed to enhance the detection and localization of lung infections using chest X-ray images.
Our model achieves remarkable results with an Intersection over Union (IoU) of 93.59% and a Dice Similarity Coefficient (DSC) of 97.61% in lung area segmentation.
arXiv Detail & Related papers (2024-08-12T19:19:23Z) - Swin-Tempo: Temporal-Aware Lung Nodule Detection in CT Scans as Video
Sequences Using Swin Transformer-Enhanced UNet [2.7547288571938795]
We present an innovative model that harnesses the strengths of both convolutional neural networks and vision transformers.
Inspired by object detection in videos, we treat each 3D CT image as a video, individual slices as frames, and lung nodules as objects, enabling a time-series application.
arXiv Detail & Related papers (2023-10-05T07:48:55Z) - HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic
Joint Infection Diagnosis Using CT Images and Text [0.0]
Prosthetic Joint Infection (PJI) is a prevalent and severe complication.
Currently, a unified diagnostic standard incorporating both computed tomography (CT) images and numerical text data for PJI remains unestablished.
This study introduces a diagnostic method, HGT, based on deep learning and multimodal techniques.
arXiv Detail & Related papers (2023-05-29T11:25:57Z) - Attention-based Saliency Maps Improve Interpretability of Pneumothorax
Classification [52.77024349608834]
To investigate chest radiograph (CXR) classification performance of vision transformers (ViT) and interpretability of attention-based saliency.
ViTs were fine-tuned for lung disease classification using four public data sets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData.
ViTs had comparable CXR classification AUCs compared with state-of-the-art CNNs.
arXiv Detail & Related papers (2023-03-03T12:05:41Z) - Data-Efficient Vision Transformers for Multi-Label Disease
Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z) - Efficient Lung Cancer Image Classification and Segmentation Algorithm
Based on Improved Swin Transformer [0.0]
transformer model has been applied to the field of computer vision (CV) after its success in natural language processing (NLP)
This paper creatively proposes a segmentation method based on efficient transformer and applies it to medical image analysis.
The algorithm completes the task of lung cancer classification and segmentation by analyzing lung cancer data, and aims to provide efficient technical support for medical staff.
arXiv Detail & Related papers (2022-07-04T15:50:06Z) - EMT-NET: Efficient multitask network for computer-aided diagnosis of
breast cancer [58.720142291102135]
We propose an efficient and light-weighted learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification is 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z) - Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were obtained in sub-fractures with the largest and richest dataset ever.
arXiv Detail & Related papers (2021-08-07T10:12:42Z) - Variational Knowledge Distillation for Disease Classification in Chest
X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - Chest X-ray Image Phase Features for Improved Diagnosis of COVID-19
Using Convolutional Neural Network [2.752817022620644]
Recent research has shown radiography of COVID-19 patient contains salient information about the COVID-19 virus.
Chest X-ray (CXR) due to its faster imaging time, wide availability, low cost and portability gains much attention.
In this study, we design a novel multi-feature convolutional neural network (CNN) architecture for improved classification of COVID-19 from CXR images.
arXiv Detail & Related papers (2020-11-06T20:26:26Z) - A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced
Cardiac Magnetic Resonance Imaging [90.29017019187282]
" 2018 Left Atrium Challenge" using 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset.
Analyse of the submitted algorithms using technical and biological metrics was performed.
Results show the top method achieved a dice score of 93.2% and a mean surface to a surface distance of 0.7 mm.
arXiv Detail & Related papers (2020-04-26T08:49:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.