Evaluation of Vision Transformers for Multimodal Image Classification: A Case Study on Brain, Lung, and Kidney Tumors
- URL: http://arxiv.org/abs/2502.05517v1
- Date: Sat, 08 Feb 2025 10:35:51 GMT
- Title: Evaluation of Vision Transformers for Multimodal Image Classification: A Case Study on Brain, Lung, and Kidney Tumors
- Authors: Óscar A. Martín, Javier Sánchez,
- Abstract summary: This work evaluates the performance of Vision Transformers architectures, including Swin Transformer and MaxViT, in several datasets.
We used three training sets of images with brain, lung, and kidney tumors.
Swin Transformer provided high accuracy, achieving up to 99.9% for kidney tumor classification and 99.3% accuracy in a combined dataset.
- Score: 0.0
- License:
- Abstract: Neural networks have become the standard technique for medical diagnostics, especially in cancer detection and classification. This work evaluates the performance of Vision Transformers architectures, including Swin Transformer and MaxViT, in several datasets of magnetic resonance imaging (MRI) and computed tomography (CT) scans. We used three training sets of images with brain, lung, and kidney tumors. Each dataset includes different classification labels, from brain gliomas and meningiomas to benign and malignant lung conditions and kidney anomalies such as cysts and cancers. This work aims to analyze the behavior of the neural networks in each dataset and the benefits of combining different image modalities and tumor classes. We designed several experiments by fine-tuning the models on combined and individual image modalities. The results revealed that the Swin Transformer provided high accuracy, achieving up to 99.9\% for kidney tumor classification and 99.3\% accuracy in a combined dataset. MaxViT also provided excellent results in individual datasets but performed poorly when data is combined. This research highlights the adaptability of Transformer-based models to various image modalities and features. However, challenges persist, including limited annotated data and interpretability issues. Future works will expand this study by incorporating other image modalities and enhancing diagnostic capabilities. Integrating these models across diverse datasets could mark a pivotal advance in precision medicine, paving the way for more efficient and comprehensive healthcare solutions.
Related papers
- An Ensemble Approach for Brain Tumor Segmentation and Synthesis [0.12777007405746044]
The integration of machine learning in magnetic resonance imaging (MRI) is proving to be incredibly effective.
Deep learning models utilize multiple layers of processing to capture intricate details of complex data.
We propose a deep learning framework that ensembles state-of-the-art architectures to achieve accurate segmentation.
arXiv Detail & Related papers (2024-11-26T17:28:51Z) - MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging [16.325123491357203]
We propose a multimodal pre-training framework that jointly incorporates genomics and medical images for downstream tasks.
We align medical images and genes using a self-supervised contrastive learning approach which combines the Mamba as a genetic encoder and the Vision Transformer (ViT) as a medical image encoder.
arXiv Detail & Related papers (2024-06-02T06:20:45Z) - Cross-Modal Domain Adaptation in Brain Disease Diagnosis: Maximum Mean Discrepancy-based Convolutional Neural Networks [0.0]
Brain disorders are a major challenge to global health, causing millions of deaths each year.
Accurate diagnosis of these diseases relies heavily on advanced medical imaging techniques such as MRI and CT.
The scarcity of annotated data poses a significant challenge in deploying machine learning models for medical diagnosis.
arXiv Detail & Related papers (2024-05-06T07:44:46Z) - QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge [93.61262892578067]
Uncertainty in medical image segmentation tasks, especially inter-rater variability, presents a significant challenge.
This variability directly impacts the development and evaluation of automated segmentation algorithms.
We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ)
arXiv Detail & Related papers (2024-03-19T17:57:24Z) - Automated ensemble method for pediatric brain tumor segmentation [0.0]
This study introduces a novel ensemble approach using ONet and modified versions of UNet.
Data augmentation ensures robustness and accuracy across different scanning protocols.
Results indicate that this advanced ensemble approach offers promising prospects for enhanced diagnostic accuracy.
arXiv Detail & Related papers (2023-08-14T15:29:32Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask
CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the celluar graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Deep Learning models for benign and malign Ocular Tumor Growth
Estimation [3.1558405181807574]
Clinicians often face issues in selecting suitable image processing algorithm for medical imaging data.
A strategy for the selection of a proper model is presented here.
arXiv Detail & Related papers (2021-07-09T05:40:25Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - SAG-GAN: Semi-Supervised Attention-Guided GANs for Data Augmentation on
Medical Images [47.35184075381965]
We present a data augmentation method for generating synthetic medical images using cycle-consistency Generative Adversarial Networks (GANs)
The proposed GANs-based model can generate a tumor image from a normal image, and in turn, it can also generate a normal image from a tumor image.
We train the classification model using real images with classic data augmentation methods and classification models using synthetic images.
arXiv Detail & Related papers (2020-11-15T14:01:24Z) - Improved Slice-wise Tumour Detection in Brain MRIs by Computing
Dissimilarities between Latent Representations [68.8204255655161]
Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods.
We have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder.
We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines.
arXiv Detail & Related papers (2020-07-24T14:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.