Efficiently Training Vision Transformers on Structural MRI Scans for
Alzheimer's Disease Detection
- URL: http://arxiv.org/abs/2303.08216v1
- Date: Tue, 14 Mar 2023 20:18:12 GMT
- Title: Efficiently Training Vision Transformers on Structural MRI Scans for
Alzheimer's Disease Detection
- Authors: Nikhil J. Dhinagar, Sophia I. Thomopoulos, Emily Laltoo and Paul M.
Thompson
- Abstract summary: Vision transformers (ViT) have emerged in recent years as an alternative to CNNs for several computer vision applications.
We tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty.
We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic and real MRI scans.
- Score: 2.359557447960552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neuroimaging of large populations is valuable to identify factors that
promote or resist brain disease, and to assist diagnosis, subtyping, and
prognosis. Data-driven models such as convolutional neural networks (CNNs) have
increasingly been applied to brain images to perform diagnostic and prognostic
tasks by learning robust features. Vision transformers (ViT) - a new class of
deep learning architectures - have emerged in recent years as an alternative to
CNNs for several computer vision applications. Here we tested variants of the
ViT architecture for a range of desired neuroimaging downstream tasks based on
difficulty, in this case for sex and Alzheimer's disease (AD) classification
based on 3D brain MRI. In our experiments, two vision transformer architecture
variants achieved an AUC of 0.987 for sex and 0.892 for AD classification,
respectively. We independently evaluated our models on data from two benchmark
AD datasets. We achieved a performance boost of 5% and 9-10% upon fine-tuning
vision transformer models pre-trained on synthetic (generated by a latent
diffusion model) and real MRI scans, respectively. Our main contributions
include testing the effects of different ViT training strategies including
pre-training, data augmentation and learning rate warm-ups followed by
annealing, as pertaining to the neuroimaging domain. These techniques are
essential for training ViT-like models for neuroimaging applications where
training data is usually limited. We also analyzed the effect of the amount of
training data utilized on the test-time performance of the ViT via data-model
scaling curves.
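One of the training strategies the abstract highlights, a learning-rate warm-up followed by annealing, can be sketched as a simple schedule function. This is a minimal illustration of the general technique (linear warm-up into cosine annealing), not the paper's actual schedule; all step counts and rates below are assumed for the example.

```python
import math

def lr_at_step(step, warmup_steps=500, total_steps=10_000,
               base_lr=1e-4, min_lr=1e-6):
    """Return the learning rate for a given training step.

    Linear warm-up from 0 to base_lr over warmup_steps, then cosine
    annealing from base_lr down to min_lr over the remaining steps.
    Hyperparameter values here are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear warm-up: ramp the rate up to avoid unstable early updates.
        return base_lr * (step + 1) / warmup_steps
    # Cosine annealing: smoothly decay from base_lr toward min_lr.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In a training loop, such a function would typically be wrapped in a framework scheduler (e.g. via a lambda-based scheduler) so the optimizer's rate is updated each step.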
Related papers
- Self-Supervised Pretext Tasks for Alzheimer's Disease Classification using 3D Convolutional Neural Networks on Large-Scale Synthetic Neuroimaging Dataset [11.173478552040441]
Alzheimer's Disease (AD) induces both localised and widespread neural degenerative changes throughout the brain.
In this work, we evaluated several unsupervised methods to train a feature extractor for downstream AD vs. CN classification.
arXiv Detail & Related papers (2024-06-20T11:26:32Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Video and Synthetic MRI Pre-training of 3D Vision Architectures for Neuroimage Analysis [3.208731414009847]
Transfer learning involves pre-training deep learning models on a large corpus of data for adaptation to specific tasks.
We benchmarked vision transformers (ViTs) and convolutional neural networks (CNNs) with varied upstream pre-training approaches.
The resulting pre-trained models can be adapted to a range of downstream tasks, even when training data for the target task is limited.
arXiv Detail & Related papers (2023-09-09T00:33:23Z)
- Defect Classification in Additive Manufacturing Using CNN-Based Vision Processing [76.72662577101988]
This paper examines two scenarios: first, using convolutional neural networks (CNNs) to accurately classify defects in an image dataset from AM and second, applying active learning techniques to the developed classification model.
This allows the construction of a human-in-the-loop mechanism that reduces the amount of data required for training and helps generate training data.
arXiv Detail & Related papers (2023-07-14T14:36:58Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Feature robustness and sex differences in medical imaging: a case study in MRI-based Alzheimer's disease detection [1.7616042687330637]
We compare two classification schemes on the ADNI MRI dataset.
We do not find a strong dependence of model performance for male and female test subjects on the sex composition of the training dataset.
arXiv Detail & Related papers (2022-04-04T17:37:54Z)
- Evaluating U-net Brain Extraction for Multi-site and Longitudinal Preclinical Stroke Imaging [0.4310985013483366]
Convolutional neural networks (CNNs) can improve accuracy and reduce operator time.
We developed a deep-learning mouse brain extraction tool by using a U-net CNN.
We trained, validated, and tested a typical U-net model on 240 multimodal MRI datasets.
arXiv Detail & Related papers (2022-03-11T02:00:27Z)
- Evaluating deep transfer learning for whole-brain cognitive decoding [11.898286908882561]
Transfer learning (TL) is well-suited to improve the performance of deep learning (DL) models in datasets with small numbers of samples.
Here, we evaluate TL for the application of DL models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data.
arXiv Detail & Related papers (2021-11-01T15:44:49Z)
- Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build the domain irrelevant latent space image representation and demonstrate this method to outperform existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)
- Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called a Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.