DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET
- URL: http://arxiv.org/abs/2410.23219v1
- Date: Wed, 30 Oct 2024 17:11:00 GMT
- Title: DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET
- Authors: Yitong Li, Morteza Ghahremani, Youssef Wally, Christian Wachinger
- Abstract summary: We propose a novel framework, DiaMond, to integrate MRI and PET.
DiaMond is equipped with self-attention and a novel bi-attention mechanism that synergistically combine MRI and PET.
It significantly outperforms existing multi-modal methods across various datasets.
- Score: 9.229658208994675
- Abstract: Diagnosing dementia, particularly for Alzheimer's Disease (AD) and frontotemporal dementia (FTD), is complex due to overlapping symptoms. While magnetic resonance imaging (MRI) and positron emission tomography (PET) data are critical for the diagnosis, integrating these modalities in deep learning faces challenges, often resulting in suboptimal performance compared to using single modalities. Moreover, the potential of multi-modal approaches in differential diagnosis, which holds significant clinical importance, remains largely unexplored. We propose a novel framework, DiaMond, to address these issues with vision Transformers to effectively integrate MRI and PET. DiaMond is equipped with self-attention and a novel bi-attention mechanism that synergistically combine MRI and PET, alongside a multi-modal normalization to reduce redundant dependency, thereby boosting the performance. DiaMond significantly outperforms existing multi-modal methods across various datasets, achieving a balanced accuracy of 92.4% in AD diagnosis, 65.2% for AD-MCI-CN classification, and 76.5% in differential diagnosis of AD and FTD. We also validated the robustness of DiaMond in a comprehensive ablation study. The code is available at https://github.com/ai-med/DiaMond.
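The abstract describes the bi-attention mechanism only at a high level. The sketch below is one plausible, hypothetical reading: cross-modal attention in both directions between MRI and PET patch tokens. The class name BiAttention, the dimensions, and the per-modality LayerNorm standing in for the paper's multi-modal normalization are all assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BiAttention(nn.Module):
    """Cross-modal attention in both directions: MRI queries attend to PET
    keys/values and vice versa. Purely illustrative; not the DiaMond code."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.mri_to_pet = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.pet_to_mri = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Per-modality LayerNorm stands in for the paper's multi-modal
        # normalization; its exact form is an assumption here.
        self.norm_mri = nn.LayerNorm(dim)
        self.norm_pet = nn.LayerNorm(dim)

    def forward(self, mri_tokens, pet_tokens):
        # mri_tokens, pet_tokens: (batch, num_patches, dim)
        mri_attn, _ = self.mri_to_pet(self.norm_mri(mri_tokens),
                                      pet_tokens, pet_tokens)
        pet_attn, _ = self.pet_to_mri(self.norm_pet(pet_tokens),
                                      mri_tokens, mri_tokens)
        return mri_tokens + mri_attn, pet_tokens + pet_attn

mri = torch.randn(2, 64, 256)   # toy patch embeddings
pet = torch.randn(2, 64, 256)
fused_mri, fused_pet = BiAttention(dim=256)(mri, pet)
print(fused_mri.shape, fused_pet.shape)  # torch.Size([2, 64, 256]) each
```

Per the abstract, DiaMond pairs such cross-modal blocks with per-modality self-attention; the GitHub repository linked above contains the definitive architecture.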
Related papers
- MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models [49.765466293296186]
Recent progress in Medical Large Vision-Language Models (Med-LVLMs) has opened up new possibilities for interactive diagnostic tools.
Med-LVLMs often suffer from factual hallucination, which can lead to incorrect diagnoses.
We propose a versatile multimodal RAG system, MMed-RAG, designed to enhance the factuality of Med-LVLMs.
arXiv Detail & Related papers (2024-10-16T23:03:27Z)
- GFE-Mamba: Mamba-based AD Multi-modal Progression Assessment via Generative Feature Extraction from MCI [5.355943545567233]
Alzheimer's Disease (AD) is an irreversible neurodegenerative disorder that often progresses from Mild Cognitive Impairment (MCI).
We introduce GFE-Mamba, a classifier based on Generative Feature Extraction (GFE).
It integrates data from assessment scales, MRI, and PET, enabling deeper multimodal fusion.
Our experimental results demonstrate that the GFE-Mamba model is effective in predicting the conversion from MCI to AD.
arXiv Detail & Related papers (2024-07-22T15:22:33Z)
- Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks [40.986069119392944]
We propose MX-ARM, a multimodal MiXture-of-experts Alignment and Reconstruction Model.
It is modality-detachable and exchangeable, dynamically allocating different multi-layer perceptrons ("mixture of experts") through learnable weights to learn modality-specific representations.
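As a rough illustration of that allocation pattern, the sketch below gates a set of MLP experts with learnable softmax weights. GatedExperts, the layer sizes, and the routing scheme are assumptions for illustration, not the MX-ARM code.

```python
import torch
import torch.nn as nn

class GatedExperts(nn.Module):
    """Softmax-gated mixture of MLP experts: one plausible reading of
    'allocating different MLPs dynamically through learnable weights'."""

    def __init__(self, dim: int, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)  # learnable routing weights

    def forward(self, x):  # x: (batch, dim) features from one modality
        weights = self.gate(x).softmax(dim=-1)            # (batch, num_experts)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)

feats = torch.randn(8, 128)  # toy modality features
print(GatedExperts(128)(feats).shape)  # torch.Size([8, 128])
```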
arXiv Detail & Related papers (2024-03-29T08:47:49Z)
- Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumors are among the most fatal cancers worldwide and are common in both children and the elderly.
We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z)
- A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders [11.622160966334745]
We utilize both MRI and fMRI data and propose a multimodal diagnosis model for bipolar disorder.
Our proposed method outperforms others on the OpenfMRI dataset, improving balanced accuracy from 0.657 to 0.732.
arXiv Detail & Related papers (2024-01-15T10:11:19Z)
- Unsupervised Anomaly Detection using Aggregated Normative Diffusion [46.24703738821696]
Unsupervised anomaly detection (UAD) has the potential to identify a broader spectrum of anomalies.
Existing state-of-the-art UAD approaches do not generalise well to diverse types of anomalies.
We introduce a new UAD method named Aggregated Normative Diffusion (ANDi).
arXiv Detail & Related papers (2023-12-04T14:02:56Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis [25.958167380664083]
We propose a novel tensor-based multi-modality feature selection and regression method for diagnosis and biomarker identification of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI).
We present the practical advantages of our method for the analysis of ADNI data using three imaging modalities.
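The tensor formulation itself is not given in this summary. As a deliberately simplified stand-in, the sketch below performs sparse (Lasso) feature selection and regression over concatenated modality features, the basic pattern that tensor-based methods generalize; all data and names here are synthetic and illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Toy stand-ins for per-subject features from three imaging modalities.
mri, pet_a, pet_b = (rng.normal(size=(100, 20)) for _ in range(3))
X = np.hstack([mri, pet_a, pet_b])  # (subjects, 60 features)
# Synthetic target driven by one feature per modality plus noise.
y = X[:, [0, 25, 50]].sum(axis=1) + 0.1 * rng.normal(size=100)

model = Lasso(alpha=0.05).fit(X, y)
selected = np.flatnonzero(model.coef_)  # indices of surviving features
print(selected)  # sparse set, ideally close to {0, 25, 50}
```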
arXiv Detail & Related papers (2022-09-23T02:17:27Z)
- Is a PET all you need? A multi-modal study for Alzheimer's disease using 3D CNNs [3.678164468512092]
Alzheimer's Disease (AD) is the most common form of dementia and is often difficult to diagnose due to its multifactorial etiology.
Recent works on neuroimaging-based computer-aided diagnosis with deep neural networks (DNNs) showed that fusing structural magnetic resonance images (sMRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) leads to improved accuracy in a study population of healthy controls and subjects with AD.
We argue that future work on multi-modal fusion should systematically assess the contribution of individual modalities following our proposed evaluation framework.
arXiv Detail & Related papers (2022-07-05T14:55:56Z)
- Subgroup discovery of Parkinson's Disease by utilizing a multi-modal smart device system [63.20765930558542]
We used smartwatches and smartphones to collect multi-modal data from 504 participants, including PD patients, differential diagnosis (DD) patients, and healthy controls (HC).
We were able to show that by combining various modalities, classification accuracy improved and further PD clusters were discovered.
arXiv Detail & Related papers (2022-05-12T08:59:57Z)
- Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest federated ML study to date, involving data from 71 healthcare institutions across 6 continents.
We generate an automatic tumor boundary detector for the rare disease of glioblastoma.
We demonstrate a 33% improvement over a publicly trained model in delineating the surgically targetable tumor, and a 23% improvement for the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z)
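The summary above does not specify the aggregation rule; the sketch below shows one round of federated averaging (FedAvg), the standard baseline such studies typically build on, assumed here purely for illustration and not taken from the study's code.

```python
import torch

def fedavg(state_dicts, weights):
    """Weighted average of client model parameters (one FedAvg round).
    weights: per-client sample counts. Hypothetical helper for illustration."""
    total = sum(weights)
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(w * sd[key].float()
                       for w, sd in zip(weights, state_dicts)) / total
    return avg

# Toy usage: three 'institutions' sharing an identical model architecture.
clients = [torch.nn.Linear(4, 2).state_dict() for _ in range(3)]
global_state = fedavg(clients, weights=[120, 80, 200])
print({k: v.shape for k, v in global_state.items()})
```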
This list is automatically generated from the titles and abstracts of the papers on this site.