An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis
- URL: http://arxiv.org/abs/2404.00144v1
- Date: Fri, 29 Mar 2024 20:32:30 GMT
- Title: An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis
- Authors: Ziyu Zhou, Anton Orlichenko, Gang Qu, Zening Fu, Vince D Calhoun, Zhengming Ding, Yu-Ping Wang
- Abstract summary: We propose a novel Cross-Attentive Multi-modal Fusion framework (CAMF) to capture both intra-modal and inter-modal relationships between fMRI and sMRI.
Our approach significantly improves classification accuracy, as demonstrated by our evaluations on two extensive multi-modal brain imaging datasets.
The gradient-guided Score-CAM is applied to interpret critical functional networks and brain regions involved in schizophrenia.
- Score: 46.58592655409785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Both functional and structural magnetic resonance imaging (fMRI and sMRI) are widely used for the diagnosis of mental disorder. However, combining complementary information from these two modalities is challenging due to their heterogeneity. Many existing methods fall short of capturing the interaction between these modalities, frequently defaulting to a simple combination of latent features. In this paper, we propose a novel Cross-Attentive Multi-modal Fusion framework (CAMF), which aims to capture both intra-modal and inter-modal relationships between fMRI and sMRI, enhancing multi-modal data representation. Specifically, our CAMF framework employs self-attention modules to identify interactions within each modality while cross-attention modules identify interactions between modalities. Subsequently, our approach optimizes the integration of latent features from both modalities. This approach significantly improves classification accuracy, as demonstrated by our evaluations on two extensive multi-modal brain imaging datasets, where CAMF consistently outperforms existing methods. Furthermore, the gradient-guided Score-CAM is applied to interpret critical functional networks and brain regions involved in schizophrenia. The bio-markers identified by CAMF align with established research, potentially offering new insights into the diagnosis and pathological endophenotypes of schizophrenia.
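The abstract describes self-attention within each modality followed by cross-attention between modalities and a fusion of the resulting latent features. The following is a minimal sketch of that general pattern, not the authors' implementation. It uses plain scaled dot-product attention without learned query/key/value projections, and the token counts, feature size, mean pooling, concatenation step, and all function names are assumptions chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(queries[0])
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights


def random_tokens(n, d, rng):
    """Stand-in features; real inputs would come from fMRI/sMRI encoders."""
    return [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]


def mean_pool(tokens):
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


rng = random.Random(0)
fmri = random_tokens(4, 8, rng)  # assumed: 4 fMRI tokens with 8 features each
smri = random_tokens(6, 8, rng)  # assumed: 6 sMRI tokens with 8 features each

# Intra-modal step: each modality attends to itself.
fmri_self, _ = attention(fmri, fmri, fmri)
smri_self, _ = attention(smri, smri, smri)

# Inter-modal step: queries from one modality, keys/values from the other.
fmri_cross, w_f2s = attention(fmri_self, smri_self, smri_self)
smri_cross, w_s2f = attention(smri_self, fmri_self, fmri_self)

# Assumed fusion: pool each stream and concatenate into one feature vector.
fused = mean_pool(fmri_cross) + mean_pool(smri_cross)
print(len(fused))  # 16
```

Each row of `w_f2s` is a probability distribution over the sMRI tokens for one fMRI token, which is the kind of inter-modal weighting a cross-attention module produces. A classifier would then take `fused` as input.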
Related papers
- A Unified Framework for Synthesizing Multisequence Brain MRI via Hybrid Fusion [4.47838172826189]
We propose a novel unified framework for synthesizing multisequence MR images, called Hybrid Fusion GAN (HF-GAN).
We introduce a hybrid fusion encoder designed to ensure the disentangled extraction of complementary and modality-specific information.
Common feature representations are transformed into a target latent space via the modality infuser to synthesize missing MR sequences.
arXiv Detail & Related papers (2024-06-21T08:06:00Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumor represents one of the most fatal cancers around the world, and is very common in children and the elderly.
We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z)
- HA-HI: Synergising fMRI and DTI through Hierarchical Alignments and Hierarchical Interactions for Mild Cognitive Impairment Diagnosis [10.028997265879598]
We introduce a novel Hierarchical Alignments and Hierarchical Interactions (HA-HI) method for diagnosis of mild cognitive impairment (MCI) and subjective cognitive decline (SCD).
HA-HI efficiently learns significant MCI- or SCD-related regional and connectivity features by aligning various feature types and hierarchically maximizing their interactions.
To enhance the interpretability of our approach, we have developed the Synergistic Activation Map (SAM) technique, revealing the critical brain regions and connections that are indicative of MCI/SCD.
arXiv Detail & Related papers (2024-01-02T12:46:02Z)
- Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Clasification [13.529183496842819]
We construct a deep learning architecture that takes as input 2D time series of rs-fMRI and 3D volumes T1w.
We show that our proposed MFFormer performs better than that using a single modality or multi-modality MRI on schizophrenia and bipolar disorder diagnosis.
arXiv Detail & Related papers (2023-10-04T10:02:04Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale.
Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z)
- Mapping individual differences in cortical architecture using multi-view representation learning [0.0]
We introduce a novel machine learning method that combines the activation- and connectivity-based information measured through task-fMRI and resting-state fMRI, respectively.
It uses a multi-view deep autoencoder designed to fuse the two fMRI modalities into a joint representation space, within which a predictive model is trained to estimate a scalar score that characterizes the patient.
arXiv Detail & Related papers (2020-04-01T09:01:25Z)
- Meta-modal Information Flow: A Method for Capturing Multimodal Modular Disconnectivity in Schizophrenia [11.100316178148994]
We introduce a method that takes advantage of multimodal data in addressing the hypotheses of disconnectivity and dysfunction within schizophrenia (SZ).
We propose a modularity-based method that can be applied to a Gaussian graphical model (GGM) to identify links that are associated with mental illness across a multimodal data set.
Through simulation and real data, we show our approach reveals important information about disease-related network disruptions that are missed with a focus on a single modality.
arXiv Detail & Related papers (2020-01-06T18:46:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.