Related papers: An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis

URL: http://arxiv.org/abs/2404.00144v1
Date: Fri, 29 Mar 2024 20:32:30 GMT
Title: An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis
Authors: Ziyu Zhou, Anton Orlichenko, Gang Qu, Zening Fu, Vince D Calhoun, Zhengming Ding, Yu-Ping Wang
Abstract summary: We propose a novel Cross-Attentive Multi-modal Fusion framework (CAMF) to capture both intra-modal and inter-modal relationships between fMRI and sMRI. Our approach significantly improves classification accuracy, as demonstrated by our evaluations on two extensive multi-modal brain imaging datasets. The gradient-guided Score-CAM is applied to interpret critical functional networks and brain regions involved in schizophrenia.
Score: 46.58592655409785
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Both functional and structural magnetic resonance imaging (fMRI and sMRI) are widely used for the diagnosis of mental disorder. However, combining complementary information from these two modalities is challenging due to their heterogeneity. Many existing methods fall short of capturing the interaction between these modalities, frequently defaulting to a simple combination of latent features. In this paper, we propose a novel Cross-Attentive Multi-modal Fusion framework (CAMF), which aims to capture both intra-modal and inter-modal relationships between fMRI and sMRI, enhancing multi-modal data representation. Specifically, our CAMF framework employs self-attention modules to identify interactions within each modality while cross-attention modules identify interactions between modalities. Subsequently, our approach optimizes the integration of latent features from both modalities. This approach significantly improves classification accuracy, as demonstrated by our evaluations on two extensive multi-modal brain imaging datasets, where CAMF consistently outperforms existing methods. Furthermore, the gradient-guided Score-CAM is applied to interpret critical functional networks and brain regions involved in schizophrenia. The bio-markers identified by CAMF align with established research, potentially offering new insights into the diagnosis and pathological endophenotypes of schizophrenia.
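The abstract describes self-attention within each modality followed by cross-attention between modalities, then integration of the resulting latent features. The sketch below illustrates that general pattern in NumPy; it is not the authors' CAMF implementation. The token counts, feature width, random inputs, mean pooling, and the `fuse` helper are all assumptions made for illustration, and the learned projection matrices a real model would use are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

def fuse(fmri_tokens, smri_tokens):
    """Hypothetical fusion step: self-attention, then cross-attention."""
    # Intra-modal: each modality attends to its own tokens.
    f_self, _ = attention(fmri_tokens, fmri_tokens, fmri_tokens)
    s_self, _ = attention(smri_tokens, smri_tokens, smri_tokens)
    # Inter-modal: fMRI queries attend to sMRI keys/values, and vice versa.
    f_cross, w_fs = attention(f_self, s_self, s_self)
    s_cross, w_sf = attention(s_self, f_self, f_self)
    # Assumed integration: mean-pool each stream and concatenate.
    fused = np.concatenate([f_cross.mean(axis=0), s_cross.mean(axis=0)])
    return fused, w_fs, w_sf

# Assumed toy shapes: 5 fMRI tokens and 7 sMRI tokens, 8 features each.
rng = np.random.default_rng(0)
fmri_tokens = rng.normal(size=(5, 8))
smri_tokens = rng.normal(size=(7, 8))

fused, w_fs, w_sf = fuse(fmri_tokens, smri_tokens)
print(fused.shape, w_fs.shape, w_sf.shape)  # (16,) (5, 7) (7, 5)
```

Each row of `w_fs` is a distribution over sMRI tokens for one fMRI token, which is the kind of inter-modal weighting that a simple concatenation of latent features cannot express.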

Related papers

NEARL-CLIP: Interacted Query Adaptation with Orthogonal Regularization for Medical Vision-Language Understanding [51.63264715941068]
NEARL-CLIP (iNteracted quEry Adaptation with oRthogonaL Regularization) is a novel cross-modality interaction VLM-based framework.
arXiv Detail & Related papers (2025-08-06T05:44:01Z)
Unified Cross-Modal Attention-Mixer Based Structural-Functional Connectomics Fusion for Neuropsychiatric Disorder Diagnosis [17.40353435750778]
ConneX is a multimodal fusion method that integrates a cross-attention mechanism and a multilayer perceptron (MLP)-Mixer for refined feature fusion. We show improved performance on two distinct clinical datasets, highlighting the robustness of our proposed framework.
arXiv Detail & Related papers (2025-05-21T05:49:13Z)
4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis [24.771496672135395]
We propose M2M-AlignNet: a geometry-aware co-attention network with latent alignment for early Alzheimer's diagnosis. At the core of our approach is a multi-patch-to-multi-patch (M2M) contrastive loss function that quantifies and reduces representational discrepancies. We conduct extensive experiments to confirm the effectiveness of our method and highlight the correspondence between fMRI and sMRI as AD biomarkers.
arXiv Detail & Related papers (2025-04-23T15:18:55Z)
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [52.106879463828044]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease. We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention. Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z)
Integrated Brain Connectivity Analysis with fMRI, DTI, and sMRI Powered by Interpretable Graph Neural Networks [17.063133885403154]
We integrate functional magnetic resonance imaging, diffusion tensor imaging, and structural MRI into a cohesive framework. Our approach incorporates a masking strategy to differentially weight neural connections, thereby facilitating a holistic amalgamation of multimodal imaging data. The model is applied to the Human Connectome Project's Development study to elucidate the associations between multimodal imaging and cognitive functions throughout youth.
arXiv Detail & Related papers (2024-08-26T13:16:42Z)
MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce a novel semantic alignment method for multi-subject fMRI signals, called MindFormer. This model is specifically designed to generate fMRI-conditioned feature vectors that can be used to condition a Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation. Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z)
Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model. We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems. This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading [47.50733518140625]
Brain tumor represents one of the most fatal cancers around the world, and is very common in children and the elderly. We propose a novel cross-modality guidance-aided multi-modal learning with dual attention for addressing the task of MRI brain tumor grading.
arXiv Detail & Related papers (2024-01-17T07:54:49Z)
HA-HI: Synergising fMRI and DTI through Hierarchical Alignments and Hierarchical Interactions for Mild Cognitive Impairment Diagnosis [10.028997265879598]
We introduce a novel Hierarchical Alignments and Hierarchical Interactions (HA-HI) method for diagnosis of mild cognitive impairment (MCI) and subjective cognitive decline (SCD). HA-HI efficiently learns significant MCI- or SCD-related regional and connectivity features by aligning various feature types and hierarchically maximizing their interactions. To enhance the interpretability of our approach, we have developed the Synergistic Activation Map (SAM) technique, revealing the critical brain regions and connections that are indicative of MCI/SCD.
arXiv Detail & Related papers (2024-01-02T12:46:02Z)
Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Clasification [13.529183496842819]
We construct a deep learning architecture that takes as input 2D time series of rs-fMRI and 3D T1w volumes. We show that our proposed MFFormer performs better than methods using a single modality or multi-modality MRI on schizophrenia and bipolar disorder diagnosis.
arXiv Detail & Related papers (2023-10-04T10:02:04Z)
Cross-Modality Deep Feature Learning for Brain Tumor Segmentation [158.8192041981564]
This paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data. The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale. Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance.
arXiv Detail & Related papers (2022-01-07T07:46:01Z)
Mapping individual differences in cortical architecture using multi-view representation learning [0.0]
We introduce a novel machine learning method that combines the activation- and connectivity-based information measured through task-fMRI and resting-state fMRI, respectively. It uses a multi-view deep autoencoder to fuse the two fMRI modalities into a joint representation space, within which a predictive model is trained to predict a scalar score that characterizes the patient.
arXiv Detail & Related papers (2020-04-01T09:01:25Z)
Meta-modal Information Flow: A Method for Capturing Multimodal Modular Disconnectivity in Schizophrenia [11.100316178148994]
We introduce a method that takes advantage of multimodal data in addressing the hypotheses of disconnectivity and dysfunction within schizophrenia (SZ). We propose a modularity-based method that can be applied to the Gaussian graphical model (GGM) to identify links that are associated with mental illness across a multimodal data set. Through simulation and real data, we show our approach reveals important information about disease-related network disruptions that are missed with a focus on a single modality.
arXiv Detail & Related papers (2020-01-06T18:46:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.