Unified Cross-Modal Attention-Mixer Based Structural-Functional Connectomics Fusion for Neuropsychiatric Disorder Diagnosis
- URL: http://arxiv.org/abs/2505.15139v1
- Date: Wed, 21 May 2025 05:49:13 GMT
- Title: Unified Cross-Modal Attention-Mixer Based Structural-Functional Connectomics Fusion for Neuropsychiatric Disorder Diagnosis
- Authors: Badhan Mazumder, Lei Wu, Vince D. Calhoun, Dong Hye Ye,
- Abstract summary: ConneX is a multimodal fusion method that integrates cross-attention mechanism and multilayer perceptron (MLP)-Mixer for refined feature fusion.<n>We show improved performance on two distinct clinical datasets, highlighting the robustness of our proposed framework.
- Score: 17.40353435750778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gaining insights into the structural and functional mechanisms of the brain has been a longstanding focus in neuroscience research, particularly in the context of understanding and treating neuropsychiatric disorders such as Schizophrenia (SZ). Nevertheless, most of the traditional multimodal deep learning approaches fail to fully leverage the complementary characteristics of structural and functional connectomics data to enhance diagnostic performance. To address this issue, we proposed ConneX, a multimodal fusion method that integrates cross-attention mechanism and multilayer perceptron (MLP)-Mixer for refined feature fusion. Modality-specific backbone graph neural networks (GNNs) were firstly employed to obtain feature representation for each modality. A unified cross-modal attention network was then introduced to fuse these embeddings by capturing intra- and inter-modal interactions, while MLP-Mixer layers refined global and local features, leveraging higher-order dependencies for end-to-end classification with a multi-head joint loss. Extensive evaluations demonstrated improved performance on two distinct clinical datasets, highlighting the robustness of our proposed framework.
Related papers
- NEARL-CLIP: Interacted Query Adaptation with Orthogonal Regularization for Medical Vision-Language Understanding [51.63264715941068]
textbfNEARL-CLIP (iunderlineNteracted quunderlineEry underlineAdaptation with ounderlineRthogonaunderlineL Regularization) is a novel cross-modality interaction VLM-based framework.
arXiv Detail & Related papers (2025-08-06T05:44:01Z) - MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [52.106879463828044]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease.<n>We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention.<n>Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z) - Integrated Brain Connectivity Analysis with fMRI, DTI, and sMRI Powered by Interpretable Graph Neural Networks [17.063133885403154]
We integrate functional magnetic resonance imaging, diffusion tensor imaging, and structural MRI into a cohesive framework.<n>Our approach incorporates a masking strategy to differentially weight neural connections, thereby facilitating a holistic amalgamation of multimodal imaging data.<n>The model is applied to the Human Connectome Project's Development study to elucidate the associations between multimodal imaging and cognitive functions throughout youth.
arXiv Detail & Related papers (2024-08-26T13:16:42Z) - Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks [8.703026708558157]
A novel graph attention network based on sparse interaction mechanism (Multi- SIGATnet) was proposed forSchizophrenia classification.
The effectiveness of the model is verified on the Center for Biomedical Research Excellence (COBRE) and University of California Los Angeles (UCLA) datasets.
arXiv Detail & Related papers (2024-08-25T13:15:55Z) - Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE)
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z) - An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis [46.58592655409785]
We propose a novel Cross-Attentive Multi-modal Fusion framework (CAMF) to capture both intra-modal and inter-modal relationships between fMRI and sMRI.
Our approach significantly improves classification accuracy, as demonstrated by our evaluations on two extensive multi-modal brain imaging datasets.
The gradient-guided Score-CAM is applied to interpret critical functional networks and brain regions involved in schizophrenia.
arXiv Detail & Related papers (2024-03-29T20:32:30Z) - HA-HI: Synergising fMRI and DTI through Hierarchical Alignments and
Hierarchical Interactions for Mild Cognitive Impairment Diagnosis [10.028997265879598]
We introduce a novel Hierarchical Alignments and Hierarchical Interactions (HA-HI) method for diagnosis of mild cognitive impairment (MCI) and subjective cognitive decline (SCD)
HA-HI efficiently learns significant MCI- or SCD- related regional and connectivity features by aligning various feature types and hierarchically maximizing their interactions.
To enhance the interpretability of our approach, we have developed the Synergistic Activation Map (SAM) technique, revealing the critical brain regions and connections that are indicative of MCI/SCD.
arXiv Detail & Related papers (2024-01-02T12:46:02Z) - Fusing Structural and Functional Connectivities using Disentangled VAE
for Detecting MCI [9.916963496386089]
A novel hierarchical structural-functional connectivity fusing (HSCF) model is proposed to construct brain structural-functional connectivity matrices.
Results from a wide range of tests performed on the public Alzheimer's Disease Neuroimaging Initiative database show that the proposed model performs better than competing approaches.
arXiv Detail & Related papers (2023-06-16T05:22:25Z) - Hierarchical Graph Convolutional Network Built by Multiscale Atlases for
Brain Disorder Diagnosis Using Functional Connectivity [48.75665245214903]
We propose a novel framework to perform multiscale FCN analysis for brain disorder diagnosis.
We first use a set of well-defined multiscale atlases to compute multiscale FCNs.
Then, we utilize biologically meaningful brain hierarchical relationships among the regions in multiscale atlases to perform nodal pooling.
arXiv Detail & Related papers (2022-09-22T04:17:57Z) - MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network
Architecture for Medical Image Analysis [71.2022403915147]
We introduce MEDUSA, a multi-scale encoder-decoder self-attention mechanism tailored for medical image analysis.
We obtain state-of-the-art performance on challenging medical image analysis benchmarks including COVIDx, RSNA RICORD, and RSNA Pneumonia Challenge.
arXiv Detail & Related papers (2021-10-12T15:05:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.