ME-Mamba: Multi-Expert Mamba with Efficient Knowledge Capture and Fusion for Multimodal Survival Analysis
- URL: http://arxiv.org/abs/2509.16900v1
- Date: Sun, 21 Sep 2025 03:23:04 GMT
- Title: ME-Mamba: Multi-Expert Mamba with Efficient Knowledge Capture and Fusion for Multimodal Survival Analysis
- Authors: Chengsheng Zhang, Linhao Qu, Xiaoyu Liu, Zhijian Song
- Abstract summary: Multimodal survival analysis integrating pathology images and genomics data has emerged as a promising approach. We propose a Multi-Expert Mamba system that captures discriminative pathological and genomic features. Our method achieves stable and accurate survival analysis with relatively low computational complexity.
- Score: 18.458319783609163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Survival analysis using whole-slide images (WSIs) is crucial in cancer research. Despite significant successes, pathology images typically only provide slide-level labels, which hinders the learning of discriminative representations from gigapixel WSIs. With the rapid advancement of high-throughput sequencing technologies, multimodal survival analysis integrating pathology images and genomics data has emerged as a promising approach. We propose a Multi-Expert Mamba (ME-Mamba) system that captures discriminative pathological and genomic features while enabling efficient integration of both modalities. This approach achieves complementary information fusion without losing critical information from individual modalities, thereby facilitating accurate cancer survival analysis. Specifically, we first introduce a Pathology Expert and a Genomics Expert to process unimodal data separately. Both experts are designed with Mamba architectures that incorporate conventional scanning and attention-based scanning mechanisms, allowing them to extract discriminative features from long instance sequences containing substantial redundant or irrelevant information. Second, we design a Synergistic Expert responsible for modality fusion. It explicitly learns token-level local correspondences between the two modalities via Optimal Transport, and implicitly enhances distribution consistency through a global cross-modal fusion loss based on Maximum Mean Discrepancy. The fused feature representations are then passed to a mamba backbone for further integration. Through the collaboration of the Pathology Expert, Genomics Expert, and Synergistic Expert, our method achieves stable and accurate survival analysis with relatively low computational complexity. Extensive experimental results on five datasets in The Cancer Genome Atlas (TCGA) demonstrate our state-of-the-art performance.
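The Synergistic Expert described in the abstract combines two standard ingredients: an entropic (Sinkhorn-style) approximation of Optimal Transport for token-level correspondences, and a Maximum Mean Discrepancy loss for global cross-modal distribution consistency. The sketch below is a minimal NumPy illustration of those two components, not the authors' implementation; the function names, the RBF kernel choice, the uniform marginals, and the regularization strength are all assumptions.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of x and y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd_loss(p, g, sigma=1.0):
    # Squared Maximum Mean Discrepancy between two token sets
    # (e.g. pathology tokens p and genomics tokens g).
    return (rbf_kernel(p, p, sigma).mean()
            + rbf_kernel(g, g, sigma).mean()
            - 2.0 * rbf_kernel(p, g, sigma).mean())

def sinkhorn_plan(cost, eps=1.0, n_iters=200):
    # Entropic-regularized Optimal Transport plan with uniform
    # marginals, computed via Sinkhorn iterations. Rows/columns of
    # `cost` index tokens of the two modalities.
    n, m = cost.shape
    a, b = np.ones(n) / n, np.ones(m) / m
    K = np.exp(-cost / eps)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]
```

In a full model these would operate on the token embeddings produced by the Pathology and Genomics Experts, with the transport plan guiding local fusion and the MMD term added to the training objective.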
Related papers
- SurvAgent: Hierarchical CoT-Enhanced Case Banking and Dichotomy-Based Multi-Agent System for Multimodal Survival Prediction [49.355973075150075]
We introduce SurvAgent, the first hierarchical chain-of-thought (CoT)-enhanced multi-agent system for multimodal survival prediction. SurvAgent consists of two stages: WSI-Gene CoT-Enhanced Case Bank Construction employs hierarchical analysis through Low-Magnification Screening, Cross-Modal Similarity-Aware Patch Mining, and Confidence-Aware Patch Mining for pathology images. Dichotomy-Based Multi-Expert Agent Inference retrieves similar cases via RAG and integrates multimodal reports with expert predictions through progressive interval refinement.
arXiv Detail & Related papers (2025-11-20T18:41:44Z) - impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy. It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches. Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
arXiv Detail & Related papers (2025-08-08T10:01:16Z) - Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder [18.519138120118125]
We propose a Conditional Latent Differentiation Variational AutoEncoder (LD-CVAE) for robust multimodal survival prediction. Specifically, a Variational Information Bottleneck Transformer (VIB-Trans) module is proposed to learn compressed pathological representations from the gigapixel WSIs. We develop a novel Latent Differentiation Variational AutoEncoder (LD-VAE) to learn the common and specific posteriors for the genomic embeddings with diverse functions.
arXiv Detail & Related papers (2025-03-12T15:58:37Z) - Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology [6.418265127069878]
We propose the use of omic embeddings during early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnostics.
arXiv Detail & Related papers (2024-11-26T13:25:53Z) - Personalized 2D Binary Patient Codes of Tissue Images and Immunogenomic Data Through Multimodal Self-Supervised Fusion [0.9374652839580183]
MarbliX is an innovative framework that integrates histopathology images with immunogenomic sequencing data, encapsulating them into a concise binary patient code.
The experimental results demonstrate the potential of MarbliX to empower healthcare professionals with in-depth insights.
arXiv Detail & Related papers (2024-09-19T22:49:27Z) - Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images [10.996711454572331]
Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis.
Existing multimodal methods often rely on alignment strategies to integrate complementary information.
We propose a Multimodal Cross-Task Interaction (MCTI) framework to explore the intrinsic correlations between subtype classification and survival analysis tasks.
arXiv Detail & Related papers (2024-06-25T02:18:35Z) - MoME: Mixture of Multimodal Experts for Cancer Survival Prediction [46.520971457396726]
Survival analysis, as a challenging task, requires integrating Whole Slide Images (WSIs) and genomic data for comprehensive decision-making.
Previous approaches utilize co-attention methods, which fuse features from both modalities only once after separate encoding.
We propose a Biased Progressive Encoding (BPE) paradigm, performing encoding and fusion simultaneously.
arXiv Detail & Related papers (2024-06-14T03:44:33Z) - Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
But acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - Multimodal Optimal Transport-based Co-Attention Transformer with Global Structure Consistency for Survival Prediction [5.445390550440809]
Survival prediction is a complicated ordinal regression task that aims to predict the ranking risk of death.
Due to the large size of pathological images, it is difficult to effectively represent the gigapixel whole slide images (WSIs).
Interactions within tumor microenvironment (TME) in histology are essential for survival analysis.
arXiv Detail & Related papers (2023-06-14T08:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.