SurvAgent: Hierarchical CoT-Enhanced Case Banking and Dichotomy-Based Multi-Agent System for Multimodal Survival Prediction
- URL: http://arxiv.org/abs/2511.16635v1
- Date: Thu, 20 Nov 2025 18:41:44 GMT
- Title: SurvAgent: Hierarchical CoT-Enhanced Case Banking and Dichotomy-Based Multi-Agent System for Multimodal Survival Prediction
- Authors: Guolin Huang, Wenting Chen, Jiaqi Yang, Xinheng Lyu, Xiaoling Luo, Sen Yang, Xiaohan Xing, Linlin Shen,
- Abstract summary: We introduce SurvAgent, the first hierarchical chain-of-thought (CoT)-enhanced multi-agent system for multimodal survival prediction. SurvAgent consists of two stages: WSI-Gene CoT-Enhanced Case Bank Construction employs hierarchical analysis through Low-Magnification Screening, Cross-Modal Similarity-Aware Patch Mining, and Confidence-Aware Patch Mining for pathology images. Dichotomy-Based Multi-Expert Agent Inference retrieves similar cases via RAG and integrates multimodal reports with expert predictions through progressive interval refinement.
- Score: 49.355973075150075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Survival analysis is critical for cancer prognosis and treatment planning, yet existing methods lack the transparency essential for clinical adoption. While recent pathology agents have demonstrated explainability in diagnostic tasks, they face three limitations for survival prediction: inability to integrate multimodal data, ineffective region-of-interest exploration, and failure to leverage experiential learning from historical cases. We introduce SurvAgent, the first hierarchical chain-of-thought (CoT)-enhanced multi-agent system for multimodal survival prediction. SurvAgent consists of two stages: (1) WSI-Gene CoT-Enhanced Case Bank Construction employs hierarchical analysis through Low-Magnification Screening, Cross-Modal Similarity-Aware Patch Mining, and Confidence-Aware Patch Mining for pathology images, while Gene-Stratified analysis processes six functional gene categories. Both generate structured reports with CoT reasoning, storing complete analytical processes for experiential learning. (2) Dichotomy-Based Multi-Expert Agent Inference retrieves similar cases via RAG and integrates multimodal reports with expert predictions through progressive interval refinement. Extensive experiments on five TCGA cohorts demonstrate SurvAgent's superiority over conventional methods, proprietary MLLMs, and medical agents, establishing a new paradigm for explainable AI-driven survival prediction in precision oncology.
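The dichotomy-based inference in stage (2) is the most distinctive piece of the pipeline. Below is a minimal, hypothetical sketch of how case-bank retrieval and progressive interval refinement could be wired together; the `CaseReport` structure, the cosine-similarity retrieval, the 120-month starting interval, and the majority-vote rule are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class CaseReport:
    embedding: List[float]   # joint WSI + gene embedding stored during case banking (assumed)
    cot_report: str          # structured report with chain-of-thought reasoning
    survival_months: float   # observed outcome kept for experiential learning

def retrieve_similar_cases(query_emb: List[float], case_bank: List[CaseReport], k: int = 5) -> List[CaseReport]:
    """RAG step: cosine-similarity top-k retrieval from the case bank (similarity metric assumed)."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        return dot / max(norm, 1e-8)
    return sorted(case_bank, key=lambda c: cosine(query_emb, c.embedding), reverse=True)[:k]

def dichotomy_refinement(
    experts: List[Callable[[str, float], bool]],
    context: str,
    interval: Tuple[float, float] = (0.0, 120.0),
    rounds: int = 5,
) -> Tuple[float, float]:
    """Progressively halve a survival-time interval by majority vote of expert agents.

    Each expert sees the multimodal context (WSI/gene reports plus retrieved similar
    cases) and the current midpoint in months, and answers True if it expects the
    patient to survive beyond that midpoint.
    """
    lo, hi = interval
    for _ in range(rounds):
        mid = (lo + hi) / 2.0
        votes_beyond = sum(1 for expert in experts if expert(context, mid))
        if votes_beyond * 2 > len(experts):
            lo = mid   # majority expects survival beyond the midpoint: keep the upper half
        else:
            hi = mid   # otherwise keep the lower half
    return lo, hi
```

In the real system each expert would presumably be an LLM call conditioned on the retrieved CoT reports; they are plain callables here so the control flow stays visible.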
Related papers
- MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning [53.37068897861388]
MedSAM-Agent is a framework that reformulates interactive segmentation as a multi-step autonomous decision-making process. We develop a two-stage training pipeline that integrates multi-turn, end-to-end outcome verification. Experiments across 6 medical modalities and 21 datasets demonstrate that MedSAM-Agent achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-02-03T09:47:49Z)
- Post-TIPS Prediction via Multimodal Interaction: A Multi-Center Dataset and Framework for Survival, Complication, and Portal Pressure Assessment [23.819390753301136]
Transjugular intrahepatic portosystemic shunt (TIPS) is an established procedure for portal hypertension, but provides variable survival and frequent overt hepatic encephalopathy (OHE). Current studies typically build machine learning models from preoperative CT images or clinical characteristics. We present MultiTIPS, the first public multi-center dataset for TIPS prognosis, and propose a novel multimodal prognostic framework based on it.
arXiv Detail & Related papers (2025-10-12T06:11:59Z)
- MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation [55.37355146924576]
MedSeqFT is a sequential fine-tuning framework for medical image analysis. It adapts pre-trained models to new tasks while refining their representational capacity. It consistently outperforms state-of-the-art fine-tuning strategies.
arXiv Detail & Related papers (2025-09-07T15:22:53Z)
- impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy. It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches. Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
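The core mechanism summarized above, masking patch tokens across modalities and reconstructing them so that absent modalities can be imputed, can be sketched in a few lines. The encoder/decoder sizes, the mask ratio, the BERT-style mask-token substitution, and the loss restricted to masked positions are generic masked-autoencoder assumptions, not impuTMAE's actual architecture.

```python
import torch
import torch.nn as nn

class MaskedMultimodalAE(nn.Module):
    """Toy masked autoencoder over patch tokens pooled from several modalities.

    Tokens from all modalities are concatenated, a random subset is replaced by a
    learned mask token, and the decoder reconstructs the masked tokens; a missing
    modality can be treated as fully masked, which is what enables imputation.
    """
    def __init__(self, dim=256, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.decoder = nn.TransformerEncoder(layer, n_layers)  # same config for brevity
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))

    def forward(self, tokens, mask_ratio=0.5):
        b, n, d = tokens.shape
        mask = torch.rand(b, n, device=tokens.device) < mask_ratio          # True = masked
        visible = torch.where(mask.unsqueeze(-1), self.mask_token.expand(b, n, d), tokens)
        recon = self.decoder(self.encoder(visible))
        loss = ((recon - tokens) ** 2)[mask].mean()                         # loss on masked positions only
        return loss, recon
```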
arXiv Detail & Related papers (2025-08-08T10:01:16Z)
- Deep Survival Analysis in Multimodal Medical Data: A Parametric and Probabilistic Approach with Competing Risks [47.19194118883552]
We introduce a multimodal deep learning framework for survival analysis capable of modeling both single and competing risks scenarios. We propose SAMVAE (Survival Analysis Multimodal Variational Autoencoder), a novel deep learning architecture designed for survival prediction.
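For readers unfamiliar with the competing-risks setting mentioned above, the standard cause-specific formulation is summarized below; this is the textbook definition, since the summary does not give SAMVAE's exact parametric form.

```latex
% Cause-specific hazard for event type k, overall survival, and cumulative incidence.
% Standard competing-risks formulation, not SAMVAE's specific parametrization.
\lambda_k(t \mid x) = \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\; K = k \mid T \ge t,\; x)}{\Delta t}, \qquad
S(t \mid x) = \exp\!\Big(-\sum_{k} \int_0^t \lambda_k(u \mid x)\, du\Big), \qquad
F_k(t \mid x) = \int_0^t \lambda_k(u \mid x)\, S(u \mid x)\, du .
```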
arXiv Detail & Related papers (2025-07-10T14:29:48Z)
- MurreNet: Modeling Holistic Multimodal Interactions Between Histopathology and Genomic Profiles for Survival Prediction [5.895727565919295]
This paper presents a Multimodal Representation Decoupling Network (MurreNet) to advance cancer survival analysis. MurreNet decomposes paired input data into modality-specific and modality-shared representations, thereby reducing redundancy between modalities. Experiments conducted on six TCGA cancer cohorts demonstrate that our MurreNet achieves state-of-the-art (SOTA) performance in survival prediction.
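A minimal sketch of the decoupling idea described above: separate encoders produce a shared and a specific embedding per modality, with a simple penalty discouraging overlap between them. The encoder sizes, the cosine-based alignment/orthogonality penalty, and the fusion step are assumptions for illustration; MurreNet's actual losses and architecture may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledEncoder(nn.Module):
    """Maps one modality (e.g., histology or gene features) to shared + specific embeddings."""
    def __init__(self, in_dim, emb_dim=128):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))
        self.specific = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))

    def forward(self, x):
        return self.shared(x), self.specific(x)

def decoupling_loss(shared_a, shared_b, specific_a, specific_b):
    """Pull the two modalities' shared embeddings together; push shared vs. specific apart."""
    align = 1.0 - F.cosine_similarity(shared_a, shared_b, dim=-1).mean()
    ortho = (F.cosine_similarity(shared_a, specific_a, dim=-1).abs().mean()
             + F.cosine_similarity(shared_b, specific_b, dim=-1).abs().mean())
    return align + ortho

# Example: decouple and fuse two modalities for a downstream survival head.
path_enc, gene_enc = DecoupledEncoder(1024), DecoupledEncoder(2048)
(sh_p, sp_p), (sh_g, sp_g) = path_enc(torch.randn(4, 1024)), gene_enc(torch.randn(4, 2048))
loss_dec = decoupling_loss(sh_p, sh_g, sp_p, sp_g)
fused = torch.cat([(sh_p + sh_g) / 2, sp_p, sp_g], dim=-1)  # input to a risk predictor
```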
arXiv Detail & Related papers (2025-07-07T11:26:29Z)
- EAGLE: Efficient Alignment of Generalized Latent Embeddings for Multimodal Survival Prediction with Interpretable Attribution Analysis [16.567468717846676]
Existing multimodal approaches suffer from simplistic fusion strategies, massive computational requirements, and a lack of interpretability, which are critical barriers to clinical adoption. We present Eagle, a novel deep learning framework that addresses these limitations through attention-based multimodal fusion with comprehensive attribution analysis. Eagle bridges the gap between advanced AI capabilities and practical healthcare deployment, offering a scalable solution for multimodal survival prediction.
arXiv Detail & Related papers (2025-06-12T03:56:13Z)
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
- TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction [7.646155781863875]
We propose a novel framework named Two-stream Transformer-based Multimodal Fusion Network for survival prediction (TTMFN). In TTMFN, we present a two-stream multimodal co-attention transformer module to take full advantage of the complex relationships between different modalities. The experiment results on four datasets from The Cancer Genome Atlas demonstrate that TTMFN can achieve the best performance or competitive results.
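The two-stream co-attention module mentioned above can be pictured as cross-attention running in both directions between a pathology token stream and a genomic token stream. The sketch below uses standard multi-head attention as a stand-in and is not TTMFN's published architecture; dimensions and pooling are illustrative.

```python
import torch
import torch.nn as nn

class TwoStreamCoAttention(nn.Module):
    """Bidirectional cross-attention between pathology and genomic token streams."""
    def __init__(self, dim=256, n_heads=4):
        super().__init__()
        self.path_to_gene = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.gene_to_path = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, path_tokens, gene_tokens):
        # Pathology stream attends to genomic tokens, and vice versa.
        path_ctx, _ = self.path_to_gene(path_tokens, gene_tokens, gene_tokens)
        gene_ctx, _ = self.gene_to_path(gene_tokens, path_tokens, path_tokens)
        # Pool each enriched stream and concatenate for a downstream risk predictor.
        return torch.cat([path_ctx.mean(dim=1), gene_ctx.mean(dim=1)], dim=-1)

# Example with random stand-in features: 50 WSI patch tokens and 6 gene-group tokens.
module = TwoStreamCoAttention()
fused = module(torch.randn(2, 50, 256), torch.randn(2, 6, 256))  # -> shape (2, 512)
```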
arXiv Detail & Related papers (2023-11-13T02:31:20Z)