Enhancing WSI-Based Survival Analysis with Report-Auxiliary Self-Distillation
- URL: http://arxiv.org/abs/2509.15608v1
- Date: Fri, 19 Sep 2025 05:14:19 GMT
- Title: Enhancing WSI-Based Survival Analysis with Report-Auxiliary Self-Distillation
- Authors: Zheng Wang, Hong Liu, Zheng Wang, Danyi Li, Min Cen, Baptiste Magnier, Li Liang, Liansheng Wang
- Abstract summary: This paper proposes a novel Report-auxiliary self-distillation (Rasa) framework for WSI-based survival analysis. Advanced large language models (LLMs) are utilized to extract fine-grained, WSI-relevant textual descriptions from pathology reports. Next, a self-distillation-based pipeline is designed to filter out irrelevant or redundant WSI features for the student model.
- Score: 26.607553380775908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Survival analysis based on Whole Slide Images (WSIs) is crucial for evaluating cancer prognosis, as they offer detailed microscopic information essential for predicting patient outcomes. However, traditional WSI-based survival analysis usually faces noisy features and limited data accessibility, hindering their ability to capture critical prognostic features effectively. Although pathology reports provide rich patient-specific information that could assist analysis, their potential to enhance WSI-based survival analysis remains largely unexplored. To this end, this paper proposes a novel Report-auxiliary self-distillation (Rasa) framework for WSI-based survival analysis. First, advanced large language models (LLMs) are utilized to extract fine-grained, WSI-relevant textual descriptions from original noisy pathology reports via a carefully designed task prompt. Next, a self-distillation-based pipeline is designed to filter out irrelevant or redundant WSI features for the student model under the guidance of the teacher model's textual knowledge. Finally, a risk-aware mix-up strategy is incorporated during the training of the student model to enhance both the quantity and diversity of the training data. Extensive experiments carried out on our collected data (CRC) and public data (TCGA-BRCA) demonstrate the superior effectiveness of Rasa against state-of-the-art methods. Our code is available at https://github.com/zhengwang9/Rasa.
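The abstract names a risk-aware mix-up strategy but does not spell out how it works. The following is only a minimal sketch of one plausible reading, not the authors' implementation: it applies standard mixup (a convex combination of two samples) and restricts pairing to samples in the same coarse risk group. The function name `risk_aware_mixup`, the rank-based time bins used as risk groups, the same-event-status pairing rule, and the `n_bins` and `alpha` parameters are all assumptions introduced for illustration.

```python
import random


def risk_aware_mixup(features, times, events, n_bins=4, alpha=0.4, seed=0):
    """Return mixed (feature_vector, time, event) triples.

    Hypothetical sketch: each sample is mixed with a random partner that
    falls in the same follow-up-time bin and has the same event status.
    """
    rng = random.Random(seed)
    n = len(times)

    # Rank-based bins over follow-up time act as coarse risk groups (assumption).
    order = sorted(range(n), key=lambda i: times[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // n

    mixed = []
    for i in range(n):
        # Candidate partners share the risk bin and event status of sample i.
        partners = [
            j for j in range(n)
            if j != i and bins[j] == bins[i] and events[j] == events[i]
        ]
        if not partners:
            continue  # no compatible partner, so no synthetic sample
        j = rng.choice(partners)
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        x = [lam * a + (1 - lam) * b for a, b in zip(features[i], features[j])]
        t = lam * times[i] + (1 - lam) * times[j]
        mixed.append((x, t, events[i]))
    return mixed


if __name__ == "__main__":
    feats = [[float(i), float(2 * i)] for i in range(8)]
    times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    events = [1, 1, 1, 1, 1, 1, 1, 1]
    for x, t, e in risk_aware_mixup(feats, times, events, n_bins=2):
        print(x, round(t, 3), e)
```

Pairing within a bin keeps each synthetic follow-up time inside the range of its risk group, and requiring matching event status avoids having to define a censoring label for a mixed sample. The paper may handle both points differently.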
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z) - Methodology for Comparing Machine Learning Algorithms for Survival Analysis [55.65997641180011]
Six machine learning models for survival analysis were evaluated. XGB-AFT achieved the best performance (C-Index = 0.7618; IPCW = 0.7532), followed by GBSA and RSF.
arXiv Detail & Related papers (2025-10-28T14:42:28Z) - Analytical Survey of Learning with Low-Resource Data: From Analysis to Investigation [192.53529928861818]
Learning with high-resource data has demonstrated substantial success in artificial intelligence (AI). However, the costs associated with data annotation and model training remain significant. This survey employs active sampling theory to analyze the generalization error and label complexity associated with learning from low-resource data.
arXiv Detail & Related papers (2025-10-10T03:15:42Z) - PISA: An AI Pipeline for Interpretable-by-design Survival Analysis Providing Multiple Complexity-Accuracy Trade-off Models [0.9851812512860351]
Survival analysis is central to clinical research, informing patient prognoses, guiding treatment decisions, and optimising resource allocation. For these models to be relevant in healthcare, predictions must be traceable to patient-specific characteristics. Traditional survival models often fail to capture non-linear interactions, while modern deep learning approaches are limited by poor interpretability. We propose a Pipeline for Interpretable Survival Analysis (PISA), a pipeline that provides multiple survival analysis models that trade off complexity and performance.
arXiv Detail & Related papers (2025-09-13T18:09:14Z) - SigBERT: Combining Narrative Medical Reports and Rough Path Signature Theory for Survival Risk Estimation in Oncology [1.5425688173297465]
SigBERT is an innovative temporal survival analysis framework designed to process a large number of clinical reports per patient. It processes timestamped medical reports by extracting and averaging word embeddings into sentence embeddings. It was trained and evaluated on a real-world oncology dataset from the Léon Bérard Center corpus, with a C-index score of 0.75 (sd 0.014) on the independent test cohort.
arXiv Detail & Related papers (2025-07-25T12:33:25Z) - Deep Survival Analysis in Multimodal Medical Data: A Parametric and Probabilistic Approach with Competing Risks [47.19194118883552]
We introduce a multimodal deep learning framework for survival analysis capable of modeling both single and competing risks scenarios. We propose SAMVAE (Survival Analysis Multimodal Variational Autoencoder), a novel deep learning architecture designed for survival prediction.
arXiv Detail & Related papers (2025-07-10T14:29:48Z) - Graph-Convolutional-Beta-VAE for Synthetic Abdominal Aorta Aneurysm Generation [4.363232795241618]
This study presents a beta-Variational Autoencoder Graph Convolutional Neural Network framework for generating synthetic Abdominal Aorta Aneurysms (AAA). Our approach extracts key anatomical features and captures complex statistical relationships within a compact disentangled latent space. The resulting synthetic AAA dataset preserves patient privacy while providing a scalable foundation for medical research, device testing, and computational modeling.
arXiv Detail & Related papers (2025-06-16T15:55:56Z) - Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation [1.1497371646067622]
Whole Slide Images (WSIs) play a crucial role in accurate cancer diagnosis and prognosis. Given that WSIs are gigapixels in size, they present difficulties in terms of storage, processing, and model training. We introduce ADaFGrad, a method designed to enhance lifelong learning for whole-slide image (WSI) analysis.
arXiv Detail & Related papers (2025-05-04T04:46:08Z) - MIL vs. Aggregation: Evaluating Patient-Level Survival Prediction Strategies Using Graph-Based Learning [52.231128973251124]
We compare various strategies for predicting survival at the WSI and patient level. The former treats each WSI as an independent sample, mimicking the strategy adopted in other works. The latter comprises methods to either aggregate the predictions of the several WSIs or automatically identify the most relevant slide.
arXiv Detail & Related papers (2025-03-29T11:14:02Z) - Self-Explaining Hypergraph Neural Networks for Diagnosis Prediction [45.89562183034469]
Existing deep learning diagnosis prediction models with intrinsic interpretability often assign attention weights to every past diagnosis or hospital visit. We introduce SHy, a self-explaining hypergraph neural network model, designed to offer personalized, concise and faithful explanations. SHy captures higher-order disease interactions and extracts distinct temporal phenotypes as personalized explanations.
arXiv Detail & Related papers (2025-02-15T06:33:02Z) - Interpretable Survival Analysis for Heart Failure Risk Prediction [50.64739292687567]
We propose a novel survival analysis pipeline that is both interpretable and competitive with state-of-the-art survival models.
Our pipeline achieves state-of-the-art performance and provides interesting and novel insights about risk factors for heart failure.
arXiv Detail & Related papers (2023-10-24T02:56:05Z) - Explainable Censored Learning: Finding Critical Features with Long Term Prognostic Values for Survival Prediction [28.943631598055926]
We introduce a novel, easily deployable approach, called EXplainable CEnsored Learning (EXCEL), to iteratively exploit critical variables.
We show that EXCEL can effectively identify critical features and achieve performance on par with or better than the original models.
arXiv Detail & Related papers (2022-09-30T12:56:29Z) - FedPseudo: Pseudo value-based Deep Learning Models for Federated Survival Analysis [9.659041001051415]
We propose a first-of-its-kind, pseudo value-based deep learning model for federated survival analysis called FedPseudo.
Our proposed FL framework achieves similar performance as the best centrally trained deep survival analysis model.
arXiv Detail & Related papers (2022-07-12T01:10:36Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.