Unsupervised risk factor identification across cancer types and data modalities via explainable artificial intelligence
- URL: http://arxiv.org/abs/2506.12944v3
- Date: Tue, 29 Jul 2025 10:40:05 GMT
- Title: Unsupervised risk factor identification across cancer types and data modalities via explainable artificial intelligence
- Authors: Maximilian Ferle, Jonas Ader, Thomas Wiemers, Nora Grieb, Adrian Lindenmeyer, Hans-Jonas Meyer, Thomas Neumuth, Markus Kreuz, Kristin Reiche, Maximilian Merz
- Abstract summary: We present a novel method for unsupervised machine learning that directly optimizes for survival heterogeneity across patient clusters. Our approach represents a novel methodology for training any neural network architecture on any data modality to identify prognostically distinct patient groups. This pan-cancer, model-agnostic approach represents a valuable advancement in clinical risk stratification.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Risk stratification is a key tool in clinical decision-making, yet current approaches often fail to translate sophisticated survival analysis into actionable clinical criteria. We present a novel method for unsupervised machine learning that directly optimizes for survival heterogeneity across patient clusters through a differentiable adaptation of the multivariate logrank statistic. Unlike most existing methods that rely on proxy metrics, our approach represents novel methodology for training any neural network architecture on any data modality to identify prognostically distinct patient groups. We thoroughly evaluate the method in simulation experiments and demonstrate its utility in practice by applying it to two distinct cancer types: analyzing laboratory parameters from multiple myeloma patients and computed tomography images from non-small cell lung cancer patients, identifying prognostically distinct patient subgroups with significantly different survival outcomes in both cases. Post-hoc explainability analyses uncover clinically meaningful features determining the group assignments which align well with established risk factors and thus lend strong weight to the method's utility. This pan-cancer, model-agnostic approach represents a valuable advancement in clinical risk stratification, enabling the discovery of novel prognostic signatures across diverse data types while providing interpretable results that promise to complement treatment personalization and clinical decision-making in oncology and beyond.
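As a rough illustration of what a "differentiable adaptation of the multivariate logrank statistic" could look like, the sketch below computes the standard multivariate logrank statistic but replaces hard group labels with soft membership weights (for example, softmax outputs of a network). This is not the authors' implementation: the function name `soft_logrank_statistic`, its arguments, and the choice to drop the last group when inverting the covariance are assumptions made for this example. It is written in NumPy for readability; for actual training the same arithmetic would have to be expressed in an automatic-differentiation framework.

```python
import numpy as np


def soft_logrank_statistic(times, events, probs):
    """Multivariate logrank statistic with soft group membership (illustrative).

    times:  (n,) follow-up times
    events: (n,) 1 if the event was observed, 0 if censored
    probs:  (n, K) membership weights per patient, each row summing to 1
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=float)
    probs = np.asarray(probs, dtype=float)

    event_times = np.unique(times[events == 1])
    # at_risk[j, i] = 1 if patient i is still under observation at event time j
    at_risk = (times[None, :] >= event_times[:, None]).astype(float)
    # died[j, i] = 1 if patient i has an observed event exactly at event time j
    died = (times[None, :] == event_times[:, None]) * events[None, :]

    n_kj = at_risk @ probs  # (J, K) soft number at risk per group
    d_kj = died @ probs     # (J, K) soft number of events per group
    n_j = n_kj.sum(axis=1)  # (J,) total number at risk
    d_j = d_kj.sum(axis=1)  # (J,) total number of events

    # Observed minus expected events per group, summed over event times
    z = (d_kj - n_kj * (d_j / n_j)[:, None]).sum(axis=0)

    # Hypergeometric variance factor; time points with a single patient
    # at risk contribute no variance
    factor = np.zeros_like(n_j)
    valid = n_j > 1
    factor[valid] = d_j[valid] * (n_j[valid] - d_j[valid]) / (n_j[valid] - 1)

    frac = n_kj / n_j[:, None]
    weighted = factor[:, None] * frac
    cov = np.diag(weighted.sum(axis=0)) - weighted.T @ frac

    # The full covariance matrix is singular, so the last group is dropped
    z_red, cov_red = z[:-1], cov[:-1, :-1]
    return float(z_red @ np.linalg.solve(cov_red, z_red))


# Hard assignments are the special case of one-hot membership weights
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 1, 1]
hard = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
uniform = np.full((4, 2), 0.5)
stat_hard = soft_logrank_statistic(times, events, hard)
stat_uniform = soft_logrank_statistic(times, events, uniform)
```

With one-hot weights the function reduces to the ordinary logrank statistic, and with uninformative (uniform) weights it is zero, so maximizing it (or minimizing its negative as a loss) would push a model toward assignments that separate survival curves. How the paper regularizes or constrains this objective is not stated in the abstract and is not modeled here.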
Related papers
- Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank. It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z)
- Adaptive Deep Learning for Multiclass Breast Cancer Classification via Misprediction Risk Analysis [0.8028869343053783]
Early detection is crucial for improving patient outcomes. Computer-aided diagnostic approaches have significantly enhanced breast cancer detection. However, these methods face challenges in multiclass classification, leading to frequent mispredictions.
arXiv Detail & Related papers (2025-03-17T03:25:28Z)
- TRACE: Transformer-based Risk Assessment for Clinical Evaluation [2.2231319591004435]
TRACE (Transformer-based Risk Assessment for Clinical Evaluation) is a novel method for clinical risk assessment based on clinical data. Our approach is able to handle different data modalities, including continuous, categorical and multiple-choice (checkbox) attributes. In terms of explainability, our Transformer-based method offers easily interpretable results via attention weights.
arXiv Detail & Related papers (2024-11-13T15:42:28Z)
- XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
- Cross-modality Attention-based Multimodal Fusion for Non-small Cell Lung Cancer (NSCLC) Patient Survival Prediction [0.6476298550949928]
We propose a cross-modality attention-based multimodal fusion pipeline designed to integrate modality-specific knowledge for patient survival prediction in non-small cell lung cancer (NSCLC).
Compared with single-modality models, which achieved c-indices of 0.5772 and 0.5885 using solely tissue image data or RNA-seq data, respectively, the proposed fusion approach achieved a c-index of 0.6587 in our experiment.
arXiv Detail & Related papers (2023-08-18T21:42:52Z)
- An explainable model to support the decision about the therapy protocol for AML [1.290382979353427]
This paper presents the data analysis and an explainable machine-learning model to support the decision about the most appropriate therapy protocol.
The results indicate that it is possible to use it to support the specialists' decisions safely.
arXiv Detail & Related papers (2023-07-05T20:04:13Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Divide-and-Rule: Self-Supervised Learning for Survival Analysis in Colorectal Cancer [9.431791041887957]
We propose a self-supervised learning method that learns a representation of tissue regions as well as a metric of the clustering to obtain their underlying patterns.
We show that the proposed approach can benefit from linear predictors to avoid overfitting in patient outcomes predictions.
arXiv Detail & Related papers (2020-07-07T09:15:36Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.