OmicsCL: Unsupervised Contrastive Learning for Cancer Subtype Discovery and Survival Stratification
- URL: http://arxiv.org/abs/2505.00650v1
- Date: Thu, 01 May 2025 16:51:48 GMT
- Title: OmicsCL: Unsupervised Contrastive Learning for Cancer Subtype Discovery and Survival Stratification
- Authors: Atahan Karagoz,
- Abstract summary: We introduce OmicsCL, a modular contrastive learning framework that jointly embeds heterogeneous omics modalities.<n> Evaluated on the BRCAA dataset, OmicsCL uncovers clinically meaningful clusters.<n>Results highlight the promise of contrastive objectives for biological insight discovery in high-dimensional, heterogeneous omics data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised learning of disease subtypes from multi-omics data presents a significant opportunity for advancing personalized medicine. We introduce OmicsCL, a modular contrastive learning framework that jointly embeds heterogeneous omics modalities-such as gene expression, DNA methylation, and miRNA expression-into a unified latent space. Our method incorporates a survival-aware contrastive loss that encourages the model to learn representations aligned with survival-related patterns, without relying on labeled outcomes. Evaluated on the TCGA BRCA dataset, OmicsCL uncovers clinically meaningful clusters and achieves strong unsupervised concordance with patient survival. The framework demonstrates robustness across hyperparameter configurations and can be tuned to prioritize either subtype coherence or survival stratification. Ablation studies confirm that integrating survival-aware loss significantly enhances the predictive power of learned embeddings. These results highlight the promise of contrastive objectives for biological insight discovery in high-dimensional, heterogeneous omics data.
Related papers
- An MLI-Guided Framework for Subgroup-Aware Modeling in Electronic Health Records (AdaptHetero) [0.18416014644193068]
AdaptHetero is a novel MLI-driven framework that transforms interpretability insights into actionable guidance.<n> AdaptHetero consistently uncovers heterogeneous model behaviors in predicting ICU mortality, in-hospital death, and hidden hypoxemia.
arXiv Detail & Related papers (2025-07-28T04:37:03Z) - Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank.<n>It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z) - MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention [52.106879463828044]
Histopathology and transcriptomics are fundamental modalities in oncology, encapsulating the morphological and molecular aspects of the disease.<n>We present MIRROR, a novel multi-modal representation learning method designed to foster both modality alignment and retention.<n>Extensive evaluations on TCGA cohorts for cancer subtyping and survival analysis highlight MIRROR's superior performance.
arXiv Detail & Related papers (2025-03-01T07:02:30Z) - HACSurv: A Hierarchical Copula-Based Approach for Survival Analysis with Dependent Competing Risks [51.95824566163554]
We introduce HACSurv, a survival analysis method that learns Hierarchical Archimedean Copulas structures.<n>By capturing the dependencies between risks and censoring, HACSurv improves the accuracy of survival predictions.
arXiv Detail & Related papers (2024-10-19T18:52:18Z) - Survival Prediction in Lung Cancer through Multi-Modal Representation Learning [9.403446155541346]
This paper presents a novel approach to survival prediction by harnessing comprehensive information from CT and PET scans, along with associated Genomic data.
We aim to develop a robust predictive model for survival outcomes by integrating multi-modal imaging data with genetic information.
arXiv Detail & Related papers (2024-09-30T10:42:20Z) - MM-SurvNet: Deep Learning-Based Survival Risk Stratification in Breast
Cancer Through Multimodal Data Fusion [18.395418853966266]
We propose a novel deep learning approach for breast cancer survival risk stratification.
We employ vision transformers, specifically the MaxViT model, for image feature extraction, and self-attention to capture intricate image relationships at the patient level.
A dual cross-attention mechanism fuses these features with genetic data, while clinical data is incorporated at the final layer to enhance predictive accuracy.
arXiv Detail & Related papers (2024-02-19T02:31:36Z) - Rethinking Semi-Supervised Medical Image Segmentation: A
Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - Modelling Technical and Biological Effects in scRNA-seq data with
Scalable GPLVMs [6.708052194104378]
We extend a popular approach for probabilistic non-linear dimensionality reduction, the Gaussian process latent variable model, to scale to massive single-cell datasets.
The key idea is to use an augmented kernel which preserves the factorisability of the lower bound allowing for fast variational inference.
arXiv Detail & Related papers (2022-09-14T15:25:15Z) - SurvLatent ODE : A Neural ODE based time-to-event model with competing
risks for longitudinal data improves cancer-associated Deep Vein Thrombosis
(DVT) prediction [68.8204255655161]
We propose a generative time-to-event model, SurvLatent ODE, which parameterizes a latent representation under irregularly sampled data.
Our model then utilizes the latent representation to flexibly estimate survival times for multiple competing events without specifying shapes of event-specific hazard function.
SurvLatent ODE outperforms the current clinical standard Khorana Risk scores for stratifying DVT risk groups.
arXiv Detail & Related papers (2022-04-20T17:28:08Z) - A Deep Variational Approach to Clustering Survival Data [5.871238645229228]
We introduce a novel probabilistic approach to cluster survival data in a variational deep clustering setting.
Our proposed method employs a deep generative model to uncover the underlying distribution of both the explanatory variables and the potentially censored survival times.
arXiv Detail & Related papers (2021-06-10T14:10:25Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Low-Rank Reorganization via Proportional Hazards Non-negative Matrix
Factorization Unveils Survival Associated Gene Clusters [9.773075235189525]
In this work, Cox proportional hazards regression is integrated with NMF by imposing survival constraints.
Using human cancer gene expression data, the proposed technique can unravel critical clusters of cancer genes.
The discovered gene clusters reflect rich biological implications and can help identify survival-related biomarkers.
arXiv Detail & Related papers (2020-08-09T17:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.