Transforming Multi-Omics Integration with GANs: Applications in Alzheimer's and Cancer
- URL: http://arxiv.org/abs/2510.19870v1
- Date: Wed, 22 Oct 2025 05:55:49 GMT
- Title: Transforming Multi-Omics Integration with GANs: Applications in Alzheimer's and Cancer
- Authors: Md Selim Reza, Sabrin Afroz, Mostafizer Rahman, Md Ashad Alam
- Abstract summary: We introduce Omics-GAN, a Generative Adversarial Network (GAN)-based framework designed to generate high-quality synthetic multi-omics profiles. We evaluated Omics-GAN on three omics types (mRNA, miRNA, and DNA methylation) using the ROSMAP cohort for Alzheimer's disease. With a support vector machine (SVM) classifier under repeated 5-fold cross-validation, synthetic datasets improved prediction accuracy compared to original omics profiles. Boxplot analyses confirmed that synthetic data preserved statistical distributions while reducing noise and outliers.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-omics data integration is crucial for understanding complex diseases, yet limited sample sizes, noise, and heterogeneity often reduce predictive power. To address these challenges, we introduce Omics-GAN, a Generative Adversarial Network (GAN)-based framework designed to generate high-quality synthetic multi-omics profiles while preserving biological relationships. We evaluated Omics-GAN on three omics types (mRNA, miRNA, and DNA methylation) using the ROSMAP cohort for Alzheimer's disease (AD) and TCGA datasets for colon and liver cancer. A support vector machine (SVM) classifier with repeated 5-fold cross-validation demonstrated that synthetic datasets consistently improved prediction accuracy compared to original omics profiles. The AUC of SVM for mRNA improved from 0.72 to 0.74 in AD, and from 0.68 to 0.72 in liver cancer. Synthetic miRNA enhanced classification in colon cancer from 0.59 to 0.69, while synthetic methylation data improved performance in liver cancer from 0.64 to 0.71. Boxplot analyses confirmed that synthetic data preserved statistical distributions while reducing noise and outliers. Feature selection identified significant genes overlapping with original datasets and revealed additional candidates validated by GO and KEGG enrichment analyses. Finally, molecular docking highlighted potential drug repurposing candidates, including Nilotinib for AD, Atovaquone for liver cancer, and Tecovirimat for colon cancer. Omics-GAN enhances disease prediction, preserves biological fidelity, and accelerates biomarker and drug discovery, offering a scalable strategy for precision medicine applications.
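The evaluation protocol named in the abstract (an SVM scored by AUC under repeated 5-fold cross-validation) can be sketched as follows. This is a minimal illustration with scikit-learn, not the authors' code: the stratified splitting, RBF kernel, feature scaling, number of repeats, and the randomly generated placeholder matrix are all assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def repeated_cv_auc(X, y, n_repeats=3, seed=0):
    """Return one ROC AUC score per fold from repeated stratified 5-fold CV."""
    # Assumption: stratified folds; the abstract only says "repeated 5-fold".
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=seed)
    # Assumption: standardized features and an RBF kernel.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return cross_val_score(model, X, y, cv=cv, scoring="roc_auc")


# Placeholder data standing in for one omics matrix (samples x features);
# it is synthetic noise plus signal, not ROSMAP or TCGA data.
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=5, random_state=0
)
scores = repeated_cv_auc(X, y)
print(f"mean AUC over {len(scores)} folds: {scores.mean():.3f}")
```

Running the same function on an original matrix and on its synthetic counterpart, with identical folds and seed, would give the kind of paired AUC comparison the abstract reports.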
Related papers
- Transcriptome-Conditioned Personalized De Novo Drug Generation for AML Using Metaheuristic Assembly and Target-Driven Filtering [0.0]
Acute Myeloid Leukemia (AML) remains a clinical challenge due to its extreme molecular heterogeneity and high relapse rates. This paper presents a novel, end-to-end computational framework that bridges the gap between patient-specific transcriptomics and de novo drug discovery.
arXiv Detail & Related papers (2025-12-24T17:39:37Z)
- Interpretable Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification using Multi-Omics Data [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples. By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
- Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia [14.27396467108753]
Cancer cachexia is a multifactorial syndrome characterized by progressive muscle wasting, metabolic dysfunction, and systemic inflammation. There is no single definitive biomarker for cachexia. This study proposes a multimodal AI-based biomarker for early cancer cachexia detection.
arXiv Detail & Related papers (2025-03-09T22:32:37Z)
- scMamba: A Pre-Trained Model for Single-Nucleus RNA Sequencing Analysis in Neurodegenerative Disorders [43.24785083027205]
scMamba is a pre-trained model designed to improve the quality and utility of snRNA-seq analysis. Inspired by the recent Mamba model, scMamba introduces a novel architecture that incorporates a linear adapter layer, gene embeddings, and bidirectional Mamba blocks. We demonstrate that scMamba outperforms benchmark methods in various downstream tasks, including cell type annotation, doublet detection, imputation, and the identification of differentially expressed genes.
arXiv Detail & Related papers (2025-02-12T11:48:22Z)
- Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study [43.28613210217385]
We employ and compare three state-of-the-art generative models to generate PCG data. Our results demonstrate that the generated PCG data closely resembles the original datasets. In our future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs.
arXiv Detail & Related papers (2024-12-17T18:07:40Z)
- LASSO-MOGAT: A Multi-Omics Graph Attention Framework for Cancer Classification [41.94295877935867]
This paper introduces LASSO-MOGAT, a graph-based deep learning framework that integrates messenger RNA, microRNA, and DNA methylation data to classify 31 cancer types.
arXiv Detail & Related papers (2024-08-30T16:26:04Z)
- BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion [16.83901927767791]
We present BioFusionNet, a deep learning framework that fuses image-derived features with genetic and clinical data to obtain a holistic profile.
Our model achieves a mean concordance index of 0.77 and a time-dependent area under the curve of 0.84, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2024-02-16T14:19:33Z)
- Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction [43.1748594898772]
We propose a multimodal transformer (PathOmics) integrating pathology and genomics insights into colon-related cancer survival prediction.
We emphasize the unsupervised pretraining to capture the intrinsic interaction between tissue microenvironments in gigapixel whole slide images.
We evaluate our approach on both TCGA colon and rectum cancer cohorts, showing that the proposed approach is competitive and outperforms state-of-the-art studies.
arXiv Detail & Related papers (2023-07-22T00:59:26Z)
- Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
- A robust kernel machine regression towards biomarker selection in multi-omics datasets of osteoporosis for drug discovery [2.2897244874280043]
We propose "robust kernel machine regression (RobMR)," to improve the robustness of statistical machine regression and the diversity of fictional data.
Experiments demonstrate that the proposed approach effectively identifies the inter-related risk factors of osteoporosis.
The proposed approach can be applied to any disease model for which multi-omics datasets are available.
arXiv Detail & Related papers (2022-01-13T16:39:46Z)