Multivariate Feature Selection and Autoencoder Embeddings of Ovarian Cancer Clinical and Genetic Data
- URL: http://arxiv.org/abs/2501.15881v1
- Date: Mon, 27 Jan 2025 09:07:07 GMT
- Title: Multivariate Feature Selection and Autoencoder Embeddings of Ovarian Cancer Clinical and Genetic Data
- Authors: Luis Bote-Curiel, Sergio Ruiz-Llorente, Sergio Muñoz-Romero, Mónica Yagüe-Fernández, Arantzazu Barquín, Jesús García-Donas, José Luis Rojo-Álvarez
- Abstract summary: This study explores a data-driven approach to discovering novel clinical and genetic markers in ovarian cancer (OC).
In the autoencoder analysis, a clearer pattern emerged when using clinical features and the combination of clinical and genetic data.
Key clinical variables (such as type of surgery and neoadjuvant chemotherapy) and certain gene mutations showed strong relevance, along with low-risk genetic factors.
- Score: 2.973561339858947
- Abstract: This study explores a data-driven approach to discovering novel clinical and genetic markers in ovarian cancer (OC). Two main analyses were performed: (1) a nonlinear examination of an OC dataset using autoencoders, which compress data into a 3-dimensional latent space to detect potential intrinsic separability between platinum-sensitive and platinum-resistant groups; and (2) an adaptation of the informative variable identifier (IVI) to determine which features (clinical or genetic) are most relevant to disease progression. In the autoencoder analysis, a clearer pattern emerged when using clinical features and the combination of clinical and genetic data, indicating that disease progression groups can be distinguished more effectively after supervised fine tuning. For genetic data alone, this separability was less apparent but became more pronounced with a supervised approach. Using the IVI-based feature selection, key clinical variables (such as type of surgery and neoadjuvant chemotherapy) and certain gene mutations showed strong relevance, along with low-risk genetic factors. These findings highlight the strength of combining machine learning tools (autoencoders) with feature selection methods (IVI) to gain insights into ovarian cancer progression. They also underscore the potential for identifying new biomarkers that integrate clinical and genomic indicators, ultimately contributing to improved patient stratification and personalized treatment strategies.
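The first analysis in the abstract, compressing each sample into a 3-dimensional latent space, can be illustrated with a minimal, dependency-free sketch. This is not the authors' architecture: the toy data are synthetic, the single tanh encoder layer and linear decoder are assumptions, all function names are hypothetical, and the supervised fine-tuning step mentioned in the abstract is omitted.

```python
import math
import random


def make_toy_data(n_per_group=20, n_features=6, seed=0):
    """Two synthetic groups with shifted feature means (hypothetical stand-in data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label, shift in ((0, -0.5), (1, 0.5)):
        for _ in range(n_per_group):
            rows.append([shift + rng.gauss(0.0, 0.3) for _ in range(n_features)])
            labels.append(label)
    return rows, labels


def init_matrix(n_rows, n_cols, rng):
    """Small random weights in [-0.1, 0.1]."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_cols)] for _ in range(n_rows)]


def encode(x, w_enc):
    """Map one sample to the latent space: z = tanh(W_enc x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_enc]


def decode(z, w_dec):
    """Linear reconstruction: x_hat = W_dec z."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in w_dec]


def mse(data, w_enc, w_dec):
    """Mean squared reconstruction error over all samples and features."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w_enc), w_dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)
    return total / len(data)


def train(data, latent_dim=3, epochs=200, lr=0.05, seed=1):
    """Unsupervised training by per-sample gradient descent on squared error."""
    rng = random.Random(seed)
    n_features = len(data[0])
    w_enc = init_matrix(latent_dim, n_features, rng)
    w_dec = init_matrix(n_features, latent_dim, rng)
    for _ in range(epochs):
        for x in data:
            z = encode(x, w_enc)
            x_hat = decode(z, w_dec)
            err = [a - b for a, b in zip(x_hat, x)]
            # Gradient with respect to the latent code, using the decoder
            # weights as they were before this sample's update.
            grad_z = [
                sum(err[i] * w_dec[i][k] for i in range(n_features))
                for k in range(latent_dim)
            ]
            for i in range(n_features):
                for k in range(latent_dim):
                    w_dec[i][k] -= lr * err[i] * z[k]
            for k in range(latent_dim):
                delta = grad_z[k] * (1.0 - z[k] ** 2)  # derivative of tanh
                for j in range(n_features):
                    w_enc[k][j] -= lr * delta * x[j]
    return w_enc, w_dec


data, labels = make_toy_data()
w_enc, w_dec = train(data)
print("reconstruction MSE:", round(mse(data, w_enc, w_dec), 4))
print("3-D embedding of first sample:", encode(data[0], w_enc))
```

The labels play no part in training here; they would only be used afterwards, for example to color the 3-D embeddings and check visually whether the two groups separate.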
Related papers
- Precision Cancer Classification and Biomarker Identification from mRNA Gene Expression via Dimensionality Reduction and Explainable AI [0.9423257767158634]
This research presents a comprehensive pipeline designed to accurately identify 33 distinct cancer types and their corresponding gene sets.
It incorporates a combination of normalization and feature selection techniques to reduce dataset dimensionality effectively.
We leverage Explainable AI to elucidate the biological significance of the identified cancer-specific genes.
arXiv Detail & Related papers (2024-10-08T18:56:31Z)
- Advancing Gene Selection in Oncology: A Fusion of Deep Learning and Sparsity for Precision Gene Selection [4.093503153499691]
This paper introduces two gene selection strategies for deep learning-based survival prediction models.
The first strategy uses a sparsity-inducing method while the second one uses importance based gene selection for identifying relevant genes.
arXiv Detail & Related papers (2024-03-04T10:44:57Z)
- Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions [3.5489676012585236]
We introduce the Bridge model to derive integrated features to preserve information beyond common genes.
The model consistently excels in predicting patient survival across six cancer types in GENIE BPC data.
arXiv Detail & Related papers (2024-01-30T23:25:05Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- A New Deep Learning and XAI-Based Algorithm for Features Selection in Genomics [5.787117733071415]
The paper proposes a novel algorithm to perform Feature Selection on genomic-scale data.
Results of the application on a Chronic Lymphocytic Leukemia dataset evidence the effectiveness of the algorithm.
arXiv Detail & Related papers (2023-03-29T16:44:13Z)
- Comprehensive and user-analytics-friendly cancer patient database for physicians and researchers [0.18472148461613155]
A relational database has been developed integrating status of cancer-critical gene mutations, serum galectin profiles, serum and tumor glycomic profiles.
Our project provides a framework for an integrated, interactive, and growing database to analyze molecular and clinical patterns across cancer stages and subtypes.
arXiv Detail & Related papers (2023-02-01T20:10:06Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)