Advancing Gene Selection in Oncology: A Fusion of Deep Learning and
Sparsity for Precision Gene Selection
- URL: http://arxiv.org/abs/2403.01927v1
- Date: Mon, 4 Mar 2024 10:44:57 GMT
- Title: Advancing Gene Selection in Oncology: A Fusion of Deep Learning and
Sparsity for Precision Gene Selection
- Authors: Akhila Krishna, Ravi Kant Gupta, Pranav Jeevan, Amit Sethi
- Abstract summary: This paper introduces two gene selection strategies for deep learning-based survival prediction models.
The first strategy uses a sparsity-inducing method, while the second uses importance-based gene selection to identify relevant genes.
- Score: 4.093503153499691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gene selection plays a pivotal role in oncology research for improving
outcome prediction accuracy and facilitating cost-effective genomic profiling
for cancer patients. This paper introduces two gene selection strategies for
deep learning-based survival prediction models. The first strategy uses a
sparsity-inducing method, while the second uses importance-based gene
selection to identify relevant genes. Our overall approach leverages the
power of deep learning to model complex biological data structures, while
sparsity-inducing methods ensure the selection process focuses on the most
informative genes, minimizing noise and redundancy. Through comprehensive
experimentation on diverse genomic and survival datasets, we demonstrate that
our strategy not only identifies gene signatures with high predictive power for
survival outcomes but also streamlines the process for low-cost genomic
profiling. The implications of this research are profound as it offers a
scalable and effective tool for advancing personalized medicine and targeted
cancer therapies. By pushing the boundaries of gene selection methodologies,
our work contributes significantly to the ongoing efforts in cancer genomics,
promising improved diagnostic and prognostic capabilities in clinical settings.
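The abstract does not say which sparsity-inducing mechanism the authors use. As a generic illustration only (not the paper's method), the sketch below shows one common way sparsity can drive feature selection: per-gene gate weights are shrunk with the L1 soft-thresholding operator, and only genes whose gates remain nonzero are kept. The gene names, gate values, threshold `lam`, and the helper names `soft_threshold` and `select_genes` are all hypothetical.

```python
def soft_threshold(weights, lam):
    """Proximal operator of the L1 penalty: shrink each weight toward zero by lam."""
    shrunk = []
    for w in weights:
        magnitude = max(abs(w) - lam, 0.0)   # weights smaller than lam collapse to zero
        sign = 1.0 if w > 0 else -1.0
        shrunk.append(sign * magnitude)
    return shrunk


def select_genes(gene_names, gate_weights, lam):
    """Keep only the genes whose gate weight survives soft-thresholding."""
    shrunk = soft_threshold(gate_weights, lam)
    return [gene for gene, w in zip(gene_names, shrunk) if w != 0.0]


# Hypothetical gate weights, as if learned by an input layer of a survival model.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
gates = [0.90, -0.05, 0.02, -0.60]

selected = select_genes(genes, gates, lam=0.1)
print(selected)
```

In a real model the gates would be trained jointly with the survival loss, and `lam` would control how many genes remain in the panel: larger values give a smaller, cheaper gene signature.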
Related papers
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.
The model adheres to the central dogma of molecular biology, accurately generating protein-coding sequences.
It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of promoter sequences.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- Survey and Improvement Strategies for Gene Prioritization with Large Language Models [61.24568051916653]
Large language models (LLMs) have performed well in medical exams, but their effectiveness in diagnosing rare genetic diseases has not been assessed.
We used multi-agent and Human Phenotype Ontology (HPO) classification to categorize patients based on phenotypes and solvability levels.
At baseline, GPT-4 outperformed other LLMs, achieving near 30% accuracy in ranking causal genes correctly.
arXiv Detail & Related papers (2025-01-30T23:03:03Z)
- Multivariate Feature Selection and Autoencoder Embeddings of Ovarian Cancer Clinical and Genetic Data [2.973561339858947]
This study explores a data-driven approach to discovering novel clinical and genetic markers in ovarian cancer (OC).
In the autoencoder analysis, a clearer pattern emerged when using clinical features and the combination of clinical and genetic data.
Key clinical variables (such as type of surgery and neoadjuvant chemotherapy) and certain gene mutations showed strong relevance, along with low-risk genetic factors.
arXiv Detail & Related papers (2025-01-27T09:07:07Z)
- Knowledge-Guided Biomarker Identification for Label-Free Single-Cell RNA-Seq Data: A Reinforcement Learning Perspective [24.247247851943982]
We present an iterative gene panel selection strategy that harnesses ensemble knowledge from existing gene selection algorithms to establish preliminary boundaries or prior knowledge.
We incorporate reinforcement learning through a reward function shaped by expert behavior, enabling dynamic refinement and targeted selection of gene panels.
Our results underscore the potential of this approach to advance single-cell genomics data analysis.
arXiv Detail & Related papers (2025-01-02T07:57:41Z)
- Enhanced Gene Selection in Single-Cell Genomics: Pre-Filtering Synergy and Reinforced Optimization [16.491060073775884]
We introduce an iterative gene panel selection strategy applicable to clustering tasks in single-cell genomics.
Our method integrates results from other gene selection algorithms, providing valuable preliminary boundaries.
We incorporate the nature of the exploration process in reinforcement learning (RL) and its capability for continuous optimization.
arXiv Detail & Related papers (2024-06-11T16:21:33Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- A New Deep Learning and XAI-Based Algorithm for Features Selection in Genomics [5.787117733071415]
The paper proposes a novel algorithm to perform Feature Selection on genomic-scale data.
Results of the application on a Chronic Lymphocytic Leukemia dataset evidence the effectiveness of the algorithm.
arXiv Detail & Related papers (2023-03-29T16:44:13Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We propose a new approach for dealing with high-dimensional binary classification problems that combines ideas from regularization and ensembling.
We demonstrate the good performance of our method in terms of prediction accuracy and identification of key biomarkers using several medical datasets involving common diseases such as cancer, multiple sclerosis and psoriasis.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)