Related papers: Enhanced Gene Selection in Single-Cell Genomics: Pre-Filtering Synergy and Reinforced Optimization

URL: http://arxiv.org/abs/2406.07418v1
Date: Tue, 11 Jun 2024 16:21:33 GMT
Title: Enhanced Gene Selection in Single-Cell Genomics: Pre-Filtering Synergy and Reinforced Optimization
Authors: Weiliang Zhang, Zhen Meng, Dongjie Wang, Min Wu, Kunpeng Liu, Yuanchun Zhou, Meng Xiao
Abstract summary: We introduce an iterative gene panel selection strategy applicable to clustering tasks in single-cell genomics. Our method integrates results from other gene selection algorithms, providing valuable preliminary boundaries. We incorporate the stochastic nature of the exploration process in reinforcement learning (RL) and its capability for continuous optimization.
Score: 16.491060073775884
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in single-cell genomics necessitate precision in gene panel selection to interpret complex biological data effectively. Those methods aim to streamline the analysis of scRNA-seq data by focusing on the most informative genes that contribute significantly to the specific analysis task. Traditional selection methods, which often rely on expert domain knowledge, embedded machine learning models, or heuristic-based iterative optimization, are prone to biases and inefficiencies that may obscure critical genomic signals. Recognizing the limitations of traditional methods, we aim to transcend these constraints with a refined strategy. In this study, we introduce an iterative gene panel selection strategy that is applicable to clustering tasks in single-cell genomics. Our method uniquely integrates results from other gene selection algorithms, providing valuable preliminary boundaries or prior knowledge as initial guides in the search space to enhance the efficiency of our framework. Furthermore, we incorporate the stochastic nature of the exploration process in reinforcement learning (RL) and its capability for continuous optimization through reward-based feedback. This combination mitigates the biases inherent in the initial boundaries and harnesses RL's adaptability to refine and target gene panel selection dynamically. To illustrate the effectiveness of our method, we conducted detailed comparative experiments, case studies, and visualization analysis.
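
The two ideas in the abstract, pre-filtering with existing selectors and then refining the panel through stochastic, reward-driven exploration, can be illustrated with a minimal sketch. It is not the authors' implementation. The two scorers (variance and mean), the function names `prefilter` and `refine_panel`, and the caller-supplied `reward` are assumptions made for illustration, and a simple epsilon-greedy search stands in for the paper's RL agent.

```python
import random
from statistics import mean, pvariance


def top_k(scores, k):
    """Return the indices of the k highest-scoring genes."""
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return set(order[:k])


def prefilter(expr, k):
    """Union of the top-k genes under two stand-in scorers.

    expr is a list of cells, each a list of per-gene expression values.
    Variance and mean are hypothetical placeholders for the "other gene
    selection algorithms" mentioned in the abstract.
    """
    n_genes = len(expr[0])
    columns = [[cell[g] for cell in expr] for g in range(n_genes)]
    by_variance = top_k([pvariance(col) for col in columns], k)
    by_mean = top_k([mean(col) for col in columns], k)
    return by_variance | by_mean


def refine_panel(candidates, reward, steps=200, epsilon=0.2, seed=0):
    """Epsilon-greedy search over subsets of the pre-filtered candidates.

    Each step toggles one gene. A proposal is accepted if it does not
    lower the reward, or with probability epsilon otherwise (exploration).
    The best panel seen so far is tracked and returned with its reward.
    """
    if not candidates:
        raise ValueError("candidate pool is empty")
    rng = random.Random(seed)
    pool = sorted(candidates)
    panel = set(pool)
    current = reward(panel)
    best_panel, best_reward = set(panel), current
    for _ in range(steps):
        proposal = panel ^ {rng.choice(pool)}  # add or drop one gene
        if not proposal:
            continue  # never evaluate an empty panel
        r = reward(proposal)
        if r >= current or rng.random() < epsilon:
            panel, current = proposal, r
            if r > best_reward:
                best_panel, best_reward = set(proposal), r
    return best_panel, best_reward
```

In practice `reward` would score a clustering of the cells restricted to the proposed panel; the sketch leaves that choice to the caller because the abstract does not specify it.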

Related papers

Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Biological Pathway Guided Gene Selection Through Collaborative Reinforcement Learning [25.2831953927341]
We propose a novel framework that integrates statistical selection with biological pathway knowledge using multi-agent reinforcement learning (MARL). Our framework incorporates pathway knowledge through Graph Neural Network-based state representations, a reward mechanism combining prediction performance with gene centrality and pathway coverage, and collaborative learning strategies using shared memory and a centralized critic component.
arXiv Detail & Related papers (2025-05-30T03:01:07Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
Learning Evolution via Optimization Knowledge Adaptation [50.280704114978384]
Evolutionary algorithms (EAs) maintain populations through evolutionary operators to discover solutions for complex tasks. We introduce an Optimization Knowledge Adaptation Evolutionary Model (OKAEM) to enhance its optimization capabilities. OKAEM exploits prior knowledge for significant performance gains across various knowledge transfer settings. It is capable of emulating principles of natural selection and genetic recombination.
arXiv Detail & Related papers (2025-01-04T05:35:21Z)
Knowledge-Guided Biomarker Identification for Label-Free Single-Cell RNA-Seq Data: A Reinforcement Learning Perspective [24.247247851943982]
We present an iterative gene panel selection strategy that harnesses ensemble knowledge from existing gene selection algorithms to establish preliminary boundaries or prior knowledge. We incorporate reinforcement learning through a reward function shaped by expert behavior, enabling dynamic refinement and targeted selection of gene panels. Our results underscore the potential of this approach to advance single-cell genomics data analysis.
arXiv Detail & Related papers (2025-01-02T07:57:41Z)
Optimizing Feature Selection with Genetic Algorithms: A Review of Methods and Applications [4.395397502990339]
Genetic algorithms (GAs) have been proposed to provide remedies for drawbacks by avoiding local optima and improving the selection process itself. This manuscript presents a sweeping review on GA-based feature selection techniques in applications and their effectiveness across different domains.
arXiv Detail & Related papers (2024-09-05T22:28:42Z)
CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments [51.41735920759667]
Large Language Models (LLMs) have shown promise in various tasks, but they often lack specific knowledge and struggle to accurately solve biological design problems. In this work, we introduce CRISPR-GPT, an LLM agent augmented with domain knowledge and external tools to automate and enhance the design process of CRISPR-based gene-editing experiments.
arXiv Detail & Related papers (2024-04-27T22:59:17Z)
Advancing Gene Selection in Oncology: A Fusion of Deep Learning and Sparsity for Precision Gene Selection [4.093503153499691]
This paper introduces two gene selection strategies for deep learning-based survival prediction models. The first strategy uses a sparsity-inducing method while the second one uses importance based gene selection for identifying relevant genes.
arXiv Detail & Related papers (2024-03-04T10:44:57Z)
DNA Sequence Classification with Compressors [0.0]
Our study introduces a novel adaptation of Jiang et al.'s compressor-based, parameter-free classification method, specifically tailored for DNA sequence analysis. Not only does this method align with the current state-of-the-art in terms of accuracy, but it also offers a more resource-efficient alternative to traditional machine learning methods.
arXiv Detail & Related papers (2024-01-25T09:17:19Z)
Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells. During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation. This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges. We first present the model that underlies most of current causal approaches to single-cell biology. We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
A New Deep Learning and XAI-Based Algorithm for Features Selection in Genomics [5.787117733071415]
The paper proposes a novel algorithm to perform Feature Selection on genomic-scale data. Results of the application on a Chronic Lymphocytic Leukemia dataset evidence the effectiveness of the algorithm.
arXiv Detail & Related papers (2023-03-29T16:44:13Z)
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images. Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting. This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z)
Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques. In medical genomics applications, our approach identifies critical biomarkers overlooked by competing methods.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
Complexity-based speciation and genotype representation for neuroevolution [81.21462458089142]
This paper introduces a speciation principle for neuroevolution where evolving networks are grouped into species based on the number of hidden neurons. The proposed speciation principle is employed in several techniques designed to promote and preserve diversity within species and in the ecosystem as a whole.
arXiv Detail & Related papers (2020-10-11T06:26:56Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.