Application of Deep Learning on Single-Cell RNA-sequencing Data
Analysis: A Review
- URL: http://arxiv.org/abs/2210.05677v1
- Date: Tue, 11 Oct 2022 17:07:22 GMT
- Title: Application of Deep Learning on Single-Cell RNA-sequencing Data
Analysis: A Review
- Authors: Matthew Brendel, Chang Su, Zilong Bai, Hao Zhang, Olivier Elemento,
Fei Wang
- Abstract summary: Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously.
Deep learning, a recent advance of artificial intelligence, has also emerged as a promising tool for scRNA-seq data analysis.
- Score: 17.976898403296275
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique
to quantify the gene expression profile of thousands of single cells
simultaneously. Analysis of scRNA-seq data plays an important role in the study
of cell states and phenotypes, and has helped elucidate biological processes,
such as those occurring during development of complex organisms and improved
our understanding of disease states, such as cancer, diabetes, and COVID, among
others. Deep learning, a recent advance of artificial intelligence that has
been used to address many problems involving large datasets, has also emerged
as a promising tool for scRNA-seq data analysis, as it has a capacity to
extract informative, compact features from noisy, heterogeneous, and
high-dimensional scRNA-seq data to improve downstream analysis. The present
review aims at surveying recently developed deep learning techniques in
scRNA-seq data analysis, identifying key steps within the scRNA-seq data
analysis pipeline that have been advanced by deep learning, and explaining the
benefits of deep learning over more conventional analysis tools. Finally, we
summarize the challenges in current deep learning approaches faced within
scRNA-seq data and discuss potential directions for improvements in deep
algorithms for scRNA-seq data analysis.
Related papers
- Exploring the Potentials and Challenges of Using Large Language Models for the Analysis of Transcriptional Regulation of Long Non-coding RNAs [12.790491293672632]
Long non-coding RNAs (lncRNAs) play critical roles in gene regulation and disease mechanisms.
The complexity and diversity of lncRNA sequences, along with the limited knowledge of their functional mechanisms and the regulation of their expressions, pose significant challenges to lncRNA studies.
arXiv Detail & Related papers (2024-11-05T21:57:38Z) - Character-level Tokenizations as Powerful Inductive Biases for RNA Foundational Models [0.0]
understanding and predicting RNA behavior is a challenge due to the complexity of RNA structures and interactions.
Current RNA models have yet to match the performance observed in the protein domain.
ChaRNABERT is able to reach state-of-the-art performance on several tasks in established benchmarks.
arXiv Detail & Related papers (2024-11-05T21:56:16Z) - Comprehensive benchmarking of large language models for RNA secondary structure prediction [0.0]
RNA-LLM uses large datasets of RNA sequences to learn, in a self-supervised way, how to represent each RNA base with a semantically rich numerical vector.
Among them, predicting the secondary structure is a fundamental task for uncovering RNA functional mechanisms.
We present a comprehensive experimental analysis of several pre-trained RNA-LLM, comparing them for the RNA secondary structure prediction task in a unified deep learning framework.
arXiv Detail & Related papers (2024-10-21T17:12:06Z) - BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (textbfBEnchmtextbfArk for textbfCOmprehensive RtextbfNA Task and Language Models).
First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.
Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.
Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z) - scCDCG: Efficient Deep Structural Clustering for single-cell RNA-seq via Deep Cut-informed Graph Embedding [12.996418312603284]
scCDCG (single-cell RNA-seq Clustering via Deep Cut-informed Graph) is a novel framework designed for efficient and accurate clustering of scRNA-seq data.
scCDCG comprises three main components: (i) A graph embedding module utilizing deep cut-informed techniques, which effectively captures intercellular high-order structural information.
(ii) A self-supervised learning module guided by optimal transport, tailored to accommodate the unique complexities of scRNA-seq data.
arXiv Detail & Related papers (2024-04-09T09:46:17Z) - scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling [9.013834280011293]
Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research.
Our study introduces a generative approach termed scRNA-seq Diffusion Transformer (scRDiT)
This method generates virtual scRNA-seq data by leveraging a real dataset.
arXiv Detail & Related papers (2024-04-09T09:25:16Z) - scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis
in Brain [46.39828178736219]
We introduce scHyena, a foundation model designed to address these challenges and enhance the accuracy of scRNA-seq analysis in the brain.
scHyena is equipped with a linear adaptor layer, the positional encoding via gene-embedding, and a bidirectional Hyena operator.
This enables us to process full-length scRNA-seq data without losing any information from the raw data.
arXiv Detail & Related papers (2023-10-04T10:30:08Z) - RDesign: Hierarchical Data-efficient Representation Learning for
Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z) - Improving RNA Secondary Structure Design using Deep Reinforcement
Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We show results of the ablation analysis that we do for these algorithms, as well as graphs indicating the algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z) - Deep neural networks approach to microbial colony detection -- a
comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z) - A Systematic Approach to Featurization for Cancer Drug Sensitivity
Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.