N-ACT: An Interpretable Deep Learning Model for Automatic Cell Type and
Salient Gene Identification
- URL: http://arxiv.org/abs/2206.04047v1
- Date: Sun, 8 May 2022 18:13:28 GMT
- Title: N-ACT: An Interpretable Deep Learning Model for Automatic Cell Type and
Salient Gene Identification
- Authors: A. Ali Heydari, Oscar A. Davalos, Katrina K. Hoyer, Suzanne S. Sindi
- Abstract summary: A major limitation in most scRNAseq analysis pipelines is the reliance on manual annotations to determine cell identities.
N-ACT is the first-of-its-kind interpretable deep neural network for ACTI utilizing neural-attention to detect salient genes for use in cell-type identification.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Single-cell RNA sequencing (scRNAseq) is rapidly advancing our understanding
of cellular composition within complex tissues and organisms. A major
limitation in most scRNAseq analysis pipelines is the reliance on manual
annotations to determine cell identities, which are time-consuming, subjective,
and require expertise. Given the surge in cell sequencing, supervised
methods, especially deep learning models, have been developed for automatic cell
type identification (ACTI) and achieve high accuracy and scalability.
However, all existing deep learning frameworks for ACTI lack interpretability
and are used as "black-box" models. We present N-ACT (Neural-Attention for Cell
Type identification): the first-of-its-kind interpretable deep neural network
for ACTI utilizing neural-attention to detect salient genes for use in
cell-type identification. We compare N-ACT to conventional annotation methods
on two previously manually annotated data sets, demonstrating that N-ACT
accurately identifies marker genes and cell types in an unsupervised manner,
while performing comparably on multiple data sets to current state-of-the-art
models in traditional supervised ACTI.
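The core idea, attention weights over genes that both drive classification and serve as a saliency ranking, can be sketched minimally. The following numpy sketch uses random stand-in weights and a shared attention vector; it illustrates the general mechanism, not the actual N-ACT architecture (in which the attention would be produced by a learned network).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 5 cells x 8 genes (log-normalized expression), 3 cell types.
n_cells, n_genes, n_types = 5, 8, 3
X = rng.random((n_cells, n_genes))

# Hypothetical parameters standing in for trained weights.
w_att = rng.standard_normal(n_genes)             # per-gene attention scores
W_cls = rng.standard_normal((n_genes, n_types))  # linear classifier

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

alpha = softmax(w_att)                 # (n_genes,), sums to 1
X_weighted = X * alpha                 # re-weight each gene's expression
logits = X_weighted @ W_cls            # (n_cells, n_types)
pred = softmax(logits, axis=1).argmax(axis=1)

# The attention vector doubles as a saliency ranking over genes.
salient = np.argsort(alpha)[::-1][:3]
print("predicted types:", pred)
print("top salient genes:", salient)
```

The interpretability claim rests on reading `alpha` directly: high-weight genes are the candidate markers for the predicted cell type.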
Related papers
- eDOC: Explainable Decoding Out-of-domain Cell Types with Evidential Learning [7.036161839497915]
Single-cell RNA-seq (scRNA-seq) technology is a powerful tool for unraveling the complexity of biological systems.
Cell Type Annotation (CTA) is one of the essential and fundamental tasks in scRNA-seq data analysis.
We develop a new method, eDOC, to address the aforementioned challenges.
arXiv Detail & Related papers (2024-10-30T20:15:36Z)
- Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning [10.44434676119443]
Inferring gene regulatory networks (GRNs) from single-cell RNA sequencing (scRNA-seq) data is a complex challenge.
In this study, we tackle this challenge by leveraging the single-cell BERT-based pre-trained transformer model (scBERT).
We introduce a novel joint graph learning approach that combines the rich contextual representations learned by single-cell language models with the structured knowledge encoded in GRNs.
arXiv Detail & Related papers (2024-07-25T16:42:08Z)
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation-maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- scBiGNN: Bilevel Graph Representation Learning for Cell Type Classification from Single-cell RNA Sequencing Data [62.87454293046843]
Graph neural networks (GNNs) have been widely used for automatic cell type classification.
scBiGNN comprises two GNN modules to identify cell types.
scBiGNN outperforms a variety of existing methods for cell type classification from scRNA-seq data.
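The summary does not specify scBiGNN's two modules, so as a generic illustration of GNN-based cell-type classification, the following numpy sketch runs one GCN-style propagation step over a toy cell-cell graph with random stand-in weights; it is not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy cell-cell graph: 6 cells in two cliques, self-loops included.
A = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

X = rng.standard_normal((6, 4))   # per-cell expression features
W = rng.standard_normal((4, 2))   # stand-in weights for 2 cell types

# Symmetric normalization: D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
A_norm = A / np.sqrt(d[:, None] * d[None, :])

# One GCN layer: aggregate neighbors, project, apply ReLU, then classify.
H = np.maximum(A_norm @ X @ W, 0.0)   # (6, 2) per-cell class scores
pred = H.argmax(axis=1)
print(pred)
```

Neighboring cells share aggregated features, so cells in the same graph community tend toward the same predicted type.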
arXiv Detail & Related papers (2023-12-16T03:54:26Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Epigenomic language models powered by Cerebras [0.0]
Epigenomic BERT (or EBERT) learns representations based on both DNA sequence and paired epigenetic state inputs.
We show EBERT's transfer learning potential by demonstrating strong performance on a cell type-specific transcription factor binding prediction task.
Our fine-tuned model exceeds state-of-the-art performance on 4 of 13 evaluation datasets from the ENCODE-DREAM benchmarks and earns an overall rank of 3rd on the challenge leaderboard.
arXiv Detail & Related papers (2021-12-14T17:23:42Z)
- CloudPred: Predicting Patient Phenotypes From Single-cell RNA-seq [6.669618903574761]
Single-cell RNA sequencing (scRNA-seq) has the potential to provide powerful, high-resolution signatures to inform disease prognosis and precision medicine.
This paper develops an interpretable machine learning algorithm, CloudPred, to predict individuals' disease phenotypes from their scRNA-seq data.
arXiv Detail & Related papers (2021-10-13T22:41:30Z)
- Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types [75.65676405302105]
We propose a simple yet effective approach for pre-training genome data in a multi-modal and self-supervised manner, which we call GeneBERT.
We pre-train our model on the ATAC-seq dataset with 17 million genome sequences.
arXiv Detail & Related papers (2021-10-11T12:48:44Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
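As a rough illustration of this idea under the standard entropic-regularized formulation (with a random reference rather than a trained one, and uniform marginals assumed), the following sketch computes a Sinkhorn transport plan between an input set and a fixed reference, then pools the set into a fixed-size embedding.

```python
import numpy as np

rng = np.random.default_rng(1)

def sinkhorn(C, eps=0.1, n_iter=200):
    """Entropic-regularized OT plan between two uniform measures."""
    n, m = C.shape
    a, b = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    K = np.exp(-C / eps)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan

# Input set: 7 elements in R^4; reference: 3 elements (would be trained).
X = rng.standard_normal((7, 4))
Z = rng.standard_normal((3, 4))

# Cost = squared Euclidean distance, scaled to [0, 1] for stability.
C = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
P = sinkhorn(C / C.max())                  # (7, 3) transport plan

# Pool set elements into each reference slot, then flatten:
# the embedding size depends only on the reference, not on |X|.
embedding = (P.T @ X).ravel()              # shape (3 * 4,)
print(embedding.shape)
```

The attention connection is visible in the plan `P`: like softmax attention weights, each column distributes mass over set elements, but here the weights are constrained by the transport marginals.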
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Cell Type Identification from Single-Cell Transcriptomic Data via Semi-supervised Learning [2.4271601178529063]
Cell type identification from single-cell transcriptomic data is a common goal of single-cell RNA sequencing (scRNAseq) data analysis.
We propose a semi-supervised learning model that uses unlabeled scRNAseq cells and a limited number of labeled scRNAseq cells to implement cell identification.
It is observed that the proposed model achieves encouraging performance by learning on a very limited number of labeled scRNAseq cells.
arXiv Detail & Related papers (2020-05-06T19:15:43Z)
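The semi-supervised setting (few labels, many unlabeled cells) can be illustrated with a generic self-training loop; the nearest-centroid learner and toy data below are stand-ins for the paper's unspecified model, not its actual method.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy scRNAseq-like data: two well-separated "cell types" in gene space.
n_per, n_genes = 30, 5
X = np.vstack([
    rng.normal(0.0, 0.3, (n_per, n_genes)),
    rng.normal(3.0, 0.3, (n_per, n_genes)),
])
y_true = np.array([0] * n_per + [1] * n_per)

# Only four labeled cells; the rest are unlabeled (-1).
labeled = np.array([0, 1, n_per, n_per + 1])
y = np.full(len(X), -1)
y[labeled] = y_true[labeled]

# Self-training: assign unlabeled cells to the nearest centroid,
# refit centroids on all assignments, repeat until stable.
for _ in range(10):
    centroids = np.stack([X[y == k].mean(axis=0) for k in (0, 1)])
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    y_new = d.argmin(axis=1)
    y_new[labeled] = y_true[labeled]   # keep the true labels fixed
    if np.array_equal(y_new, y):
        break
    y = y_new

acc = (y == y_true).mean()
print(f"accuracy with {len(labeled)} labels: {acc:.2f}")
```

With clusters this well separated, a handful of labels suffices to recover the full labeling, which is the intuition behind learning from "a very limited number of labeled scRNAseq cells".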
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.