Topological Data Analysis in Time Series: Temporal Filtration and
Application to Single-Cell Genomics
- URL: http://arxiv.org/abs/2204.14048v1
- Date: Fri, 29 Apr 2022 12:46:14 GMT
- Title: Topological Data Analysis in Time Series: Temporal Filtration and
Application to Single-Cell Genomics
- Authors: Baihan Lin
- Abstract summary: We propose the single-cell topological simplicial analysis (scTSA).
Applying this approach to the single-cell gene expression profiles from local networks of cells reveals a previously unseen topology of cellular ecology.
Benchmarked on the single-cell RNA-seq data of zebrafish embryogenesis spanning 38,731 cells, 25 cell types and 12 time steps, our approach highlights the gastrulation as the most critical stage.
- Score: 13.173307471333619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The absence of a conventional association between the cell-cell cohabitation
and its emergent dynamics into cliques during development has hindered our
understanding of how cell populations proliferate, differentiate, and compete,
i.e. the cell ecology. With the recent advancement of the single-cell
RNA-sequencing (RNA-seq), we can potentially describe such a link by
constructing network graphs that characterize the similarity of the gene
expression profiles of the cell-specific transcriptional programs, and
analyzing these graphs systematically using the summary statistics informed by
the algebraic topology. We propose the single-cell topological simplicial
analysis (scTSA). Applying this approach to the single-cell gene expression
profiles from local networks of cells in different developmental stages with
different outcomes reveals a previously unseen topology of cellular ecology.
These networks contain an abundance of cliques of single-cell profiles bound
into cavities that guide the emergence of more complicated habitation forms. We
visualize these ecological patterns with topological simplicial architectures
of these networks, compared with the null models. Benchmarked on the
single-cell RNA-seq data of zebrafish embryogenesis spanning 38,731 cells, 25
cell types and 12 time steps, our approach highlights the gastrulation as the
most critical stage, consistent with consensus in developmental biology. As a
nonlinear, model-independent, and unsupervised framework, our approach can also
be applied to tracing multi-scale cell lineage, identifying critical stages, or
creating pseudo-time series.
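The abstract's core idea (link cells whose expression profiles are similar, then summarize the resulting graph by its cliques) can be illustrated with a minimal sketch. This is not the authors' scTSA implementation: the function names, the Pearson-correlation similarity, the fixed threshold, and the toy profiles are all assumptions made for illustration. In a time-series setting, one might compute the same counts separately for the cells at each time step and compare them across steps.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def similarity_graph(profiles, threshold):
    """Connect two cells when their profile correlation reaches the threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    adj = {i: set() for i in range(len(profiles))}
    for i, j in combinations(range(len(profiles)), 2):
        if pearson(profiles[i], profiles[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_simplices(adj, max_dim=3):
    """Count k-simplices (cliques of k+1 cells) for k = 0..max_dim."""
    counts = [len(adj)]
    current = [(v,) for v in adj]
    for _ in range(max_dim):
        extended = []
        for clique in current:
            # Candidates must be adjacent to every member of the clique;
            # only higher-indexed cells are added so each clique appears once.
            common = set.intersection(*(adj[v] for v in clique))
            extended.extend(clique + (w,) for w in common if w > clique[-1])
        counts.append(len(extended))
        current = extended
    return counts


def euler_characteristic(counts):
    """Alternating sum of simplex counts, a simple topological summary."""
    return sum((-1) ** k * c for k, c in enumerate(counts))


# Toy data: three co-varying cells and one anti-correlated cell.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8.5], [1, 2, 3, 5], [4, 3, 2, 1]]
adj = similarity_graph(profiles, threshold=0.9)
counts = count_simplices(adj, max_dim=2)
print(counts, euler_characteristic(counts))
```

Enumerating cliques this way grows quickly with graph density, so a sketch like this only suits small local networks; larger datasets would need a dedicated topology library.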
Related papers
- Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen [76.02070962797794]
We present Cell Flow for Generation, a flow-based conditional generative model for multi-modal single-cell counts.
Our results suggest improved recovery of crucial biological data characteristics while accounting for novel generative tasks.
arXiv Detail & Related papers (2024-07-16T14:05:03Z)
- FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking [1.6712896227173808]
FlowCyt is the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data.
The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers.
arXiv Detail & Related papers (2024-02-28T15:01:59Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Regression-Based Analysis of Multimodal Single-Cell Data Integration Strategies [0.0]
Multimodal single-cell technologies enable the simultaneous collection of diverse data types from individual cells.
This study highlights the exceptional performance of Echo State Networks, boasting a remarkable correlation score of 0.94.
These findings hold promise for advancing comprehension of cellular differentiation and function, leveraging the potential of Machine Learning.
arXiv Detail & Related papers (2023-11-21T16:31:27Z)
- Mixed Models with Multiple Instance Learning [51.440557223100164]
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL).
Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
arXiv Detail & Related papers (2023-11-04T16:42:42Z)
- Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
- Topology-Guided Multi-Class Cell Context Generation for Digital Pathology [28.43244574309888]
We introduce several mathematical tools from spatial statistics and topological data analysis.
We generate high quality multi-class cell layouts for the first time.
We show that the topology-rich cell layouts can be used for data augmentation and improve the performance of downstream tasks such as cell classification.
arXiv Detail & Related papers (2023-04-05T07:01:34Z)
- Learning Causal Representations of Single Cells via Sparse Mechanism Shift Modeling [3.2435888122704037]
We propose a deep generative model of single-cell gene expression data for which each perturbation is treated as an intervention targeting an unknown, but sparse, subset of latent variables.
We benchmark these methods on simulated single-cell data to evaluate their performance at latent units recovery, causal target identification and out-of-domain generalization.
arXiv Detail & Related papers (2022-11-07T15:47:40Z)
- Granger causal inference on DAGs identifies genomic loci regulating transcription [77.58911272503771]
GrID-Net is a framework based on graph neural networks with lagged message passing for Granger causal inference on DAG-structured systems.
Our application is the analysis of single-cell multimodal data to identify genomic loci that mediate the regulation of specific genes.
arXiv Detail & Related papers (2022-10-18T21:15:10Z)
- A biology-driven deep generative model for cell-type annotation in cytometry [0.0]
We introduce Scyan, a Single-cell Cytometry Network that automatically annotates cell types using only prior expert knowledge.
Scyan significantly outperforms the related state-of-the-art models on multiple public datasets while being faster and interpretable.
In addition, Scyan handles several complementary tasks such as batch-effect removal, debarcoding, and population discovery.
arXiv Detail & Related papers (2022-08-11T10:50:44Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)