Causal machine learning for single-cell genomics
- URL: http://arxiv.org/abs/2310.14935v1
- Date: Mon, 23 Oct 2023 13:35:24 GMT
- Title: Causal machine learning for single-cell genomics
- Authors: Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis
- Abstract summary: We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
- Score: 94.28105176231739
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advances in single-cell omics allow for unprecedented insights into the
transcription profiles of individual cells. When combined with large-scale
perturbation screens, through which specific biological mechanisms can be
targeted, these technologies allow for measuring the effect of targeted
perturbations on the whole transcriptome. These advances provide an opportunity
to better understand the causative role of genes in complex biological
processes such as gene regulation, disease progression or cellular development.
However, the high-dimensional nature of the data, coupled with the intricate
complexity of biological systems, renders this task nontrivial. Within the
machine learning community, there has been a recent increase of interest in
causality, with a focus on adapting established causal techniques and
algorithms to handle high-dimensional data. In this perspective, we delineate
the application of these methodologies within the realm of single-cell genomics
and their challenges. We first present the model that underlies most current
causal approaches to single-cell biology and discuss and challenge the
assumptions it entails from the biological point of view. We then identify open
problems in the application of causal approaches to single-cell data:
generalising to unseen environments, learning interpretable models, and
learning causal models of dynamics. For each problem, we discuss how various
research directions - including the development of computational approaches and
the adaptation of experimental protocols - may offer ways forward, or on the
contrary pose some difficulties. With the advent of single-cell atlases and
increasing perturbation data, we expect causal models to become a crucial tool
for informed experimental design.
Related papers
- How to Build the Virtual Cell with Artificial Intelligence: Priorities and Opportunities [46.671834972945874]
We propose a vision of leveraging advances in AI to construct virtual cells.
We discuss desired capabilities of such AI Virtual Cells, including generating universal representations of biological entities.
We envision a future where AI Virtual Cells help identify new drug targets, predict cellular responses to perturbations, as well as scale hypothesis exploration.
(2024-09-18T02:41:50Z)
- Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
We employ a neural network trained to identify causality through supervised learning on simulated data.
Empirical results demonstrate the effectiveness of our method in identifying causal relationships within large-scale gene regulatory networks.
(2024-08-29T02:21:11Z)
- Optimal Transport for Latent Integration with An Application to Heterogeneous Neuronal Activity Data [1.5311478638611091]
We propose a novel heterogeneous data integration framework based on optimal transport to extract shared patterns in complex biological processes.
Our approach is effective even with a small number of subjects, and does not require auxiliary matching information for the alignment.
(2024-06-27T04:29:21Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
(2023-11-28T09:14:55Z)
- Regression-Based Analysis of Multimodal Single-Cell Data Integration Strategies [0.0]
Multimodal single-cell technologies enable the simultaneous collection of diverse data types from individual cells.
This study highlights the exceptional performance of Echo State Networks, boasting a remarkable correlation score of 0.94.
These findings hold promise for advancing comprehension of cellular differentiation and function, leveraging the potential of Machine Learning.
(2023-11-21T16:31:27Z)
- The CausalBench challenge: A machine learning contest for gene network inference from single-cell perturbation data [18.706823808393402]
The CausalBench Challenge was an initiative to advance the state of the art in constructing gene-gene interaction networks.
The winning solutions significantly improved performance compared to previous baselines.
(2023-08-29T15:54:15Z)
- Learning Causal Representations of Single Cells via Sparse Mechanism Shift Modeling [3.2435888122704037]
We propose a deep generative model of single-cell gene expression data for which each perturbation is treated as an intervention targeting an unknown, but sparse, subset of latent variables.
We benchmark these methods on simulated single-cell data to evaluate their performance at latent units recovery, causal target identification and out-of-domain generalization.
(2022-11-07T15:47:40Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
(2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
(2020-09-02T02:50:30Z)