Modeling variable guide efficiency in pooled CRISPR screens with ContrastiveVI+
- URL: http://arxiv.org/abs/2411.08072v1
- Date: Mon, 11 Nov 2024 19:16:34 GMT
- Title: Modeling variable guide efficiency in pooled CRISPR screens with ContrastiveVI+
- Authors: Ethan Weinberger, Ryan Conrad, Tal Ashuach,
- Abstract summary: ContrastiveVI+ is a generative modeling framework that disentangles perturbation-induced from non-perturbation-related variations.
It is applied to three large-scale Perturb-seq datasets.
- Score: 1.2289361708127877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Genetic screens mediated via CRISPR-Cas9 combined with high-content readouts have emerged as powerful tools for biological discovery. However, computational analyses of these screens come with additional challenges beyond those found with standard scRNA-seq analyses. For example, perturbation-induced variations of interest may be subtle and masked by other dominant sources of variation shared with controls, and variable guide efficiency results in some cells not undergoing genetic perturbation despite expressing a guide RNA. While a number of methods have been developed to address the former problem by explicitly disentangling perturbation-induced variations from those shared with controls, less attention has been paid to the latter problem of noisy perturbation labels. To address this issue, here we propose ContrastiveVI+, a generative modeling framework that both disentangles perturbation-induced from non-perturbation-related variations and infers whether cells truly underwent genomic edits. Applied to three large-scale Perturb-seq datasets, we find that ContrastiveVI+ better recovers known perturbation-induced variations compared to previous methods while successfully identifying cells that escaped the functional consequences of guide RNA expression. An open-source implementation of our model is available at https://github.com/insitro/contrastive_vi_plus.
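The noisy-label problem described in the abstract can be illustrated with a deliberately simplified toy example. The sketch below is not the ContrastiveVI+ model: it assumes a hypothetical one-dimensional latent summary `z` per guide-expressing cell, Gaussian distributions with known means for edited and escaped cells, and a known guide efficiency, and it applies Bayes' rule to estimate the probability that each cell was truly perturbed. The function name `posterior_perturbed` and all numeric values are illustrative assumptions.

```python
import numpy as np


def posterior_perturbed(z, mu_ctrl, mu_pert, sigma, prior_perturbed):
    """Posterior probability that each guide-expressing cell was truly edited.

    Toy two-component Gaussian mixture (an assumption for illustration):
    escaped cells look like controls (mean mu_ctrl), edited cells are
    shifted (mean mu_pert), and both components share the same sigma.
    """
    z = np.asarray(z, dtype=float)
    # Unnormalised log-joint of each component; the shared Gaussian
    # normalising constant cancels in the ratio below.
    log_pert = -0.5 * ((z - mu_pert) / sigma) ** 2 + np.log(prior_perturbed)
    log_esc = -0.5 * ((z - mu_ctrl) / sigma) ** 2 + np.log1p(-prior_perturbed)
    # Bayes' rule written as a logistic function of the log-odds.
    return 1.0 / (1.0 + np.exp(log_esc - log_pert))


# Simulate guide-expressing cells with an assumed 70% guide efficiency.
rng = np.random.default_rng(0)
n_cells = 1000
efficiency = 0.7
edited = rng.random(n_cells) < efficiency
z = np.where(
    edited,
    rng.normal(3.0, 1.0, n_cells),  # edited cells: shifted latent value
    rng.normal(0.0, 1.0, n_cells),  # escaped cells: look like controls
)

probs = posterior_perturbed(
    z, mu_ctrl=0.0, mu_pert=3.0, sigma=1.0, prior_perturbed=efficiency
)
called_edited = probs > 0.5
accuracy = float(np.mean(called_edited == edited))
print(f"fraction of cells classified correctly: {accuracy:.3f}")
```

In this toy setting the component parameters are given. A full generative model would have to learn them jointly with the latent representation, which is the harder problem the abstract addresses.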
Related papers
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions.
We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way.
We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
- Mixed Effects Deep Learning for the interpretable analysis of single cell RNA sequencing data by quantifying and visualizing batch effects [6.596656267996196]
Single-cell RNA sequencing (scRNA-seq) data are often confounded by technical or biological batch effects.
Existing deep learning models mitigate these effects but often discard batch-specific information.
We propose a Mixed Effects Deep Learning (MEDL) autoencoder framework that separately models batch-invariant (fixed effects) and batch-specific (random effects) components.
arXiv Detail & Related papers (2024-11-11T00:10:48Z)
- Identifying perturbation targets through causal differential networks [23.568795598997376]
We propose a causality-inspired approach to identify variables responsible for changes to a biological system.
First, we infer noisy causal graphs from the observational and interventional data.
We then learn to map the differences between these graphs, along with additional statistical features, to sets of variables that were intervened upon.
arXiv Detail & Related papers (2024-10-04T12:48:21Z)
- Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
We employ a neural network trained to identify causality through supervised learning on simulated data.
Empirical results demonstrate the effectiveness of our method in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
- Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen [76.02070962797794]
We present Cell Flow for Generation, a flow-based conditional generative model for multi-modal single-cell counts.
Our results suggest improved recovery of crucial biological data characteristics while accounting for novel generative tasks.
arXiv Detail & Related papers (2024-07-16T14:05:03Z)
- Predicting loss-of-function impact of genetic mutations: a machine learning approach [0.0]
This paper aims to train machine learning models on the attributes of a genetic mutation to predict LoFtool scores.
These attributes included, but were not limited to, the position of a mutation on a chromosome, changes in amino acids, and changes in codons caused by the mutation.
Models were evaluated using five-fold cross-validated averages of r-squared, mean squared error, root mean squared error, mean absolute error, and explained variance.
arXiv Detail & Related papers (2024-01-26T19:27:38Z)
- GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data.
In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
- Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression [70.78523583702209]
We study training instabilities of behavior cloning with deep neural networks.
We observe that minibatch SGD updates to the policy network during training result in sharp oscillations in long-horizon rewards.
arXiv Detail & Related papers (2023-10-17T17:39:40Z)
- DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets [81.75973217676986]
Gene regulatory networks (GRN) describe interactions between genes and their products that control gene expression and cellular function.
Existing methods focus either on challenge (1), identifying cyclic structure from dynamics, or on challenge (2), learning complex Bayesian posteriors over DAGs, but not both.
In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.
arXiv Detail & Related papers (2023-02-08T16:36:40Z)
- Score-based Causal Representation Learning with Interventions [54.735484409244386]
This paper studies the causal representation learning problem when latent causal variables are observed indirectly.
The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables.
arXiv Detail & Related papers (2023-01-19T18:39:48Z)
- Learning Causal Representations of Single Cells via Sparse Mechanism Shift Modeling [3.2435888122704037]
We propose a deep generative model of single-cell gene expression data for which each perturbation is treated as an intervention targeting an unknown, but sparse, subset of latent variables.
We benchmark these methods on simulated single-cell data to evaluate their performance at latent units recovery, causal target identification and out-of-domain generalization.
arXiv Detail & Related papers (2022-11-07T15:47:40Z)
- InForecaster: Forecasting Influenza Hemagglutinin Mutations Through the Lens of Anomaly Detection [3.5213888068272197]
Anomaly detection (AD) is a well-established field in Machine Learning (ML).
We propose to tackle this challenge through anomaly detection (AD).
We conduct a large number of experiments on four publicly available datasets.
arXiv Detail & Related papers (2022-10-25T02:08:09Z)
- Gene Regulatory Network Inference with Latent Force Models [1.2691047660244335]
Delays in protein synthesis cause a confounding effect when constructing Gene Regulatory Networks (GRNs) from RNA-sequencing time-series data.
We present a model which incorporates translation delays by combining mechanistic equations and Bayesian approaches to fit to experimental data.
arXiv Detail & Related papers (2020-10-06T09:03:34Z)
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
arXiv Detail & Related papers (2020-09-24T19:34:37Z)