TxPert: Leveraging Biochemical Relationships for Out-of-Distribution Transcriptomic Perturbation Prediction
- URL: http://arxiv.org/abs/2505.14919v1
- Date: Tue, 20 May 2025 21:13:23 GMT
- Title: TxPert: Leveraging Biochemical Relationships for Out-of-Distribution Transcriptomic Perturbation Prediction
- Authors: Frederik Wenkel, Wilson Tu, Cassandra Masschelein, Hamed Shirzad, Cian Eastwood, Shawn T. Whitfield, Ihab Bendidi, Craig Russell, Liam Hodgson, Yassir El Mesbahi, Jiarui Ding, Marta M. Fay, Berton Earnshaw, Emmanuel Noutahi, Alisandra K. Denton,
- Abstract summary: We present TxPert, a new state-of-the-art method that leverages multiple biological knowledge networks to predict responses under OOD scenarios. We also present an in-depth analysis demonstrating the impact of graphs, model architecture, and data on performance, and an expanded benchmarking framework that strengthens evaluation standards for perturbation modeling.
- Score: 11.083533122552396
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurately predicting cellular responses to genetic perturbations is essential for understanding disease mechanisms and designing effective therapies. Yet exhaustively exploring the space of possible perturbations (e.g., multi-gene perturbations or across tissues and cell types) is prohibitively expensive, motivating methods that can generalize to unseen conditions. In this work, we explore how knowledge graphs of gene-gene relationships can improve out-of-distribution (OOD) prediction across three challenging settings: unseen single perturbations; unseen double perturbations; and unseen cell lines. In particular, we present: (i) TxPert, a new state-of-the-art method that leverages multiple biological knowledge networks to predict transcriptional responses under OOD scenarios; (ii) an in-depth analysis demonstrating the impact of graphs, model architecture, and data on performance; and (iii) an expanded benchmarking framework that strengthens evaluation standards for perturbation modeling.
Related papers
- Modeling Gene Expression Distributional Shifts for Unseen Genetic Perturbations [44.619690829431214]
We train a neural network to predict distributional responses in gene expression following genetic perturbations.
Our model predicts gene-level histograms conditioned on perturbations and outperforms baselines in capturing higher-order statistics.
arXiv Detail & Related papers (2025-07-01T06:04:28Z)
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions.
We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way.
We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
- DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts [76.59606029593085]
DisProtBench is a benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions.
DisProtBench spans three key axes: data complexity, task diversity, and interpretability.
Results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures.
arXiv Detail & Related papers (2025-06-18T23:58:22Z)
- Contextualizing biological perturbation experiments through language [3.704686482174365]
PerturbQA is a benchmark for structured reasoning over perturbation experiments.
We evaluate state-of-the-art machine learning and statistical approaches for modeling perturbations.
As a proof of feasibility, we introduce Summer (SUMMarize, retrievE, and answeR), a simple, domain-informed LLM framework.
arXiv Detail & Related papers (2025-02-28T18:15:31Z)
- Generative Intervention Models for Causal Perturbation Modeling [80.72074987374141]
In many applications, it is a priori unknown which mechanisms of a system are modified by an external perturbation.
We propose a generative intervention model (GIM) that learns to map these perturbation features to distributions over atomic interventions.
arXiv Detail & Related papers (2024-11-21T10:37:57Z)
- Identifying perturbation targets through causal differential networks [23.568795598997376]
We propose a causality-inspired approach to identify variables responsible for changes to a biological system.
First, we infer noisy causal graphs from the observational and interventional data.
We then learn to map the differences between these graphs, along with additional statistical features, to sets of variables that were intervened upon.
arXiv Detail & Related papers (2024-10-04T12:48:21Z)
- Large-Scale Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
By employing a local-inference strategy, our approach scales with linear complexity in the number of variables, efficiently scaling up to thousands of variables.
Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
- A deep graph model for the signed interaction prediction in biological network [1.03121181235382]
Predicting signed interactions in biological networks is crucial for understanding drug mechanisms and facilitating drug repurposing.
RGCNTD is designed to predict both polar (e.g. activation, inhibition) and non-polar (e.g. binding, affect) chemical-gene interactions.
We introduce new evaluation metrics, AUC_polarity and CP@500, to assess the model's ability to differentiate interaction types.
arXiv Detail & Related papers (2024-07-10T04:28:21Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges.
We first present the model that underlies most current causal approaches to single-cell biology.
We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
- CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z)
- Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information [28.4434795102787]
We propose a graph variational Bayesian causal inference framework to predict a cell's gene expressions under counterfactual perturbations.
We leverage information representing biological knowledge in the form of gene regulatory networks to aid individualized cellular response predictions.
arXiv Detail & Related papers (2022-09-30T22:13:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.