MapPFN: Learning Causal Perturbation Maps in Context
- URL: http://arxiv.org/abs/2601.21092v1
- Date: Wed, 28 Jan 2026 22:28:06 GMT
- Title: MapPFN: Learning Causal Perturbation Maps in Context
- Authors: Marvin Sextro, Weronika Kłos, Gabriel Dernbach,
- Abstract summary: We present MapPFN, a prior-data fitted network (PFN) pretrained on synthetic data generated from a prior over causal perturbations. Given a set of experiments, MapPFN uses in-context learning to predict post-perturbation distributions, without gradient-based optimization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Planning effective interventions in biological systems requires treatment-effect models that adapt to unseen biological contexts by identifying their specific underlying mechanisms. Yet single-cell perturbation datasets span only a handful of biological contexts, and existing methods cannot leverage new interventional evidence at inference time to adapt beyond their training data. To meta-learn a perturbation effect estimator, we present MapPFN, a prior-data fitted network (PFN) pretrained on synthetic data generated from a prior over causal perturbations. Given a set of experiments, MapPFN uses in-context learning to predict post-perturbation distributions, without gradient-based optimization. Despite being pretrained on in silico gene knockouts alone, MapPFN identifies differentially expressed genes, matching the performance of models trained on real single-cell data. Our code and data are available at https://github.com/marvinsxtr/MapPFN.
Related papers
- Nonparametric Data Attribution for Diffusion Models [57.820618036556084]
Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. We propose a nonparametric attribution method that operates entirely on data, measuring influence via patch-level similarity between generated and training images.
arXiv Detail & Related papers (2025-10-16T03:37:16Z) - Efficient Data Selection for Training Genomic Perturbation Models [32.968559353907004]
We focus on graph neural network-based gene perturbation models. We propose a subset selection method that, unlike active learning, selects the training perturbations in one shot.
arXiv Detail & Related papers (2025-03-18T12:52:03Z) - NeuroADDA: Active Discriminative Domain Adaptation in Connectomic [3.241925400160274]
We introduce NeuroADDA, a method that combines optimal domain selection with source-free active learning to adapt pretrained backbones to a new dataset. NeuroADDA consistently outperforms training from scratch across diverse datasets and fine-tuning sample sizes.
arXiv Detail & Related papers (2025-03-08T12:40:30Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions [4.993565079216378]
We use a greedy algorithm called GnIES to recover the equivalence class of the data-generating model without knowledge of the intervention targets. In addition, we develop a novel procedure to generate semi-synthetic data sets with known causal ground truth but distributions closely resembling those of a real data set of choice.
arXiv Detail & Related papers (2022-11-27T17:37:21Z) - Inferring probabilistic Boolean networks from steady-state gene data samples [0.6882042556551611]
We present a method for inferring PBNs directly from real gene expression data measurements taken when the system was at a steady state.
The proposed approach does not rely on reconstructing the state evolution of the network.
We demonstrate the method on samples of real gene expression profiling data from a well-known study on metastatic melanoma.
arXiv Detail & Related papers (2022-11-11T00:39:00Z) - CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance guided stochastic gradient descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics, with neural networks, in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z) - Inference of cell dynamics on perturbation data using adjoint sensitivity [4.606583317143614]
Data-driven dynamic models of cell biology can be used to predict cell response to unseen perturbations.
Recent work has demonstrated the derivation of interpretable models with explicit interaction terms.
This work aims to extend the range of applicability of this model inference approach to a diversity of biological systems.
arXiv Detail & Related papers (2021-04-13T19:15:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.