Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences
- URL: http://arxiv.org/abs/2502.06343v2
- Date: Fri, 23 May 2025 15:16:33 GMT
- Title: Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences
- Authors: Riccardo Cadei, Ilker Demirel, Piersilvio De Bartolomeis, Lukas Lindorfer, Sylvia Cremer, Cordelia Schmid, Francesco Locatello,
- Abstract summary: We focus on Prediction-Powered Causal Inferences (PPCI)<n> PPCI estimates the treatment effect in a target experiment with unlabeled factual outcomes, retrievable zero-shot from a pre-trained model.<n>We validate our method on synthetic and real-world scientific data, offering solutions to instances not solvable by vanilla Empirical Risk Minimization.
- Score: 56.23412698865433
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In many scientific domains, the cost of data annotation limits the scale and pace of experimentation. Yet, modern machine learning systems offer a promising alternative, provided their predictions yield correct conclusions. We focus on Prediction-Powered Causal Inferences (PPCI), i.e., estimating the treatment effect in a target experiment with unlabeled factual outcomes, retrievable zero-shot from a pre-trained model. We first identify the conditional calibration property to guarantee valid PPCI at population level. Then, we introduce causal lifting, a new causal lifting constraint transferring validity across experiments, which we propose to enforce in practice in Deconfounded Empirical Risk Minimization, our new model-agnostic training objective. We validate our method on synthetic and real-world scientific data, offering solutions to instances not solvable by vanilla Empirical Risk Minimization and invariant training. In particular, we solve zero-shot PPCI on the ISTAnt dataset for the first time, fine-tuning a foundational model on our replica dataset of their ecological experiment with a different recording platform and treatment.
Related papers
- Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments [5.5855749614100825]
This paper addresses the hypothesis that leveraging multiple pre-trained models can mitigate this recall reduction.<n>We formulate the challenge of identifying and managing conflicting predictions from various models as a consistency-based abduction problem.<n>Our results validate the use of consistency-based abduction as an effective mechanism to robustly integrate knowledge from multiple imperfect models in challenging, novel scenarios.
arXiv Detail & Related papers (2025-05-25T23:17:47Z) - Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Learning Counterfactual Outcomes Under Rank Preservation [32.213816786727826]
We propose a principled approach for identifying and estimating the counterfactual outcome.<n>Our theoretical analysis shows that the rank preservation assumption is not stronger than the homogeneity and strict monotonicity assumptions.
arXiv Detail & Related papers (2025-02-10T12:36:57Z) - Adaptive Sampling to Reduce Epistemic Uncertainty Using Prediction Interval-Generation Neural Networks [0.0]
This paper presents an adaptive sampling approach designed to reduce epistemic uncertainty in predictive models.<n>Our primary contribution is the development of a metric that estimates potential epistemic uncertainty.<n>A batch sampling strategy based on Gaussian processes (GPs) is also proposed.<n>We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates.
arXiv Detail & Related papers (2024-12-13T21:21:47Z) - Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks: An Extended Investigation [9.950524371154394]
We propose a new misspecification measure that can be trained in an unsupervised fashion and reliably detects model misspecification at test time.
We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.
arXiv Detail & Related papers (2024-06-05T11:30:16Z) - Prediction-powered Generalization of Causal Inferences [6.43357871718189]
We show how the limited size of trials makes generalization a statistically infeasible task.
We develop generalization algorithms that supplement the trial data with a prediction model learned from an additional observational study.
arXiv Detail & Related papers (2024-06-05T02:44:14Z) - Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation
of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data.
This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis.
To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z) - Calibrating Neural Simulation-Based Inference with Differentiable
Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z) - Intervention Generalization: A View from Factor Graph Models [7.117681268784223]
We take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system.
A postulated $textitinterventional factor model$ (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms.
arXiv Detail & Related papers (2023-06-06T21:44:23Z) - Generative Causal Representation Learning for Out-of-Distribution Motion
Forecasting [13.99348653165494]
We propose Generative Causal Learning Representation to facilitate knowledge transfer under distribution shifts.
While we evaluate the effectiveness of our proposed method in human trajectory prediction models, GCRL can be applied to other domains as well.
arXiv Detail & Related papers (2023-02-17T00:30:44Z) - Causal Inference under Data Restrictions [0.0]
This dissertation focuses on modern causal inference under uncertainty and data restrictions.
It includes applications to neoadjuvant clinical trials, distributed data networks, and robust individualized decision making.
arXiv Detail & Related papers (2023-01-20T20:14:32Z) - Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery [49.084423861263524]
In this work, we propose a novel Gradient-based Intervention Targeting method, abbreviated GIT.
GIT 'trusts' the gradient estimator of a gradient-based causal discovery framework to provide signals for the intervention acquisition function.
We provide extensive experiments in simulated and real-world datasets and demonstrate that GIT performs on par with competitive baselines.
arXiv Detail & Related papers (2022-11-24T17:04:45Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL)
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
arXiv Detail & Related papers (2022-09-26T12:04:49Z) - Efficient Causal Inference from Combined Observational and
Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z) - Improving Maximum Likelihood Training for Text Generation with Density
Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating at large sample space encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z) - Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, i.e., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z) - Variational Learning of Individual Survival Distributions [21.40142425105635]
We introduce a variational time-to-event prediction model, named Variational Survival Inference (VSI), which builds upon recent advances in distribution learning techniques and deep neural networks.
To validate effectiveness of our approach, an extensive set of experiments on both synthetic and real-world datasets is carried out, showing improved performance relative to competing solutions.
arXiv Detail & Related papers (2020-03-09T22:09:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.