Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data
- URL: http://arxiv.org/abs/2511.00727v1
- Date: Sat, 01 Nov 2025 22:24:16 GMT
- Title: Cross-Validated Causal Inference: a Modern Method to Combine Experimental and Observational Data
- Authors: Xuelin Yang, Licong Lin, Susan Athey, Michael I. Jordan, Guido W. Imbens
- Abstract summary: We develop new methods to integrate experimental and observational data in causal inference. A full model containing the causal parameter is obtained by minimizing a weighted combination of experimental and observational losses. Experiments on real and synthetic data show the efficacy and reliability of our method.
- Score: 48.72384067821617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop new methods to integrate experimental and observational data in causal inference. While randomized controlled trials offer strong internal validity, they are often costly and therefore limited in sample size. Observational data, though cheaper and often with larger sample sizes, are prone to biases due to unmeasured confounders. To harness their complementary strengths, we propose a systematic framework that formulates causal estimation as an empirical risk minimization (ERM) problem. A full model containing the causal parameter is obtained by minimizing a weighted combination of experimental and observational losses, capturing the causal parameter's validity and the full model's fit, respectively. The weight is chosen through cross-validation on the causal parameter across experimental folds. Our experiments on real and synthetic data show the efficacy and reliability of our method. We also provide theoretical non-asymptotic error bounds.
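The weighted-loss idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear model `Y = a + tau * W`, squared-error losses on both data sources, the synthetic data, the grid of weights, and the validation criterion (squared gap between the fitted `tau` and the held-out experimental difference in means). The helper names `design`, `fit_weighted`, `diff_in_means`, and `select_lambda` are hypothetical.

```python
import numpy as np


def design(w):
    """Design matrix [1, W] for the illustrative linear model Y = a + tau * W."""
    return np.column_stack([np.ones(len(w)), w])


def fit_weighted(w_exp, y_exp, w_obs, y_obs, lam):
    """Minimize (1 - lam) * MSE_exp + lam * MSE_obs over (a, tau) in closed form."""
    X_e, X_o = design(w_exp), design(w_obs)
    c_e = (1.0 - lam) / len(y_exp)
    c_o = lam / len(y_obs)
    A = c_e * X_e.T @ X_e + c_o * X_o.T @ X_o
    b = c_e * X_e.T @ y_exp + c_o * X_o.T @ y_obs
    return np.linalg.solve(A, b)  # array (a_hat, tau_hat)


def diff_in_means(w, y):
    """Treated-minus-control mean outcome; unbiased for tau under randomization."""
    return y[w == 1].mean() - y[w == 0].mean()


def select_lambda(w_exp, y_exp, w_obs, y_obs, lambdas, n_folds=5, seed=0):
    """Pick the weight by cross-validation over folds of the experimental data.

    Assumed criterion: squared gap between tau fitted without the held-out
    fold and the difference in means computed on that held-out fold.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y_exp)), n_folds)
    scores = []
    for lam in lambdas:
        errs = []
        for held in folds:
            train = np.setdiff1d(np.arange(len(y_exp)), held)
            _, tau_hat = fit_weighted(w_exp[train], y_exp[train], w_obs, y_obs, lam)
            errs.append((tau_hat - diff_in_means(w_exp[held], y_exp[held])) ** 2)
        scores.append(float(np.mean(errs)))
    return lambdas[int(np.argmin(scores))], scores


# Synthetic data (assumed): small randomized trial, large confounded observational set.
rng = np.random.default_rng(1)
n_exp, n_obs, tau = 200, 5000, 2.0
w_exp = rng.integers(0, 2, size=n_exp)
y_exp = 1.0 + tau * w_exp + rng.normal(size=n_exp)
u = rng.normal(size=n_obs)  # unmeasured confounder
w_obs = (u + rng.normal(size=n_obs) > 0).astype(int)
y_obs = 1.0 + tau * w_obs + 1.5 * u + rng.normal(size=n_obs)

lambdas = np.linspace(0.0, 1.0, 11)
best_lam, cv_scores = select_lambda(w_exp, y_exp, w_obs, y_obs, lambdas)
a_hat, tau_hat = fit_weighted(w_exp, y_exp, w_obs, y_obs, best_lam)
print(f"selected weight: {best_lam:.1f}, estimated tau: {tau_hat:.3f}")
```

In this sketch, a weight of 0 reproduces the experimental difference in means and a weight of 1 reproduces the (confounded) observational one, so the cross-validated weight trades the trial's variance against the observational bias.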
Related papers
- Prediction-Powered Causal Inferences [59.98498488132307]
We focus on Prediction-Powered Causal Inferences (PPCI). We first show that conditional calibration guarantees valid PPCI at population level. We then introduce a sufficient representation constraint transferring validity across experiments.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- Identification of Single-Treatment Effects in Factorial Experiments [0.0]
I show that when multiple interventions are randomized in experiments, the effect any single intervention would have outside the experimental setting is not identified absent heroic assumptions.
Observational studies and factorial experiments provide information about potential-outcome distributions with zero and multiple interventions.
I show that researchers who rely on this type of design have to justify either linearity of functional forms or specify with Directed Acyclic Graphs how variables are related in the real world.
arXiv Detail & Related papers (2024-05-16T04:01:53Z)
- Valid causal inference with unobserved confounding in high-dimensional settings [0.0]
We show how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models.
We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small.
arXiv Detail & Related papers (2024-01-12T13:21:20Z)
- Simultaneous inference for generalized linear models with unmeasured confounders [0.0]
We propose a unified statistical estimation and inference framework that harnesses structures and integrates linear projections into three key stages. We show effective Type-I error control of $z$-tests as sample and response sizes approach infinity.
arXiv Detail & Related papers (2023-09-13T18:53:11Z)
- A Double Machine Learning Approach to Combining Experimental and Observational Data [58.05402364136958]
We propose a double machine learning approach to combine experimental and observational studies. Our framework proposes a falsification test for external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
- Reconstructing Graph Diffusion History from a Single Snapshot [87.20550495678907]
We propose a novel barycenter formulation for reconstructing Diffusion history from A single SnapsHot (DASH).
We prove that estimation error of diffusion parameters is unavoidable due to NP-hardness of diffusion parameter estimation.
We also develop an effective solver named DIffusion hiTting Times with Optimal proposal (DITTO).
arXiv Detail & Related papers (2023-06-01T09:39:32Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.