Falsification of Internal and External Validity in Observational Studies
via Conditional Moment Restrictions
- URL: http://arxiv.org/abs/2301.13133v1
- Date: Mon, 30 Jan 2023 18:16:16 GMT
- Title: Falsification of Internal and External Validity in Observational Studies
via Conditional Moment Restrictions
- Authors: Zeshan Hussain, Ming-Chieh Shih, Michael Oberst, Ilker Demirel, David
Sontag
- Abstract summary: Given data from both an RCT and an observational study, assumptions on internal and external validity have an observable, testable implication.
We show that expressing these CMRs with respect to the causal effect, or "causal contrast", as opposed to individual counterfactual means, provides a more reliable falsification test.
- Score: 6.9347431938654465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomized Controlled Trials (RCTs) are relied upon to assess new treatments,
but suffer from limited power to guide personalized treatment decisions. On the
other hand, observational (i.e., non-experimental) studies have large and
diverse populations, but are prone to various biases (e.g. residual
confounding). To safely leverage the strengths of observational studies, we
focus on the problem of falsification, whereby RCTs are used to validate causal
effect estimates learned from observational data. In particular, we show that,
given data from both an RCT and an observational study, assumptions on internal
and external validity have an observable, testable implication in the form of a
set of Conditional Moment Restrictions (CMRs). Further, we show that expressing
these CMRs with respect to the causal effect, or "causal contrast", as opposed
to individual counterfactual means, provides a more reliable falsification
test. In addition to giving guarantees on the asymptotic properties of our
test, we demonstrate superior power and type I error of our approach on
semi-synthetic and real world datasets. Our approach is interpretable, allowing
a practitioner to visualize which subgroups in the population lead to
falsification of an observational study.
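To make the idea concrete, here is a minimal sketch (in Python) of how a CMR-style falsification check could look: internal and external validity imply E[psi - tau_obs(X) | X] = 0, where psi is an IPW pseudo-outcome built from the RCT whose conditional mean is the causal contrast E[Y(1) - Y(0) | X], and tau_obs is the CATE estimate learned from the observational study. This is not the authors' estimator or test statistic; the function name `falsification_test`, the fixed RCT propensity `e`, and the binned z-tests are illustrative assumptions that stand in for a full conditional moment test.

```python
# Minimal sketch, assuming RCT data (X, A, Y) with a known treatment
# probability e and an observational CATE estimate tau_obs(x).
# Internal + external validity imply E[psi - tau_obs(X) | X] = 0, where
# psi is an IPW pseudo-outcome whose conditional mean in the RCT equals
# the causal contrast E[Y(1) - Y(0) | X].  This is NOT the paper's test
# statistic; it replaces the CMR with simple per-bin z-tests.
import numpy as np
from scipy import stats


def falsification_test(X, A, Y, tau_obs, e=0.5, n_bins=5):
    """Binned z-tests of the moment restriction (X assumed 1-D here)."""
    psi = A * Y / e - (1 - A) * Y / (1 - e)        # IPW pseudo-outcome
    resid = psi - tau_obs(X)                       # should be mean-zero given X
    edges = np.quantile(X, np.linspace(0, 1, n_bins + 1))
    bin_ids = np.digitize(X, edges[1:-1])          # bin index in 0..n_bins-1
    out = []
    for b in range(n_bins):
        r = resid[bin_ids == b]
        z = r.mean() / (r.std(ddof=1) / np.sqrt(len(r)))
        p = 2 * stats.norm.sf(abs(z))              # two-sided p-value
        out.append((edges[b], edges[b + 1], r.mean(), p))
    return out


# Toy data: the true CATE is tau(x) = x; the "observational" estimate is
# deliberately biased by +0.5, so every subgroup should be flagged.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=n)
A = rng.binomial(1, 0.5, size=n)
Y = X * A + rng.normal(size=n)
biased_tau = lambda x: x + 0.5

for lo, hi, m, p in falsification_test(X, A, Y, biased_tau):
    print(f"X in [{lo:+.2f}, {hi:+.2f}]: mean residual {m:+.3f}, p = {p:.2g}")
```

Reporting a residual and p-value per covariate bin loosely mirrors the subgroup-level interpretability described in the abstract: a practitioner can see where the observational estimate disagrees with the RCT.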
Related papers
- Identification of Single-Treatment Effects in Factorial Experiments [0.0]
I show that when multiple interventions are randomized in experiments, the effect any single intervention would have outside the experimental setting is not identified absent heroic assumptions.
Observational studies and factorial experiments provide information about potential-outcome distributions with zero and multiple interventions, respectively.
I show that researchers who rely on this type of design have to either justify linearity of functional forms or specify, with Directed Acyclic Graphs, how variables are related in the real world.
arXiv Detail & Related papers (2024-05-16T04:01:53Z)
- Detecting critical treatment effect bias in small subgroups [11.437076464287822]
We propose a novel strategy to benchmark observational studies beyond the average treatment effect.
First, we design a statistical test for the null hypothesis that the treatment effects estimated from the two studies, conditioned on a set of relevant features, differ up to some tolerance; a toy sketch of such a tolerance test appears after the list of related papers below.
We then estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup in the observational study.
arXiv Detail & Related papers (2024-04-29T17:44:28Z)
- Hidden yet quantifiable: A lower bound for confounding strength using randomized trials [11.437076464287822]
Unobserved confounding can compromise causal conclusions drawn from non-randomized data.
We propose a novel strategy that leverages randomized trials to quantify unobserved confounding.
We show how our lower bound can correctly identify the absence and presence of unobserved confounding in a real-world setting.
arXiv Detail & Related papers (2023-12-06T19:33:34Z)
- Understanding Robust Overfitting from the Feature Generalization Perspective [61.770805867606796]
Adversarial training (AT) constructs robust neural networks by incorporating adversarial perturbations into natural data.
It is plagued by the issue of robust overfitting (RO), which severely damages the model's robustness.
In this paper, we investigate RO from a novel feature generalization perspective.
arXiv Detail & Related papers (2023-10-01T07:57:03Z)
- A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324]
We propose a double machine learning approach to combine experimental and observational studies.
Our framework tests for violations of external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
- Falsification before Extrapolation in Causal Effect Estimation [6.715453431174765]
Causal effects in populations are often estimated using observational datasets.
We propose a meta-algorithm that attempts to reject observational estimates that are biased.
arXiv Detail & Related papers (2022-09-27T21:47:23Z)
- Variational Temporal Deconfounder for Individualized Treatment Effect Estimation from Longitudinal Observational Data [8.347630187110004]
Existing approaches for estimating treatment effects from longitudinal observational data are usually built upon a strong assumption of "unconfoundedness".
We propose the Variational Temporal Deconfounder (VTD), an approach that leverages deep variational embeddings in the longitudinal setting using proxies.
We test our VTD method on both synthetic and real-world clinical data, and the results show that our approach is effective when hidden confounding is the leading bias compared to other existing models.
arXiv Detail & Related papers (2022-07-23T16:43:12Z)
- Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized controlled trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
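As referenced above for "Detecting critical treatment effect bias in small subgroups", here is a toy sketch of a subgroup-level tolerance test. It is not that paper's procedure; it only illustrates the stated null hypothesis H0: |tau_rct - tau_obs| <= delta per subgroup, using a crude normal approximation. The function name `tolerance_test` and all numbers below are hypothetical.

```python
# Toy sketch (not the paper's method): per-subgroup test of
# H0: |tau_rct - tau_obs| <= delta  vs  H1: the bias exceeds the tolerance.
# Uses a crude normal approximation for the difference of the two estimates.
import numpy as np
from scipy import stats


def tolerance_test(tau_rct, se_rct, tau_obs, se_obs, delta):
    """All inputs are arrays over subgroups; returns one p-value per subgroup."""
    diff = np.abs(tau_rct - tau_obs)
    se = np.sqrt(se_rct**2 + se_obs**2)
    z = (diff - delta) / se            # how far the observed gap exceeds delta
    return stats.norm.sf(z)            # small p => subgroup estimate falsified


# Hypothetical subgroup estimates; only the second gap exceeds delta = 0.10.
p = tolerance_test(
    tau_rct=np.array([0.10, 0.25, 0.05]), se_rct=np.array([0.04, 0.05, 0.06]),
    tau_obs=np.array([0.12, 0.60, 0.06]), se_obs=np.array([0.03, 0.04, 0.05]),
    delta=0.10,
)
print(np.round(p, 4))   # expect a tiny p-value only for the second subgroup
```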