Benchmarking Observational Studies with Experimental Data under
Right-Censoring
- URL: http://arxiv.org/abs/2402.15137v1
- Date: Fri, 23 Feb 2024 06:44:13 GMT
- Title: Benchmarking Observational Studies with Experimental Data under
Right-Censoring
- Authors: Ilker Demirel, Edward De Brouwer, Zeshan Hussain, Michael Oberst,
Anthony Philippakis and David Sontag
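- Abstract summary: We consider two cases: censoring time is either independent of time-to-event, or depends on time-to-event the same way in the observational study (OS) and the randomized controlled trial (RCT).
In the second case, we show that the same equivalence test of conditional average treatment effects (CATEs) can still be used even though unbiased CATE estimation may not be possible.
We verify the effectiveness of our censoring-aware tests via semi-synthetic experiments and analyze RCT and OS data from the Women's Health Initiative study.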
- Score: 18.768537827004536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Drawing causal inferences from observational studies (OS) requires
unverifiable validity assumptions; however, one can falsify those assumptions
by benchmarking the OS with experimental data from a randomized controlled
trial (RCT). A major limitation of existing procedures is not accounting for
censoring, despite the abundance of RCTs and OSes that report right-censored
time-to-event outcomes. We consider two cases where censoring time (1) is
independent of time-to-event and (2) depends on time-to-event the same way in
OS and RCT. For the former, we adopt a censoring-doubly-robust signal for the
conditional average treatment effect (CATE) to facilitate an equivalence test
of CATEs in OS and RCT, which serves as a proxy for testing if the validity
assumptions hold. For the latter, we show that the same test can still be used
even though unbiased CATE estimation may not be possible. We verify the
effectiveness of our censoring-aware tests via semi-synthetic experiments and
analyze RCT and OS data from the Women's Health Initiative study.
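
As a rough illustration of why censoring must be accounted for in case (1), the sketch below builds a simple inverse-probability-of-censoring-weighted (IPCW) signal for the horizon-restricted event time min(T, h). This is a simpler stand-in, not the censoring-doubly-robust signal the abstract refers to. The function names, the use of a Kaplan-Meier estimate for the censoring distribution, the exponential simulation, and the choice of min(T, h) as the outcome are all assumptions made for illustration.

```python
import bisect
import math
import random


def km_censoring_left_limit(times, events):
    """Kaplan-Meier estimate of the censoring survival function.

    Censoring plays the role of the "event" here. Returns a function
    giving the left limit G(t-) = P(C >= t). Assumes no tied times.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    jump_times, jump_values = [], []
    for t, d in data:
        if d == 0:  # a censoring occurs at time t
            surv *= 1.0 - 1.0 / at_risk
            jump_times.append(t)
            jump_values.append(surv)
        at_risk -= 1

    def g_minus(t):
        k = bisect.bisect_left(jump_times, t)  # jumps strictly before t
        return jump_values[k - 1] if k > 0 else 1.0

    return g_minus


def ipcw_signal(times, events, horizon):
    """IPCW pseudo-outcome whose mean targets E[min(T, horizon)].

    Valid only under the assumption that censoring is independent of
    the event time (case 1 in the abstract).
    """
    g_minus = km_censoring_left_limit(times, events)
    signal = []
    for t, d in zip(times, events):
        u = min(t, horizon)
        # min(T, horizon) is known if the event was seen or follow-up reached the horizon
        known = d == 1 or t >= horizon
        signal.append(u / g_minus(u) if known else 0.0)
    return signal


def simulate(n, seed, event_rate=1.0, censor_rate=0.5):
    """Hypothetical data: independent exponential event and censoring times."""
    rng = random.Random(seed)
    times, events = [], []
    for _ in range(n):
        t = rng.expovariate(event_rate)
        c = rng.expovariate(censor_rate)
        times.append(min(t, c))
        events.append(1 if t <= c else 0)
    return times, events


if __name__ == "__main__":
    horizon = 2.0
    times, events = simulate(n=5000, seed=0)
    signal = ipcw_signal(times, events, horizon)
    naive = [min(t, horizon) for t in times]
    print("IPCW mean: ", sum(signal) / len(signal))
    print("naive mean:", sum(naive) / len(naive))
    print("target:    ", 1.0 - math.exp(-horizon))
```

In this simulation the naive average of the observed (censored) times falls well below the target, while the IPCW average lands close to it. In a benchmarking setting one would compute such a signal separately in the RCT and the OS before comparing treatment effect estimates; that comparison step is not shown here.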
Related papers
- Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms [12.524536193679124]
We propose internal coherency scores that allow testing for assumption violations and finite sample errors.
We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets.
arXiv Detail & Related papers (2025-02-20T16:44:54Z)
- On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning [49.17494657762375]
Test-time adaptation (TTA) updates the model weights during the inference stage using testing data to enhance generalization.
Existing studies have shown that when TTA is updated with crafted adversarial test samples, the performance on benign samples can deteriorate.
We propose an effective and realistic attack method that better produces poisoned samples without access to benign samples.
arXiv Detail & Related papers (2024-10-07T01:29:19Z)
- Mitigating LLM Hallucinations via Conformal Abstention [70.83870602967625]
We develop a principled procedure for determining when a large language model should abstain from responding in a general domain.
We leverage conformal prediction techniques to develop an abstention procedure that benefits from rigorous theoretical guarantees on the hallucination rate (error rate).
Experimentally, our resulting conformal abstention method reliably bounds the hallucination rate on various closed-book, open-domain generative question answering datasets.
arXiv Detail & Related papers (2024-04-04T11:32:03Z)
- CenTime: Event-Conditional Modelling of Censoring in Survival Analysis [49.44664144472712]
We introduce CenTime, a novel approach to survival analysis that directly estimates the time to event.
Our method features an innovative event-conditional censoring mechanism that performs robustly even when uncensored data is scarce.
Our results indicate that CenTime offers state-of-the-art performance in predicting time-to-death while maintaining comparable ranking performance.
arXiv Detail & Related papers (2023-09-07T17:07:33Z)
- A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324]
We propose a double machine learning approach to combine experimental and observational studies.
Our framework tests for violations of external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
- Falsification of Internal and External Validity in Observational Studies via Conditional Moment Restrictions [6.9347431938654465]
Given data from both an RCT and an observational study, assumptions on internal and external validity have an observable, testable implication.
We show that expressing these CMRs with respect to the causal effect, or "causal contrast", as opposed to individual counterfactual means, provides a more reliable falsification test.
arXiv Detail & Related papers (2023-01-30T18:16:16Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Falsification before Extrapolation in Causal Effect Estimation [6.715453431174765]
Causal effects in populations are often estimated using observational datasets.
We propose a meta-algorithm that attempts to reject observational estimates that are biased.
arXiv Detail & Related papers (2022-09-27T21:47:23Z)
- Correct block-design experiments mitigate temporal correlation bias in EEG classification [68.85562949901077]
We show that the main claim in [1] is drastically overstated and their other analyses are seriously flawed by wrong methodological choices.
We investigate the influence of EEG temporal correlation on classification accuracy by testing the same models in two additional experimental settings.
arXiv Detail & Related papers (2020-11-25T22:25:21Z)
- A kernel test for quasi-independence [24.127106529428335]
We consider settings in which the data of interest correspond to pairs of ordered times.
It is still of interest to determine whether there exists significant dependence beyond their ordering in time.
We propose a nonparametric statistical test of quasi-independence.
arXiv Detail & Related papers (2020-11-17T22:42:45Z)
- Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data [24.442094864838225]
We propose a collection of kernelized Stein discrepancy tests for time-to-event data.
Our experimental results show that our proposed methods perform better than existing tests.
arXiv Detail & Related papers (2020-08-19T12:27:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.