Related papers: Predicting fixed-sample test decisions enables anytime-valid inference

Predicting fixed-sample test decisions enables anytime-valid inference

URL: http://arxiv.org/abs/2602.13872v1
Date: Sat, 14 Feb 2026 20:17:51 GMT
Title: Predicting fixed-sample test decisions enables anytime-valid inference
Authors: Chris Holmes, Stephen Walker,
Abstract summary: We introduce a simple procedure that transforms any fixed-sample hypothesis test into an anytime-valid test.<n>We ensure Type-I error control and near-optimal power with substantial sample savings when the null hypothesis is false.<n>In areas such as clinical trials, stopping early and safely can ensure that subjects receive the best treatments and accelerate the development of effective therapies.
Score: 0.3222802562733787
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Statistical hypothesis tests typically use prespecified sample sizes, yet data often arrive sequentially. Interim analyses invalidate classical error guarantees, while existing sequential methods require rigid testing preschedules or incur substantial losses in statistical power. We introduce a simple procedure that transforms any fixed-sample hypothesis test into an anytime-valid test while ensuring Type-I error control and near-optimal power with substantial sample savings when the null hypothesis is false. At each step, the procedure predicts the probability that a classical test would reject the null hypothesis at its fixed-sample size, treating future observations as missing data under the null hypothesis. Thresholding this probability yields an anytime-valid stopping rule. In areas such as clinical trials, stopping early and safely can ensure that subjects receive the best treatments and accelerate the development of effective therapies.

Related papers

Pre-validation Revisited [79.92204034170092]
We show properties and benefits of pre-validation in prediction, inference and error estimation by simulations and applications.<n>We propose not only an analytical distribution of the test statistic for the pre-validated predictor under certain models, but also a generic bootstrap procedure to conduct inference.
arXiv Detail & Related papers (2025-05-21T00:20:14Z)
Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms [12.524536193679124]
We propose internal coherency scores that allow testing for assumption violations and finite sample errors.<n>We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets.
arXiv Detail & Related papers (2025-02-20T16:44:54Z)
Prediction-Powered Causal Inferences [59.98498488132307]
We focus on Prediction-Powered Causal Inferences (PPCI)<n>We first show that conditional calibration guarantees valid PPCI at population level.<n>We then introduce a sufficient representation constraint transferring validity across experiments.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [65.21599711087538]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.<n>Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications.<n>We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing. We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
Near-Optimal Non-Parametric Sequential Tests and Confidence Sequences with Possibly Dependent Observations [44.71254888821376]
We provide the first type-I-error and expected-rejection-time guarantees under general non-data generating processes. We show how to apply our results to inference on parameters defined by estimating equations, such as average treatment effects.
arXiv Detail & Related papers (2022-12-29T18:37:08Z)
Sequential Kernelized Independence Testing [77.237958592189]
We design sequential kernelized independence tests inspired by kernelized dependence measures.<n>We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
Private Sequential Hypothesis Testing for Statisticians: Privacy, Error Rates, and Sample Size [24.149533870085175]
We study the sequential hypothesis testing problem under a slight variant of differential privacy, known as Renyi differential privacy. We present a new private algorithm based on Wald's Sequential Probability Ratio Test (SPRT) that also gives strong theoretical privacy guarantees.
arXiv Detail & Related papers (2022-04-10T04:15:50Z)
Online Control of the False Discovery Rate under "Decision Deadlines" [1.4213973379473654]
Online testing procedures aim to control the extent of false discoveries over a sequence of hypothesis tests. Our method controls the false discovery rate (FDR) at every stage of testing, as well as at adaptively chosen stopping times.
arXiv Detail & Related papers (2021-10-04T17:28:09Z)
Cross-validation Confidence Intervals for Test Error [83.67415139421448]
This work develops central limit theorems for crossvalidation and consistent estimators of its variance under weak stability conditions on the learning algorithm. Results are the first of their kind for the popular choice of leave-one-out cross-validation.
arXiv Detail & Related papers (2020-07-24T17:40:06Z)
Asymptotic Validity and Finite-Sample Properties of Approximate Randomization Tests [2.28438857884398]
Our key theoretical contribution is a non-asymptotic bound on the discrepancy between the size of an approximate randomization test and the size of the original randomization test using noiseless data. We illustrate our theory through several examples, including tests of significance in linear regression.
arXiv Detail & Related papers (2019-08-12T16:09:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.