Asymptotic Validity and Finite-Sample Properties of Approximate Randomization Tests
- URL: http://arxiv.org/abs/1908.04218v3
- Date: Mon, 22 Apr 2024 10:33:54 GMT
- Title: Asymptotic Validity and Finite-Sample Properties of Approximate Randomization Tests
- Authors: Panos Toulis,
- Abstract summary: Our key theoretical contribution is a non-asymptotic bound on the discrepancy between the size of an approximate randomization test and the size of the original randomization test using noiseless data.
We illustrate our theory through several examples, including tests of significance in linear regression.
- Score: 2.28438857884398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Randomization tests rely on simple data transformations and possess an appealing robustness property. In addition to being finite-sample valid if the data distribution is invariant under the transformation, these tests can be asymptotically valid under a suitable studentization of the test statistic, even if the invariance does not hold. However, practical implementation often encounters noisy data, resulting in approximate randomization tests that may not be as robust. In this paper, our key theoretical contribution is a non-asymptotic bound on the discrepancy between the size of an approximate randomization test and the size of the original randomization test using noiseless data. This allows us to derive novel conditions for the validity of approximate randomization tests under data invariances, while being able to leverage existing results based on studentization if the invariance does not hold. We illustrate our theory through several examples, including tests of significance in linear regression. Our theory can explain certain aspects of how randomization tests perform in small samples, addressing limitations of prior theoretical results.
Related papers
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one allows to account not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z) - Bootstrapped Edge Count Tests for Nonparametric Two-Sample Inference
Under Heterogeneity [5.8010446129208155]
We develop a new nonparametric testing procedure that accurately detects differences between the two samples.
A comprehensive simulation study and an application to detecting user behaviors in online games demonstrates the excellent non-asymptotic performance of the proposed test.
arXiv Detail & Related papers (2023-04-26T22:25:44Z) - Model-Free Sequential Testing for Conditional Independence via Testing
by Betting [8.293345261434943]
The proposed test allows researchers to analyze an incoming i.i.d. data stream with any arbitrary dependency structure.
We allow the processing of data points online as soon as they arrive and stop data acquisition once significant results are detected.
arXiv Detail & Related papers (2022-10-01T20:05:33Z) - Sequential Permutation Testing of Random Forest Variable Importance
Measures [68.8204255655161]
It is proposed here to use sequential permutation tests and sequential p-value estimation to reduce the high computational costs associated with conventional permutation tests.
The results of simulation studies confirm that the theoretical properties of the sequential tests apply.
The numerical stability of the methods is investigated in two additional application studies.
arXiv Detail & Related papers (2022-06-02T20:16:50Z) - Private Sequential Hypothesis Testing for Statisticians: Privacy, Error
Rates, and Sample Size [24.149533870085175]
We study the sequential hypothesis testing problem under a slight variant of differential privacy, known as Renyi differential privacy.
We present a new private algorithm based on Wald's Sequential Probability Ratio Test (SPRT) that also gives strong theoretical privacy guarantees.
arXiv Detail & Related papers (2022-04-10T04:15:50Z) - Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z) - Least Squares Estimation Using Sketched Data with Heteroskedastic Errors [0.0]
We show that estimates using data sketched by random projections will behave as if the errors were homoskedastic.
Inference, including first-stage F tests for instrument relevance, can be simpler than the full sample case if the sketching scheme is appropriately chosen.
arXiv Detail & Related papers (2020-07-15T15:58:27Z) - Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction.
We propose a conditional independence test based algorithm to separate causal variables with a seed variable as priori, and adopt them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z) - Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.