Breaking hypothesis testing for failure rates
- URL: http://arxiv.org/abs/2001.04045v1
- Date: Mon, 13 Jan 2020 03:17:30 GMT
- Title: Breaking hypothesis testing for failure rates
- Authors: Rohit Pandey, Yingnong Dang, Gil Lapid Shafriri, Murali Chintalapati,
Aerin Kim
- Abstract summary: We describe the utility of point processes and failure rates and the most common process for modeling failure rates, the Poisson point process.
A common argument against using this test is that real world data rarely follows the Poisson point process.
We investigate what happens when the distributional assumptions of tests like these are violated but the test is still applied.
- Score: 7.973062022996845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We describe the utility of point processes and failure rates and the most
common point process for modeling failure rates, the Poisson point process.
Next, we describe the uniformly most powerful test for comparing the rates of
two Poisson point processes for a one-sided test (henceforth referred to as the
"rate test"). A common argument against using this test is that real world data
rarely follows the Poisson point process. We thus investigate what happens when
the distributional assumptions of tests like these are violated but the test is
still applied. We find a non-pathological example (using the rate test on a
Compound Poisson distribution with Binomial compounding) where violating the
distributional assumptions of the rate test makes it perform better (lower error
rates). We also find that if we replace the distribution of the test statistic
under the null hypothesis with any other arbitrary distribution, the
performance of the test (described in terms of the false negative rate to false
positive rate trade-off) remains exactly the same. Next, we compare the
performance of the rate test to a version of the Wald test customized to the
Negative Binomial point process and find it to perform very similarly while
being much more general and versatile. Finally, we discuss the applications to
Microsoft Azure. The code for all experiments performed is open source and
linked in the introduction.
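The one-sided rate test the abstract describes can be carried out by conditioning on the total event count: given counts over two exposure intervals, the count from the first process is binomially distributed under equal rates. A minimal sketch (the function name and argument layout are ours, not from the paper's open-source code):

```python
from math import comb

def rate_test_pvalue(n1, t1, n2, t2):
    """One-sided test of H0: lam1 <= lam2 vs H1: lam1 > lam2, where n1 events
    were observed over exposure t1 and n2 events over exposure t2.

    Conditional on the total n1 + n2, under lam1 == lam2 the count n1 is
    Binomial(n1 + n2, t1 / (t1 + t2)); the p-value is the upper tail at n1.
    """
    n = n1 + n2
    p0 = t1 / (t1 + t2)
    # P(X >= n1) for X ~ Binomial(n, p0)
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n1, n + 1))
```

This conditional (binomial) construction is the classical way to compare two Poisson rates; for example, `rate_test_pvalue(10, 1.0, 2, 1.0)` yields a small p-value, consistent with the first process having the higher rate.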
Related papers
- On Robust hypothesis testing with respect to Hellinger distance [0.0]
We study the hypothesis testing problem where the observed samples need not come from either of the specified hypotheses. In such a situation, we would like our test to be robust to this misspecification and output the distribution closer in Hellinger distance. Our main result quantifies how close the underlying distribution has to be to either of the hypotheses.
arXiv Detail & Related papers (2025-10-19T08:20:43Z)
- A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables by binarizing the observed data. Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z)
- $t$-Testing the Waters: Empirically Validating Assumptions for Reliable A/B-Testing [3.988614978933934]
A/B-tests are a cornerstone of experimental design on the web, with wide-ranging applications and use-cases.
We propose a practical, efficient method to empirically test whether the $t$-test's assumptions are met, and hence whether the A/B-test is valid.
arXiv Detail & Related papers (2025-02-07T09:55:24Z)
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute $u$ and a non-class attribute $b$.
We propose to mitigate dataset bias by either weighting the objective of each sample $n$ by $\frac{1}{p(u_n|b_n)}$ or sampling that sample with probability proportional to $\frac{1}{p(u_n|b_n)}$.
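For discrete attributes, the reweighting idea summarized above can be sketched by estimating p(u|b) from empirical counts (the helper name and count-based estimator are our illustration, not the paper's implementation):

```python
from collections import Counter

def bias_weights(u, b):
    """Per-sample weights 1 / p(u_n | b_n), with the conditional probability
    estimated from empirical counts.

    u, b: parallel lists of discrete class / non-class attribute values.
    """
    joint = Counter(zip(u, b))   # counts of (u_n, b_n) pairs
    marg_b = Counter(b)          # counts of b_n values
    # p(u_n | b_n) = joint[(u_n, b_n)] / marg_b[b_n], so the weight inverts it
    return [marg_b[b_n] / joint[(u_n, b_n)] for u_n, b_n in zip(u, b)]
```

Samples whose class value is rare given their non-class attribute receive larger weights, counteracting the correlation between the two attributes.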
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Model-agnostic out-of-distribution detection using combined statistical tests [15.27980070479021]
We present simple methods for out-of-distribution detection using a trained generative model.
We combine a classical parametric test (Rao's score test) with the recently introduced typicality test.
Despite their simplicity and generality, these methods can be competitive with model-specific out-of-distribution detection algorithms.
arXiv Detail & Related papers (2022-03-02T13:32:09Z)
- Significance tests of feature relevance for a blackbox learner [6.72450543613463]
We derive two consistent tests for the feature relevance of a blackbox learner.
The first evaluates a loss difference with perturbation on an inference sample.
The second splits the inference sample into two but does not require data perturbation.
arXiv Detail & Related papers (2021-03-02T00:59:19Z)
- Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)
- Testing Goodness of Fit of Conditional Density Models with Kernels [16.003516725803774]
We propose two nonparametric statistical tests of goodness of fit for conditional distributions.
We show that our tests are consistent against any fixed alternative conditional model.
We demonstrate the interpretability of our test on a task of modeling the distribution of New York City's taxi drop-off location.
arXiv Detail & Related papers (2020-02-24T14:04:37Z)
- The Chi-Square Test of Distance Correlation [7.748852202364896]
The chi-square test is non-parametric, extremely fast, and applicable to bias-corrected distance correlation using any strong negative type metric or characteristic kernel.
We show that the underlying chi-square distribution well approximates and dominates the limiting null distribution in the upper tail, and prove that the chi-square test is valid and consistent for testing independence.
arXiv Detail & Related papers (2019-12-27T15:16:40Z)
- Asymptotic Validity and Finite-Sample Properties of Approximate Randomization Tests [2.28438857884398]
Our key theoretical contribution is a non-asymptotic bound on the discrepancy between the size of an approximate randomization test and the size of the original randomization test using noiseless data.
We illustrate our theory through several examples, including tests of significance in linear regression.
arXiv Detail & Related papers (2019-08-12T16:09:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.