Lower bounds in multiple testing: A framework based on derandomized
proxies
- URL: http://arxiv.org/abs/2005.03725v1
- Date: Thu, 7 May 2020 19:59:51 GMT
- Title: Lower bounds in multiple testing: A framework based on derandomized
proxies
- Authors: Max Rabinovich and Michael I. Jordan and Martin J. Wainwright
- Abstract summary: This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
- Score: 107.69746750639584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large bulk of work in multiple testing has focused on specifying
procedures that control the false discovery rate (FDR), with relatively less
attention being paid to the corresponding Type II error known as the false
non-discovery rate (FNR). A line of more recent work in multiple testing has
begun to investigate the tradeoffs between the FDR and FNR and to provide lower
bounds on the performance of procedures that depend on the model structure.
Lacking thus far, however, has been a general approach to obtaining lower
bounds for a broad class of models. This paper introduces an analysis strategy
based on derandomization, illustrated by applications to various concrete
models. Our main result is meta-theorem that gives a general recipe for
obtaining lower bounds on the combination of FDR and FNR. We illustrate this
meta-theorem by deriving explicit bounds for several models, including
instances with dependence, scale-transformed alternatives, and
non-Gaussian-like distributions. We provide numerical simulations of some of
these lower bounds, and show a close relation to the actual performance of the
Benjamini-Hochberg (BH) algorithm.
Related papers
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an varepsilon-optimal randomized hypothesis with a sample complexity on the order of (d+k)/varepsilon2.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - A General Recipe for the Analysis of Randomized Multi-Armed Bandit
Algorithms [16.114012813668932]
We revisit two famous bandit algorithms, Minimum Empirical Divergence (MED) and Thompson Sampling (TS)
We prove that MED is optimal for all these models, but also provide a simple regret analysis of some TS algorithms for which the optimality is already known.
arXiv Detail & Related papers (2023-03-10T16:43:48Z) - A Practical Upper Bound for the Worst-Case Attribution Deviations [21.341303776931532]
Model attribution is a critical component of deep neural networks (DNNs) for its interpretability to complex models.
Recent studies bring up attention to the security of attribution methods as they are vulnerable to attribution attacks that generate similar images with dramatically different attributions.
Existing works have been investigating empirically improving the robustness of DNNs against those attacks; however, none of them explicitly quantifies the actual deviations of attributions.
In this work, for the first time, a constrained optimization problem is formulated to derive an upper bound that measures the largest dissimilarity of attributions after the samples are perturbed by any noises within a certain region
arXiv Detail & Related papers (2023-03-01T09:07:27Z) - Near-optimal multiple testing in Bayesian linear models with
finite-sample FDR control [11.011242089340438]
In high dimensional variable selection problems, statisticians often seek to design multiple testing procedures that control the False Discovery Rate (FDR)
We introduce Model-X procedures that provably control the frequentist FDR from finite samples, even when the model is misspecified.
Our proposed procedure, PoEdCe, incorporates three key ingredients: Posterior Expectation, distilled randomization test (dCRT), and the Benjamini-Hochberg procedure with e-values.
arXiv Detail & Related papers (2022-11-04T22:56:41Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z) - Learning generative models for valid knockoffs using novel
multivariate-rank based statistics [12.528602250193206]
Rank energy (RE) is derived using theoretical results characterizing the optimal maps in the Monge's Optimal Transport (OT) problem.
We propose a variant of the RE, dubbed as soft rank energy (sRE), and its kernel variant called as soft rank maximum mean discrepancy (sRMMD)
We then use sRMMD to generate deep knockoffs and show via extensive evaluation that it is a novel and effective method to produce valid knockoffs.
arXiv Detail & Related papers (2021-10-29T18:51:19Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and
Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - A Unifying Theory of Thompson Sampling for Continuous Risk-Averse
Bandits [91.3755431537592]
This paper unifies the analysis of risk-averse Thompson sampling algorithms for the multi-armed bandit problem.
Using the contraction principle in the theory of large deviations, we prove novel concentration bounds for continuous risk functionals.
We show that a wide class of risk functionals as well as "nice" functions of them satisfy the continuity condition.
arXiv Detail & Related papers (2021-08-25T17:09:01Z) - AdaPT-GMM: Powerful and robust covariate-assisted multiple testing [0.7614628596146599]
We propose a new empirical Bayes method for co-assisted multiple testing with false discovery rate (FDR) control.
Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme.
We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power.
arXiv Detail & Related papers (2021-06-30T05:06:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.