Probabilistic Model Incorporating Auxiliary Covariates to Control FDR
- URL: http://arxiv.org/abs/2210.03178v1
- Date: Thu, 6 Oct 2022 19:35:53 GMT
- Title: Probabilistic Model Incorporating Auxiliary Covariates to Control FDR
- Authors: Lin Qiu, Nils Murrugarra-Llerena, Vítor Silva, Lin Lin, Vernon M. Chinchilli
- Abstract summary: Controlling False Discovery Rate (FDR) while leveraging the side information of multiple hypothesis testing is an emerging research topic in modern data science.
We propose NeurT-FDR, a deep black-box framework that boosts statistical power while controlling FDR in multiple-hypothesis testing.
We show that NeurT-FDR makes substantially more discoveries in three real datasets compared to competitive baselines.
- Score: 6.270317798744481
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Controlling False Discovery Rate (FDR) while leveraging the side information
of multiple hypothesis testing is an emerging research topic in modern data
science. Existing methods rely on the test-level covariates while ignoring
metrics about test-level covariates. This strategy may not be optimal for
complex large-scale problems, where indirect relations often exist among
test-level covariates and auxiliary metrics or covariates. We incorporate
auxiliary covariates among test-level covariates in a deep Black-Box framework
controlling FDR (named as NeurT-FDR) which boosts statistical power and
controls FDR for multiple-hypothesis testing. Our method parametrizes the
test-level covariates as a neural network and adjusts the auxiliary covariates
through a regression framework, which enables flexible handling of
high-dimensional features as well as efficient end-to-end optimization. We show
that NeurT-FDR makes substantially more discoveries in three real datasets
compared to competitive baselines.
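For readers unfamiliar with FDR control, the following is a minimal sketch of the classical Benjamini-Hochberg (BH) step-up procedure, which uses p-values only and no covariates. It is not the NeurT-FDR method described in the abstract; it only illustrates what "controlling FDR at level alpha" means for a covariate-free procedure. The function name `benjamini_hochberg` and the default `alpha` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Classical BH step-up procedure (illustrative, covariate-free).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the hypotheses, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(pvals, alpha=0.05))
```

Covariate-assisted methods such as the one summarized above replace the single shared threshold with one that can vary across hypotheses depending on side information.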
Related papers
- High-Dimensional False Discovery Rate Control for Dependent Variables [10.86851797584794]
We propose a dependency-aware T-Rex selector that harnesses the dependency structure among variables.
We prove that our variable penalization mechanism ensures FDR control.
We formulate a fully integrated optimal calibration algorithm that concurrently determines the parameters of the graphical model and the T-Rex framework.
arXiv Detail & Related papers (2024-01-28T22:56:16Z)
- REST: Enhancing Group Robustness in DNNs through Reweighted Sparse Training [49.581884130880944]
Deep neural networks (DNNs) have been proven effective in various domains.
However, they often struggle to perform well on certain minority groups during inference.
arXiv Detail & Related papers (2023-12-05T16:27:54Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Near-optimal multiple testing in Bayesian linear models with finite-sample FDR control [11.011242089340438]
In high-dimensional variable selection problems, statisticians often seek to design multiple testing procedures that control the False Discovery Rate (FDR).
We introduce Model-X procedures that provably control the frequentist FDR from finite samples, even when the model is misspecified.
Our proposed procedure, PoEdCe, incorporates three key ingredients: Posterior Expectation, the distilled conditional randomization test (dCRT), and the Benjamini-Hochberg procedure with e-values.
arXiv Detail & Related papers (2022-11-04T22:56:41Z)
- Two-stage Hypothesis Tests for Variable Interactions with FDR Control [10.750902543185802]
We propose a two-stage testing procedure with false discovery rate (FDR) control, which is known as a less conservative multiple-testing correction.
We demonstrate via comprehensive simulation studies that our two-stage procedure is more efficient than the classical BH procedure, with a comparable or improved statistical power.
arXiv Detail & Related papers (2022-08-31T19:17:00Z)
- FEDNEST: Federated Bilevel, Minimax, and Compositional Optimization [53.78643974257301]
Many contemporary ML problems fall under nested bilevel programming that subsumes minimax and compositional optimization.
We propose FedNest: A federated alternating gradient method to address general nested problems.
arXiv Detail & Related papers (2022-05-04T17:48:55Z)
- AdaPT-GMM: Powerful and robust covariate-assisted multiple testing [0.7614628596146599]
We propose a new empirical Bayes method for covariate-assisted multiple testing with false discovery rate (FDR) control.
Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme.
We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power.
arXiv Detail & Related papers (2021-06-30T05:06:18Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when only the input distribution is biased, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for heterogeneous data.
A result of this paper is that the communication complexity of the proposed algorithm is constant: it is independent of the number of machines and of the accuracy level.
Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest X-ray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z)
- NeurT-FDR: Controlling FDR by Incorporating Feature Hierarchy [7.496622386458525]
We propose NeurT-FDR which boosts statistical power and controls FDR for multiple hypothesis testing.
We show that NeurT-FDR has strong FDR guarantees and makes substantially more discoveries in synthetic and real datasets.
arXiv Detail & Related papers (2021-01-24T21:55:10Z)
- Lower bounds in multiple testing: A framework based on derandomized proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
arXiv Detail & Related papers (2020-05-07T19:59:51Z)