ARK: Robust Knockoffs Inference with Coupling
- URL: http://arxiv.org/abs/2307.04400v2
- Date: Tue, 4 Jun 2024 23:50:58 GMT
- Title: ARK: Robust Knockoffs Inference with Coupling
- Authors: Yingying Fan, Lan Gao, Jinchi Lv
- Abstract summary: We study the robustness of the model-X knockoffs framework with respect to the misspecified or estimated feature distribution.
A key technique is to couple the approximate knockoffs procedure with the model-X knockoffs procedure so that random variables in these two procedures can be close in realizations.
We prove that if such a coupled model-X knockoffs procedure exists, the approximate knockoffs procedure can achieve asymptotic FDR or $k$-FWER control at the target level.
- Score: 7.288274235236948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the robustness of the model-X knockoffs framework with respect to the misspecified or estimated feature distribution. We achieve this goal by theoretically studying the feature selection performance of a practically implemented knockoffs algorithm, which we name the approximate knockoffs (ARK) procedure, under the measures of the false discovery rate (FDR) and $k$-familywise error rate ($k$-FWER). The approximate knockoffs procedure differs from the model-X knockoffs procedure only in that the former uses the misspecified or estimated feature distribution. A key technique in our theoretical analyses is to couple the approximate knockoffs procedure with the model-X knockoffs procedure so that random variables in these two procedures can be close in realizations. We prove that if such a coupled model-X knockoffs procedure exists, the approximate knockoffs procedure can achieve asymptotic FDR or $k$-FWER control at the target level. We showcase three specific constructions of such coupled model-X knockoff variables, verifying their existence and justifying the robustness of the model-X knockoffs framework. Additionally, we formally connect our concept of knockoff variable coupling to a type of Wasserstein distance.
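The coupling idea can be made concrete via optimal transport. As a minimal sketch in assumed notation (not the paper's exact statement): write $P_{\widehat{\tilde X}}$ for the law of knockoffs generated from the estimated feature distribution and $P_{\tilde X}$ for the law of exact model-X knockoffs, and let $\Pi(\cdot,\cdot)$ denote the set of couplings. The order-1 Wasserstein distance is defined through couplings,

$$
W_1\big(P_{\widehat{\tilde X}},\, P_{\tilde X}\big) \;=\; \inf_{\pi \in \Pi(P_{\widehat{\tilde X}},\, P_{\tilde X})} \mathbb{E}_{(U,V)\sim\pi}\,\|U - V\|,
$$

so exhibiting a single coupling under which the two knockoff vectors are close in realizations upper-bounds this distance; per the abstract, the existence of such a coupling is what drives the asymptotic FDR and $k$-FWER guarantees.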
Related papers
- Asymptotic FDR Control with Model-X Knockoffs: Is Moments Matching Sufficient? [6.6716279375012295]
We propose a unified theoretical framework for studying the robustness of the model-X knockoffs framework.
For the first time in the literature, our theoretical results formally justify the effectiveness of inference with the Gaussian knockoffs generator.
arXiv Detail & Related papers (2025-02-09T17:36:00Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor in the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection [14.840211139848275]
"Deep Dependency Regularized Knockoff (DeepDRK)" is a distribution-free deep learning method that effectively balances FDR and power.
We introduce a novel formulation of the knockoff model as a learning problem under multi-source adversarial attacks.
Our model outperforms existing benchmarks across synthetic, semi-synthetic, and real-world datasets.
arXiv Detail & Related papers (2024-02-27T03:24:54Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
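For orientation, here is a minimal sketch of the generic knockoff(+) selection rule that such procedures build on; the paper's error-based importance statistics and stepdown refinement are not reproduced, and the importance measure is left abstract.

```python
import numpy as np

def knockoff_plus_select(W, q=0.1):
    """Generic knockoff+ filter: given antisymmetric statistics
    W_j = Z_j - Z~_j (large positive values favor feature j), pick the
    smallest threshold t whose estimated FDP is at most the target q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return np.where(W >= t)[0]  # indices of selected features
    return np.array([], dtype=int)

# Toy usage: 15 signal features among 200.
rng = np.random.default_rng(0)
W = rng.normal(size=200)
W[:15] += 3.0
print(knockoff_plus_select(W, q=0.1))
```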
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
By definition, the proposed SCORE (self-consistent robust error) facilitates the reconciliation between robustness and accuracy while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Random quantum circuits anti-concentrate in log depth [118.18170052022323]
We study the number of gates needed for the distribution over measurement outcomes for typical circuit instances to be anti-concentrated.
Our definition of anti-concentration is that the expected collision probability is only a constant factor larger than if the distribution were uniform.
In both the case where the gates are nearest-neighbor on a 1D ring and the case where the gates are long-range, we show that $O(n \log(n))$ gates are also sufficient.
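A sketch of the anti-concentration condition just described, in assumed notation where $p_C(x)$ is the output distribution of a random circuit $C$ on $n$ qubits:

$$
\mathbb{E}_C\Big[\sum_{x} p_C(x)^2\Big] \;\le\; \frac{\alpha}{2^n} \quad \text{for some constant } \alpha \ge 1,
$$

i.e. the expected collision probability is within a constant factor of the value $2^{-n}$ attained by the uniform distribution.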
arXiv Detail & Related papers (2020-11-24T18:44:57Z)
- Deep Direct Likelihood Knockoffs [28.261829940133484]
In scientific domains, the scientist often wishes to discover which features are actually important for making predictions.
Model-X knockoffs enable important features to be discovered with control of the FDR.
We develop Deep Direct Likelihood Knockoffs (DDLK), which directly minimizes the KL divergence implied by the knockoff swap property.
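The swap property that DDLK's KL objective targets is easy to state in code. A minimal sketch follows (the swap operation is standard; the training loss itself is only indicated in a comment, not reproduced from the paper):

```python
import numpy as np

def swap_coords(x, x_tilde, S):
    """Exchange feature columns S between X and its knockoffs X~.
    The knockoff swap property requires (X, X~) and its swapped
    version to share the same joint distribution for every subset S."""
    x, x_tilde = x.copy(), x_tilde.copy()
    x[:, S], x_tilde[:, S] = x_tilde[:, S].copy(), x[:, S].copy()
    return x, x_tilde

# DDLK-style training (sketched): sample random swap sets S and drive the
# KL divergence between the law of (X, X~) and its swapped law toward zero.
rng = np.random.default_rng(1)
X, X_tilde = rng.normal(size=(4, 5)), rng.normal(size=(4, 5))
S = np.array([0, 2])
Xs, Xts = swap_coords(X, X_tilde, S)
assert np.allclose(Xs[:, S], X_tilde[:, S]) and np.allclose(Xts[:, S], X[:, S])
```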
arXiv Detail & Related papers (2020-07-31T04:09:46Z)
- FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
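For context, the sampler being accelerated is the standard Gaussian model-X knockoff construction: with $X \sim \mathcal{N}(\mu, \Sigma)$ and $D = \mathrm{diag}(s)$ chosen so that the joint covariance of $(X, \tilde{X})$ stays positive semidefinite, knockoffs are drawn as

$$
\tilde{X} \mid X \;\sim\; \mathcal{N}\big(X - D\Sigma^{-1}(X - \mu),\; 2D - D\Sigma^{-1}D\big).
$$

FANOK's algorithms target the cost of choosing $s$ and sampling from this conditional distribution when $p$ is large.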
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
- Lower bounds in multiple testing: A framework based on derandomized proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
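For reference, the BH step-up rule mentioned above admits a compact sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """BH step-up procedure: reject the k smallest p-values, where k is
    the largest i such that p_(i) <= q * i / m."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    return np.sort(order[:k])  # indices of rejected hypotheses

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6]))  # -> [0 1]
```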
arXiv Detail & Related papers (2020-05-07T19:59:51Z)
- Aggregation of Multiple Knockoffs [33.79737923562146]
Aggregation of Multiple Knockoffs (AKO) addresses the instability inherent to the random nature of Knockoff-based inference.
AKO improves both the stability and power compared with the original Knockoff algorithm while still maintaining guarantees for False Discovery Rate control.
We provide a new inference procedure, prove its core properties, and demonstrate its benefits in a set of experiments on synthetic and real datasets.
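A minimal sketch of the quantile-aggregation step at the heart of AKO, following Meinshausen-style p-value aggregation across independent knockoff draws; fixing a single $\gamma$ here is a simplifying assumption, not the paper's full procedure:

```python
import numpy as np

def quantile_aggregate(pvals, gamma=0.3):
    """Aggregate per-draw p-values across knockoff draws: take the
    gamma-quantile of p/gamma per feature, capped at 1."""
    pvals = np.asarray(pvals)  # shape (n_draws, n_features)
    q = np.quantile(pvals / gamma, gamma, axis=0)
    return np.minimum(1.0, q)

rng = np.random.default_rng(2)
P = rng.uniform(size=(25, 6))  # 25 knockoff draws, 6 features
print(quantile_aggregate(P))   # one aggregated p-value per feature
```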
arXiv Detail & Related papers (2020-02-21T13:28:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.