Deep Direct Likelihood Knockoffs
- URL: http://arxiv.org/abs/2007.15835v1
- Date: Fri, 31 Jul 2020 04:09:46 GMT
- Title: Deep Direct Likelihood Knockoffs
- Authors: Mukund Sudarshan, Wesley Tansey, Rajesh Ranganath
- Abstract summary: In scientific domains, the scientist often wishes to discover which features are actually important for making predictions.
Model-X knockoffs enable important features to be discovered with control of the false discovery rate (FDR).
We develop Deep Direct Likelihood Knockoffs (DDLK), which directly minimizes the KL divergence implied by the knockoff swap property.
- Score: 28.261829940133484
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Predictive modeling often uses black box machine learning methods, such as
deep neural networks, to achieve state-of-the-art performance. In scientific
domains, the scientist often wishes to discover which features are actually
important for making the predictions. These discoveries may lead to costly
follow-up experiments and as such it is important that the error rate on
discoveries is not too high. Model-X knockoffs enable important features to be
discovered with control of the FDR. However, knockoffs require rich generative
models capable of accurately modeling the knockoff features while ensuring they
obey the so-called "swap" property. We develop Deep Direct Likelihood Knockoffs
(DDLK), which directly minimizes the KL divergence implied by the knockoff swap
property. DDLK consists of two stages: it first maximizes the explicit
likelihood of the features, then minimizes the KL divergence between the joint
distribution of features and knockoffs and any swap between them. To ensure
that the generated knockoffs are valid under any possible swap, DDLK uses the
Gumbel-Softmax trick to optimize the knockoff generator under the worst-case
swap. We find DDLK has higher power than baselines while controlling the false
discovery rate on a variety of synthetic and real benchmarks including a task
involving a large dataset from one of the epicenters of COVID-19.
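To make the two-stage procedure concrete, the following is a minimal, hypothetical PyTorch sketch of the objective described above: fit an explicit likelihood of the features, then train a knockoff generator to minimize the KL divergence between the joint distribution of features and knockoffs and its swapped counterpart, with the swap relaxed by the Gumbel-Softmax trick so a worst-case swap can be searched by gradients. The diagonal-Gaussian marginal, the small conditional network, the batch, and the learning rates are placeholders chosen for illustration, not the authors' architecture or released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.distributions as D

d = 8  # number of features (placeholder)

class MarginalModel(nn.Module):
    """q_theta(x): explicit likelihood of the features (diagonal-Gaussian stand-in)."""
    def __init__(self, d):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(d))
        self.log_sigma = nn.Parameter(torch.zeros(d))

    def log_prob(self, x):
        return D.Normal(self.mu, self.log_sigma.exp()).log_prob(x).sum(-1)

class KnockoffGenerator(nn.Module):
    """q_phi(x_tilde | x): conditional Gaussian knockoff generator (stand-in)."""
    def __init__(self, d):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2 * d))

    def dist(self, x):
        mu, log_sigma = self.net(x).chunk(2, dim=-1)
        return D.Normal(mu, log_sigma.exp())

    def sample(self, x):
        return self.dist(x).rsample()          # reparameterized, so gradients flow

    def log_prob(self, x, x_tilde):
        return self.dist(x).log_prob(x_tilde).sum(-1)

def joint_log_prob(q_x, q_kn, a, b):
    """log q_theta(a) + log q_phi(b | a): joint density of (features, knockoffs)."""
    return q_x.log_prob(a) + q_kn.log_prob(a, b)

def swap(x, x_tilde, b):
    """Per-coordinate swap: b_j near 1 exchanges coordinate j between x and x_tilde."""
    return b * x_tilde + (1 - b) * x, b * x + (1 - b) * x_tilde

q_x, q_kn = MarginalModel(d), KnockoffGenerator(d)
swap_logits = nn.Parameter(torch.zeros(d, 2))       # adversarial swap distribution

x = torch.randn(128, d)                             # placeholder data batch

# Stage 1 (one step shown): maximize the explicit likelihood of the features.
opt_x = torch.optim.Adam(q_x.parameters(), lr=1e-3)
opt_x.zero_grad()
(-q_x.log_prob(x).mean()).backward()
opt_x.step()

# Stage 2 (one step shown): estimate the KL between the joint and its swapped version,
# with the swap relaxed by Gumbel-Softmax so the worst case can be searched by gradients.
opt_kn = torch.optim.Adam(q_kn.parameters(), lr=1e-3)
x_tilde = q_kn.sample(x)
b = F.gumbel_softmax(swap_logits.expand(x.shape[0], -1, -1), tau=0.5)[..., 0]
x_s, x_tilde_s = swap(x, x_tilde, b)
kl = (joint_log_prob(q_x, q_kn, x, x_tilde)
      - joint_log_prob(q_x, q_kn, x_s, x_tilde_s)).mean()

opt_kn.zero_grad()
kl.backward()
opt_kn.step()                                       # generator minimizes the swap KL ...
with torch.no_grad():
    swap_logits += 1e-2 * swap_logits.grad          # ... while the swap logits ascend (worst case)
```

In a full training loop, Stage 1 would run to convergence before Stage 2 begins, and the swap logits would take repeated ascent steps (maximizing the KL) while the generator takes descent steps, giving the minimax behaviour over worst-case swaps described in the abstract.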
Related papers
- Asymptotic FDR Control with Model-X Knockoffs: Is Moments Matching Sufficient? [6.6716279375012295]
We propose a unified theoretical framework for studying the robustness of the model-X knockoffs framework.
For the first time in the literature, our theoretical results formally justify the effectiveness and valid inference of the Gaussian knockoff generator (a minimal sketch of that generator appears after the related-papers list below).
arXiv Detail & Related papers (2025-02-09T17:36:00Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection [14.840211139848275]
"Deep Dependency Regularized Knockoff (DeepDRK)" is a distribution-free deep learning method that effectively balances FDR and power.
We introduce a novel formulation of the knockoff model as a learning problem under multi-source adversarial attacks.
Our model outperforms existing benchmarks across synthetic, semi-synthetic, and real-world datasets.
arXiv Detail & Related papers (2024-02-27T03:24:54Z)
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel federated causal discovery (FCD) method that aims to accommodate arbitrary causal models and heterogeneous data.
The approach constructs summary statistics as a proxy for the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- ARK: Robust Knockoffs Inference with Coupling [7.288274235236948]
We study the robustness of the model-X knockoffs framework with respect to a misspecified or estimated feature distribution.
A key technique is to couple the approximate knockoffs procedure with the model-X knockoffs procedure so that random variables in these two procedures can be close in realizations.
We prove that if such a coupled model-X knockoffs procedure exists, the approximate knockoffs procedure can achieve the FDR or $k$-FWER control at the target level.
arXiv Detail & Related papers (2023-07-10T08:01:59Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoffs inference method that integrates the knockoff features, error-based feature importance statistics, and the stepdown procedure.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from an improperly defined robust error, which imposes an inductive bias of local invariance.
We propose the self-consistent robust error (SCORE), which by definition facilitates the reconciliation between robustness and accuracy while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Learning generative models for valid knockoffs using novel multivariate-rank based statistics [12.528602250193206]
Rank energy (RE) is derived using theoretical results characterizing the optimal maps in Monge's Optimal Transport (OT) problem.
We propose a variant of RE, dubbed soft rank energy (sRE), and its kernel variant, called soft rank maximum mean discrepancy (sRMMD).
We then use sRMMD to generate deep knockoffs and show via extensive evaluation that it is a novel and effective method for producing valid knockoffs.
arXiv Detail & Related papers (2021-10-29T18:51:19Z)
- GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization [84.57695474130273]
Gate-based or importance-based pruning methods aim to remove channels whose importance is smallest.
GDP can be plugged in before convolutional layers, without bells and whistles, to control whether each channel is on or off.
Experiments conducted over CIFAR-10 and ImageNet datasets show that the proposed GDP achieves the state-of-the-art performance.
arXiv Detail & Related papers (2021-09-06T03:17:10Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small.
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
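The first related entry above analyzes when moments matching suffices for the Gaussian knockoff generator; for reference, a hypothetical NumPy sketch of that classical second-order construction (Candès et al., 2018) is given below. It assumes roughly standardized features and uses the simple equicorrelated choice of the s vector; it is not code from any of the papers listed here.

```python
import numpy as np

def gaussian_knockoffs(X, seed=None):
    """Sample Gaussian model-X knockoffs, assuming X ~ N(mu, Sigma) with standardized columns."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)                 # in practice, a shrinkage estimate
    # Equicorrelated construction: s_j = min(1, 2 * lambda_min(Sigma)).
    lam_min = np.linalg.eigvalsh(Sigma).min()
    s = np.full(d, min(1.0, 2.0 * lam_min))
    S = np.diag(s)
    Sigma_inv = np.linalg.inv(Sigma)
    # Conditional law of X_tilde given X:
    #   mean = mu + (Sigma - S) Sigma^{-1} (X - mu)
    #   cov  = 2 S - S Sigma^{-1} S
    cond_mean = mu + (X - mu) @ Sigma_inv @ (Sigma - S)   # row-wise, using symmetry
    cond_cov = 2.0 * S - S @ Sigma_inv @ S
    L = np.linalg.cholesky(cond_cov + 1e-10 * np.eye(d))  # jitter for numerical safety
    return cond_mean + rng.standard_normal((n, d)) @ L.T

X = np.random.default_rng(0).standard_normal((500, 10))   # placeholder standardized data
X_knockoff = gaussian_knockoffs(X, seed=1)
```

Concatenating [X, X_knockoff], computing feature importance statistics, and applying the knockoff filter then yields selections with FDR control; this is the baseline procedure whose robustness to estimated or misspecified feature distributions the entries above study.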