Black-box Selective Inference via Bootstrapping
- URL: http://arxiv.org/abs/2203.14504v2
- Date: Sun, 20 Aug 2023 23:09:45 GMT
- Title: Black-box Selective Inference via Bootstrapping
- Authors: Sifan Liu, Jelena Markovic-Voronov, Jonathan Taylor
- Abstract summary: Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso.
This work addresses this challenge by introducing a generic approach to estimate the selection event, facilitating feasible inference conditioned on the selection event.
- Score: 5.960626580825523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional selective inference requires an exact characterization of the
selection event, which is often unavailable except for a few examples like the
lasso. This work addresses this challenge by introducing a generic approach to
estimate the selection event, facilitating feasible inference conditioned on
the selection event. The method proceeds by repeatedly generating bootstrap
data and running the selection algorithm on the new datasets. Using the outputs
of the selection algorithm, we can estimate the selection probability as a
function of certain summary statistics. This leads to an estimate of the
distribution of the data conditioned on the selection event, which forms the
basis for conditional selective inference. We provide a theoretical guarantee
assuming both asymptotic normality of relevant statistics and accurate
estimation of the selection probability. The applicability of the proposed
method is demonstrated through a variety of problems that lack exact
characterizations of selection, where conditional selective inference was
previously infeasible.
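The bootstrap recipe described in the abstract can be sketched in a few lines of Python. This is a toy illustration only: the `select` rule (report a mean that clears a cutoff), the choice of summary statistic, and the plain logistic fit are invented stand-ins, not the paper's actual selection algorithm or estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def select(data):
    """Toy black-box selection: report a finding only if |mean| clears a cutoff."""
    return abs(data.mean()) > 0.2

# observed data; inference would proceed conditional on the selection event
data = rng.normal(loc=0.3, scale=1.0, size=100)

# step 1: bootstrap -- rerun the black-box selection on resampled datasets
B = 2000
stats = np.empty(B)  # summary statistic (here, the bootstrap sample mean)
hit = np.empty(B)    # did the same selection event occur on this resample?
for b in range(B):
    boot = rng.choice(data, size=data.size, replace=True)
    stats[b] = boot.mean()
    hit[b] = float(select(boot))

# step 2: estimate the selection probability as a function of the summary
# statistic, here with a plain logistic regression fitted by gradient ascent
X = np.column_stack([np.ones(B), stats])
w = np.zeros(2)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.5 * X.T @ (hit - p) / B

def selection_prob(t):
    """Estimated P(selection | summary statistic = t)."""
    return 1.0 / (1.0 + np.exp(-(w[0] + w[1] * t)))
```

The estimated `selection_prob` curve is what reweights the (asymptotically normal) distribution of the summary statistic to obtain its conditional distribution given selection, the basis for the confidence intervals.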
Related papers
- Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences.
We show that selection structure is identifiable without any parametric assumptions or interventional experiments.
We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
- Confidence on the Focal: Conformal Prediction with Selection-Conditional Coverage [6.010965256037659]
Conformal prediction builds marginally valid prediction intervals that cover the unknown outcome of a randomly drawn new test point with a prescribed probability.
In such cases, marginally valid conformal prediction intervals may not provide valid coverage for the focal unit(s) due to selection bias.
This paper presents a general framework for constructing a prediction set with finite-sample exact coverage conditional on the unit being selected.
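For context, the marginal coverage guarantee that this paper conditions on selection can be illustrated with a minimal split-conformal sketch. The data-generating process, the least-squares fit, and the 90% level are all invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy regression data: y = x + noise
x = rng.uniform(-2.0, 2.0, size=400)
y = x + rng.normal(scale=0.5, size=400)

# split conformal: fit a model on one half, calibrate on the other
x_fit, y_fit = x[:200], y[:200]
x_cal, y_cal = x[200:], y[200:]
slope = np.sum(x_fit * y_fit) / np.sum(x_fit ** 2)  # least squares through the origin

# conformity scores on the calibration half: absolute residuals
scores = np.abs(y_cal - slope * x_cal)
alpha = 0.1
n = scores.size
level = np.ceil((n + 1) * (1 - alpha)) / n  # finite-sample-corrected quantile level
q = np.quantile(scores, level, method="higher")

# marginally valid (1 - alpha) prediction interval for a new point
x_new = 1.0
lo, hi = slope * x_new - q, slope * x_new + q
```

The interval `[lo, hi]` covers a randomly drawn new outcome with probability at least 90% on average; the point of the paper above is that this average guarantee can fail for units chosen *because* they look interesting, which is what selection-conditional coverage repairs.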
arXiv Detail & Related papers (2024-03-06T17:18:24Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs).
This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
- Selective inference using randomized group lasso estimators for general models [3.4034453928075865]
The method includes the use of exponential family distributions, as well as quasi-likelihood modeling for overdispersed count data.
A randomized group-regularized optimization problem is studied.
Confidence regions for the regression parameters in the selected model take the form of Wald-type regions and are shown to have bounded volume.
arXiv Detail & Related papers (2023-06-24T01:14:26Z)
- User-defined Event Sampling and Uncertainty Quantification in Diffusion Models for Physical Dynamical Systems [49.75149094527068]
We show that diffusion models can be adapted to make predictions and provide uncertainty quantification for chaotic dynamical systems.
We develop a probabilistic approximation scheme for the conditional score function which converges to the true distribution as the noise level decreases.
We are able to sample conditionally on nonlinear user-defined events at inference time, and the method matches data statistics even when sampling from the tails of the distribution.
arXiv Detail & Related papers (2023-06-13T03:42:03Z)
- Bounding Counterfactuals under Selection Bias [60.55840896782637]
We propose a first algorithm to address both identifiable and unidentifiable queries.
We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal.
arXiv Detail & Related papers (2022-07-26T10:33:10Z)
- Parameter selection in Gaussian process interpolation: an empirical study of selection criteria [0.0]
This article revisits the fundamental problem of parameter selection for Gaussian process interpolation.
We show that the choice of an appropriate family of models is often more important than the choice of a particular selection criterion.
arXiv Detail & Related papers (2021-07-13T11:57:56Z)
- Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round.
Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams.
arXiv Detail & Related papers (2020-10-19T19:53:15Z)
- Parametric Programming Approach for More Powerful and General Lasso Selective Inference [25.02674598600182]
Selective Inference (SI) has been actively studied in the past few years for conducting inference on the features of linear models.
The main limitation of the original SI approach for Lasso is that the inference is conducted not only conditional on the selected features but also on their signs.
We propose a parametric programming-based method that can conduct SI without conditioning on signs even when we have thousands of active features.
arXiv Detail & Related papers (2020-04-21T04:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.