Flexible Selective Inference with Flow-based Transport Maps
- URL: http://arxiv.org/abs/2506.01150v1
- Date: Sun, 01 Jun 2025 20:05:20 GMT
- Title: Flexible Selective Inference with Flow-based Transport Maps
- Authors: Sifan Liu, Snigdha Panigrahi
- Abstract summary: This paper introduces a new method that leverages tools from flow-based generative modeling to approximate a potentially complex conditional distribution. We demonstrate that this method enables flexible selective inference by providing valid p-values and confidence sets for adaptively selected hypotheses and parameters.
- Score: 7.197592390105458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-carving methods perform selective inference by conditioning the distribution of data on the observed selection event. However, existing data-carving approaches typically require an analytically tractable characterization of the selection event. This paper introduces a new method that leverages tools from flow-based generative modeling to approximate a potentially complex conditional distribution, even when the underlying selection event lacks an analytical description -- take, for example, the data-adaptive tuning of model parameters. The key idea is to learn a transport map that pushes forward a simple reference distribution to the conditional distribution given selection. This map is efficiently learned via a normalizing flow, without imposing any further restrictions on the nature of the selection event. Through extensive numerical experiments on both simulated and real data, we demonstrate that this method enables flexible selective inference by providing: (i) valid p-values and confidence sets for adaptively selected hypotheses and parameters, (ii) a closed-form expression for the conditional density function, enabling likelihood-based and quantile-based inference, and (iii) adjustments for intractable selection steps that can be easily integrated with existing methods designed to account for the tractable steps in a selection procedure involving multiple steps.
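The transport-map idea in the abstract can be illustrated in one dimension, where the map can be fit by simple quantile matching rather than a normalizing flow. The sketch below is an illustrative stand-in, not the paper's method: the selection event (X > c), the sample sizes, and the quantile-based map are all assumptions chosen so that the conditional law is a known truncated normal, making the learned selective p-value checkable against its exact value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Selection event: we only observe X when X > c (an analytically simple
# stand-in for the complex selection events targeted by the paper).
c = 1.0
theta = 0.0  # true mean under the null
draws = rng.normal(theta, 1.0, size=200_000)
selected = draws[draws > c]  # draws from the conditional law given selection

# One-dimensional transport map: push the reference N(0,1) forward onto the
# conditional distribution by matching empirical quantiles (a monotone map,
# playing the role the normalizing flow plays in higher dimensions).
grid = np.linspace(-4, 4, 513)
ref_cdf = stats.norm.cdf(grid)
transport = np.quantile(selected, ref_cdf)  # T(z) = F_sel^{-1}(Phi(z))

def push_forward(z):
    """Map reference draws z ~ N(0,1) to the selected distribution."""
    return np.interp(stats.norm.cdf(z), ref_cdf, transport)

# Selective p-value for an observation x_obs: survival probability under the
# learned conditional distribution, estimated by pushing reference draws.
z = rng.normal(size=100_000)
cond_draws = push_forward(z)
x_obs = 2.2
p_value = np.mean(cond_draws >= x_obs)

# Here the truth is known: the conditional law is a normal truncated at c,
# so the exact selective p-value is available for comparison.
exact = stats.norm.sf(x_obs) / stats.norm.sf(c)
```

In higher dimensions the quantile-matching step is no longer available, which is where the paper's normalizing-flow parameterization of the transport map comes in.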
Related papers
- Transductive Model Selection under Prior Probability Shift [49.56191463229252]
Transductive learning is a supervised machine learning task in which the unlabelled data that require labelling are a finite set and are available at training time. We propose a method, tailored to transductive classification contexts, for performing model selection when the data exhibit prior probability shift.
arXiv Detail & Related papers (2025-07-30T13:03:24Z)
- Going from a Representative Agent to Counterfactuals in Combinatorial Choice [2.9172603864294033]
We study decision-making problems where data comprises points from a collection of binary polytopes. We propose a nonparametric approach for counterfactual inference in this setting based on a representative agent model.
arXiv Detail & Related papers (2025-05-29T15:24:23Z)
- Optimized Conformal Selection: Powerful Selective Inference After Conformity Score Optimization [4.984656106595651]
This paper presents OptCS, a framework that allows valid statistical testing (selection) after flexible data-driven model optimization. We introduce general conditions under which OptCS constructs valid conformal p-values despite substantial data reuse. We propose three FDR-controlling procedures, each optimizing the models differently.
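For context on the conformal-selection entry above, here is a minimal sketch of the primitive such frameworks build on: split conformal p-values fed into Benjamini-Hochberg FDR control. This is standard conformal selection, not OptCS itself; the score distributions, sample sizes, and level q are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Split conformal p-values. Scores come from an arbitrary fitted model and
# are assumed to be "higher = more anomalous"; the calibration set contains
# only null examples.
calib_scores = rng.normal(0.0, 1.0, size=500)   # calibration scores (null)
test_scores = np.concatenate([                  # mixed test batch
    rng.normal(0.0, 1.0, size=90),              # nulls
    rng.normal(3.0, 1.0, size=10),              # signals
])

n = len(calib_scores)
# Conformal p-value: (1 + #{calibration scores >= test score}) / (n + 1)
p_values = (1 + (calib_scores[None, :] >= test_scores[:, None]).sum(axis=1)) / (n + 1)

def benjamini_hochberg(p, q=0.1):
    """Step-up BH procedure: reject the k smallest p-values, where k is the
    largest index with p_(k) <= q * k / m."""
    m = len(p)
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

rejections = benjamini_hochberg(p_values, q=0.1)
```

The "data reuse" problem OptCS addresses arises when the score-producing model is itself chosen using the calibration data, which breaks the exchangeability argument behind the plain p-value formula above.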
arXiv Detail & Related papers (2024-11-27T01:40:50Z)
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to capture potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
- SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation On Diverse Modalities [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. We present a complete and fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z)
- Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences.
We show that selection structure is identifiable without any parametric assumptions or interventional experiments.
We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
- Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z)
- Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that repeatedly alternates time-consuming model training with batch data selection.
The proposed method, FreeSel, bypasses the heavy batch selection process, achieving a significant efficiency improvement and running 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z)
- Bounding Counterfactuals under Selection Bias [60.55840896782637]
We propose a first algorithm to address both identifiable and unidentifiable queries.
We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal.
arXiv Detail & Related papers (2022-07-26T10:33:10Z)
- Black-box Selective Inference via Bootstrapping [5.960626580825523]
Conditional selective inference requires an exact characterization of the selection event, which is often unavailable except for a few examples like the lasso.
This work addresses this challenge by introducing a generic approach to estimate the selection event, facilitating feasible inference conditioned on the selection event.
arXiv Detail & Related papers (2022-03-28T05:18:21Z)
- Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round.
Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams.
arXiv Detail & Related papers (2020-10-19T19:53:15Z)
- Parametric Programming Approach for More Powerful and General Lasso Selective Inference [25.02674598600182]
Selective Inference (SI) has been actively studied in the past few years for conducting inference on the features of linear models.
The main limitation of the original SI approach for Lasso is that the inference is conducted not only conditional on the selected features but also on their signs.
We propose a parametric programming-based method that can conduct SI without conditioning on signs even when we have thousands of active features.
arXiv Detail & Related papers (2020-04-21T04:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.