Related papers: A distribution-free valid p-value for finite samples of bounded random variables

URL: http://arxiv.org/abs/2405.08975v1
Date: Tue, 14 May 2024 22:01:04 GMT
Title: A distribution-free valid p-value for finite samples of bounded random variables
Authors: Joaquin Alvarez
Abstract summary: We build a valid p-value based on a concentration inequality for bounded random variables introduced by Pelekis, Ramon and Wang. The motivation behind this work is the calibration of predictive algorithms in a distribution-free setting. The ideas presented in this work are also relevant in classical statistical inference.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We build a valid p-value based on a concentration inequality for bounded random variables introduced by Pelekis, Ramon and Wang. The motivation behind this work is the calibration of predictive algorithms in a distribution-free setting. The super-uniform p-value is tighter than Hoeffding and Bentkus alternatives in certain regions. Even though we are motivated by a calibration setting in a machine learning context, the ideas presented in this work are also relevant in classical statistical inference. Furthermore, we compare the power of a collection of valid p-values for bounded losses that are presented in previous literature.
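To make "valid p-value for bounded losses" concrete, here is a minimal Python sketch of the Hoeffding baseline that the abstract names as a comparison point. It is not the Pelekis, Ramon and Wang construction from the paper. It assumes i.i.d. losses in [0, 1] and the null hypothesis that the mean loss is at least `alpha`; the function name `hoeffding_p_value` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def hoeffding_p_value(losses, alpha):
    """Hoeffding p-value for H0: E[loss] >= alpha, with losses in [0, 1].

    Illustrative baseline only (an assumption of this sketch), not the
    p-value proposed in the paper. Under H0, Hoeffding's inequality gives
    P(sample mean <= t) <= exp(-2 n (alpha - t)^2) for t <= alpha, so the
    returned value is super-uniform under the null.
    """
    n = len(losses)
    if n == 0:
        raise ValueError("need at least one loss value")
    if any(x < 0.0 or x > 1.0 for x in losses):
        raise ValueError("losses must lie in [0, 1]")
    mean = sum(losses) / n
    # Only a sample mean below alpha counts as evidence against H0.
    gap = max(alpha - mean, 0.0)
    return math.exp(-2.0 * n * gap * gap)
```

For example, 100 losses that are all zero tested against `alpha = 0.1` give `exp(-2)`, about 0.135, while a sample mean at or above `alpha` gives a p-value of 1.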

Related papers

Learning Parametric Distributions from Samples and Preferences [19.879505582147807]
We show that preference-based M-estimators achieve a better variance than sample-only M-estimators. We propose an estimator achieving an estimation error scaling of $\mathcal{O}(1/n)$, a significant improvement over the $\Theta(1/\sqrt{n})$ rate attainable with samples alone.
arXiv Detail & Related papers (2025-05-29T15:33:43Z)
Constrained Sampling with Primal-Dual Langevin Monte Carlo [15.634831573546041]
This work considers the problem of sampling from a probability distribution known up to a normalization constant while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions. We put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it.
arXiv Detail & Related papers (2024-11-01T13:26:13Z)
NETS: A Non-Equilibrium Transport Sampler [15.58993313831079]
We propose an algorithm, termed the Non-Equilibrium Transport Sampler (NETS). NETS can be viewed as a variant of annealed importance sampling (AIS) based on Jarzynski's equality. We show that this drift is the minimizer of a variety of objective functions, which can all be estimated in an unbiased fashion.
arXiv Detail & Related papers (2024-10-03T17:35:38Z)
Policy Gradient with Active Importance Sampling [55.112959067035916]
Policy gradient (PG) methods significantly benefit from importance sampling (IS), enabling the effective reuse of previously collected samples. However, IS is employed in RL as a passive tool for re-weighting historical samples. We look for the best behavioral policy from which to collect samples to reduce the policy gradient variance.
arXiv Detail & Related papers (2024-05-09T09:08:09Z)
Gaussian boson sampling validation via detector binning [0.0]
We propose binned-detector probability distributions as a suitable quantity to statistically validate GBS experiments. We show how to compute such distributions by leveraging their connection with their respective characteristic function. We also illustrate how binned-detector probability distributions behave when Haar-averaged over all possible interferometric networks.
arXiv Detail & Related papers (2023-10-27T12:55:52Z)
Comparing two samples through stochastic dominance: a graphical approach [2.867517731896504]
Non-deterministic measurements are common in real-world scenarios. We propose an alternative framework to visually compare two samples according to their estimated cumulative distribution functions.
arXiv Detail & Related papers (2022-03-15T13:37:03Z)
Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region. Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors. Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory. We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation. We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data. In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
Optimal Off-Policy Evaluation from Multiple Logging Policies [77.62012545592233]
We study off-policy evaluation from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling. We find the OPE estimator for multiple loggers with minimum variance for any instance, i.e., the efficient one.
arXiv Detail & Related papers (2020-10-21T13:43:48Z)
A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution. We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.