High Confidence Level Inference is Almost Free using Parallel Stochastic
Optimization
- URL: http://arxiv.org/abs/2401.09346v1
- Date: Wed, 17 Jan 2024 17:11:45 GMT
- Title: High Confidence Level Inference is Almost Free using Parallel Stochastic
Optimization
- Authors: Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu
- Abstract summary: This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal level.
Our method requires minimal additional computation and memory beyond the standard updating of estimates, making the inference process almost cost-free.
- Score: 16.38026811561888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Uncertainty quantification for estimation through stochastic optimization
solutions in an online setting has gained popularity recently. This paper
introduces a novel inference method focused on constructing confidence
intervals with efficient computation and fast convergence to the nominal level.
Specifically, we propose to use a small number of independent multi-runs to
acquire distribution information and construct a t-based confidence interval.
Our method requires minimal additional computation and memory beyond the
standard updating of estimates, making the inference process almost cost-free.
We provide a rigorous theoretical guarantee for the confidence interval,
demonstrating that the coverage is approximately exact with an explicit
convergence rate and allowing for high confidence level inference. In
particular, a new Gaussian approximation result is developed for the online
estimators to characterize the coverage properties of our confidence intervals
in terms of relative errors. Additionally, our method also allows for
leveraging parallel computing to further accelerate calculations using multiple
cores. It is easy to implement and can be integrated with existing stochastic
algorithms without the need for complicated modifications.
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used for policy evaluation, serves as a natural framework for this purpose.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Optimal Learning via Moderate Deviations Theory [4.6930976245638245]
We develop a systematic construction of highly accurate confidence intervals by using a moderate deviation principle-based approach.
It is shown that the proposed confidence intervals are statistically optimal in the sense that they satisfy criteria regarding exponential accuracy, minimality, consistency, mischaracterization probability, and eventual uniformly most accurate (UMA) property.
arXiv Detail & Related papers (2023-05-23T19:57:57Z) - Communication-Efficient Distributed Estimation and Inference for Cox's Model [4.731404257629232]
We develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model.
To construct confidence intervals for linear combinations of high-dimensional hazard regression coefficients, we introduce a novel debiased method.
We provide valid and powerful distributed hypothesis tests for any coordinate element based on a decorrelated score test.
arXiv Detail & Related papers (2023-02-23T15:50:17Z) - Fast Entropy-Based Methods of Word-Level Confidence Estimation for
End-To-End Automatic Speech Recognition [86.21889574126878]
We show how per-frame entropy values can be normalized and aggregated to obtain a confidence measure per unit and per word.
We evaluate the proposed confidence measures on LibriSpeech test sets, and show that they are up to 2 and 4 times better than confidence estimation based on the maximum per-frame probability.
arXiv Detail & Related papers (2022-12-16T20:27:40Z) - PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z) - CoinDICE: Off-Policy Confidence Interval Estimation [107.86876722777535]
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning.
We show in a variety of benchmarks that the confidence interval estimates are tighter and more accurate than existing methods.
arXiv Detail & Related papers (2020-10-22T12:39:11Z) - Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the
Predictive Uncertainties [12.068153197381575]
We propose a novel variational family that allows for retaining covariances between latent processes while achieving fast convergence.
We provide an efficient implementation of our new approach and apply it to several benchmark datasets.
It yields excellent results and strikes a better balance between accuracy and calibrated uncertainty estimates than its state-of-the-art alternatives.
arXiv Detail & Related papers (2020-05-22T11:10:59Z) - Optimal Change-Point Detection with Training Sequences in the Large and
Moderate Deviations Regimes [72.68201611113673]
This paper investigates a novel offline change-point detection problem from an information-theoretic perspective.
We assume that the knowledge of the underlying pre- and post-change distributions are not known and can only be learned from the training sequences which are available.
arXiv Detail & Related papers (2020-03-13T23:39:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.