A lower confidence sequence for the changing mean of non-negative right
heavy-tailed observations with bounded mean
- URL: http://arxiv.org/abs/2210.11133v1
- Date: Thu, 20 Oct 2022 09:50:05 GMT
- Title: A lower confidence sequence for the changing mean of non-negative right
heavy-tailed observations with bounded mean
- Authors: Paul Mineiro
- Abstract summary: A confidence sequence produces an adapted sequence of sets for a predictable parameter sequence with a time-parametric coverage guarantee.
This work constructs a non-asymptotic lower CS for the running average conditional expectation whose slack converges to zero.
- Score: 9.289846887298854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A confidence sequence (CS) is an anytime-valid sequential inference primitive
which produces an adapted sequence of sets for a predictable parameter sequence
with a time-uniform coverage guarantee. This work constructs a non-parametric
non-asymptotic lower CS for the running average conditional expectation whose
slack converges to zero given non-negative right heavy-tailed observations with
bounded mean. Specifically, when the variance is finite the approach dominates
the empirical Bernstein supermartingale of Howard et al.; with infinite
variance, it can adapt to a known or unknown $(1 + \delta)$-th moment bound; and
can be efficiently approximated using a sublinear number of sufficient
statistics. In certain cases this lower CS can be converted into a
closed-interval CS whose width converges to zero, e.g., any bounded
realization, or post contextual-bandit inference with bounded rewards and
unbounded importance weights. A reference implementation and example
simulations demonstrate the technique.
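The bounded-observation baseline that the paper's construction dominates can be illustrated with a minimal betting-style lower confidence sequence. The sketch below is an illustration only, not the paper's heavy-tailed method or its reference implementation: for $[0,1]$-valued observations it rejects candidate means $m$ via Ville's inequality applied to the capital process $\prod_t (1 + \lambda(X_t - m))$, which is a nonnegative supermartingale whenever the true mean is at most $m$. The fixed betting fraction `lam`, the grid resolution, and the Beta-distributed data are all assumptions made for the demo.

```python
import numpy as np

def lower_cs(xs, alpha=0.05, lam=0.5, grid=None):
    """Anytime-valid lower confidence sequence for the mean of [0,1]-valued
    observations, via a betting supermartingale and Ville's inequality.
    Illustrative bounded-case baseline, NOT the paper's heavy-tailed method."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 201)   # candidate means m
    capital = np.ones_like(grid)            # K_t(m) = prod_i (1 + lam*(x_i - m))
    rejected = np.zeros(grid.shape, dtype=bool)
    bounds = []
    for x in xs:
        # For each m with true mean <= m, K_t(m) is a nonnegative
        # supermartingale, so P(sup_t K_t(m) >= 1/alpha) <= alpha (Ville).
        capital *= 1.0 + lam * (x - grid)
        rejected |= capital >= 1.0 / alpha  # rejections persist over time
        # Running lower bound: largest candidate rejected so far (0 if none).
        bounds.append(grid[rejected].max() if rejected.any() else 0.0)
    return np.array(bounds)

rng = np.random.default_rng(0)
xs = rng.beta(2, 3, size=2000)   # true mean = 0.4
lb = lower_cs(xs, alpha=0.05)
print(lb[-1])                    # typically a bit below the true mean 0.4
```

Because the capital process is pointwise decreasing in $m$, the lower bound exceeds the true mean only if the smallest grid point above it is falsely rejected, an event Ville's inequality controls at level $\alpha$; the paper's contribution is extending this style of guarantee to non-negative heavy-tailed observations with possibly infinite variance.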
Related papers
- Understanding Stochastic Natural Gradient Variational Inference [12.800664845601197]
We show a global convergence rate of $\mathcal{O}(\frac{1}{T})$, implying a non-asymptotic convergence rate for NGVI.
The rate is no worse than stochastic gradient descent (aka black-box variational inference) and has better constant dependency.
arXiv Detail & Related papers (2024-06-04T00:45:37Z)
- Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Stochastic Gradient Descent (SGD) algorithm has attracted interest due to its good performance in practice but lack of theoretical understanding.
It still remains unclear whether last-iterate convergence can be provably extended to wider composite optimization problems and non-Euclidean norms.
arXiv Detail & Related papers (2023-12-13T21:41:06Z)
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider stochastic optimization problems with heavy-tailed noise with structured density.
We show that it is possible to obtain faster convergence rates than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
- Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization [21.022322975077653]
Methods for solving continuous optimization problems with equality constraints have attracted attention recently.
Prior convergence guarantees have been limited to the convergence in expectation of a stationarity measure to zero.
New almost-sure convergence guarantees for the primal iterates, Lagrange multipliers, and stationarity measures generated by a stochastic SQP algorithm are proved.
arXiv Detail & Related papers (2023-08-07T16:03:40Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- Time-uniform central limit theory and asymptotic confidence sequences [34.00292366598841]
Confidence sequences (CS) provide valid inference at arbitrary stopping times and incur no penalties for "peeking" at the data.
CSs are nonasymptotic, enjoying finite-sample guarantees but not the aforementioned broad applicability of confidence intervals.
Asymptotic CSs forgo nonasymptotic validity for CLT-like versatility and (asymptotic) time-uniform guarantees.
arXiv Detail & Related papers (2021-03-11T05:45:35Z)
- Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance [14.06947898164194]
Heavy tails emerge in stochastic gradient descent (SGD) in various scenarios.
We provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance.
Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum.
arXiv Detail & Related papers (2021-02-20T13:45:11Z)
- The Convergence Indicator: Improved and completely characterized parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to determine whether the particles will finally converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z)
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.