A lower confidence sequence for the changing mean of non-negative right
heavy-tailed observations with bounded mean
- URL: http://arxiv.org/abs/2210.11133v1
- Date: Thu, 20 Oct 2022 09:50:05 GMT
- Title: A lower confidence sequence for the changing mean of non-negative right
heavy-tailed observations with bounded mean
- Authors: Paul Mineiro
- Abstract summary: A confidence sequence produces an adapted sequence of sets for a predictable parameter sequence with a time-parametric coverage guarantee.
This work constructs a non-asymptotic lower CS for the running average conditional expectation whose slack converges to zero.
- Score: 9.289846887298854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A confidence sequence (CS) is an anytime-valid sequential inference primitive
which produces an adapted sequence of sets for a predictable parameter sequence
with a time-uniform coverage guarantee. This work constructs a non-parametric
non-asymptotic lower CS for the running average conditional expectation whose
slack converges to zero given non-negative right heavy-tailed observations with
bounded mean. Specifically, when the variance is finite the approach dominates
the empirical Bernstein supermartingale of Howard et al.; with infinite
variance, it can adapt to a known or unknown $(1 + \delta)$-th moment bound; and
can be efficiently approximated using a sublinear number of sufficient
statistics. In certain cases this lower CS can be converted into a
closed-interval CS whose width converges to zero, e.g., any bounded
realization, or post contextual-bandit inference with bounded rewards and
unbounded importance weights. A reference implementation and example
simulations demonstrate the technique.
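The bounded-observation baseline that the paper's construction dominates can be illustrated with a minimal betting-style lower confidence sequence. The sketch below is an illustration only, not the paper's heavy-tailed method or its reference implementation: for $[0,1]$-valued observations it rejects candidate means $m$ via Ville's inequality applied to the capital process $\prod_t (1 + \lambda(X_t - m))$, which is a nonnegative supermartingale whenever the true mean is at most $m$. The fixed betting fraction `lam`, the grid resolution, and the Beta-distributed data are all assumptions made for the demo.

```python
import numpy as np

def lower_cs(xs, alpha=0.05, lam=0.5, grid=None):
    """Anytime-valid lower confidence sequence for the mean of [0,1]-valued
    observations, via a betting supermartingale and Ville's inequality.
    Illustrative bounded-case baseline, NOT the paper's heavy-tailed method."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 201)   # candidate means m
    capital = np.ones_like(grid)            # K_t(m) = prod_i (1 + lam*(x_i - m))
    rejected = np.zeros(grid.shape, dtype=bool)
    bounds = []
    for x in xs:
        # For each m with true mean <= m, K_t(m) is a nonnegative
        # supermartingale, so P(sup_t K_t(m) >= 1/alpha) <= alpha (Ville).
        capital *= 1.0 + lam * (x - grid)
        rejected |= capital >= 1.0 / alpha  # rejections persist over time
        # Running lower bound: largest candidate rejected so far (0 if none).
        bounds.append(grid[rejected].max() if rejected.any() else 0.0)
    return np.array(bounds)

rng = np.random.default_rng(0)
xs = rng.beta(2, 3, size=2000)   # true mean = 0.4
lb = lower_cs(xs, alpha=0.05)
print(lb[-1])                    # typically a bit below the true mean 0.4
```

Because the capital process is pointwise decreasing in $m$, the lower bound exceeds the true mean only if the smallest grid point above it is falsely rejected, an event Ville's inequality controls at level $\alpha$; the paper's contribution is extending this style of guarantee to non-negative heavy-tailed observations with possibly infinite variance.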
Related papers
- Understanding Stochastic Natural Gradient Variational Inference [12.800664845601197]
We show a global convergence rate of $\mathcal{O}(\frac{1}{T})$, implying a non-asymptotic convergence rate for NGVI.
The rate is no worse than stochastic gradient descent (aka black-box variational inference) and has better constant dependency.
arXiv Detail & Related papers (2024-06-04T00:45:37Z)
- Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Stochastic Gradient Descent (SGD) algorithm has attracted interest due to its good performance in practice but lack of theoretical understanding.
It still remains unclear whether last-iterate convergence can be provably extended to wider composite optimization problems and non-Euclidean norms.
arXiv Detail & Related papers (2023-12-13T21:41:06Z)
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider stochastic optimization problems with heavy-tailed noise with structured density.
We show that it is possible to obtain faster convergence rates than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
- Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization [21.022322975077653]
Methods for solving continuous optimization problems with equality constraints have attracted attention recently.
Prior convergence guarantees have been limited to the convergence in expectation of a stationarity measure to zero.
New almost-sure convergence guarantees for the primal iterates, Lagrange multipliers, and stationarity measures generated by a stochastic SQP algorithm are proved.
arXiv Detail & Related papers (2023-08-07T16:03:40Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- Time-uniform central limit theory and asymptotic confidence sequences [34.00292366598841]
Confidence sequences (CS) provide valid inference at arbitrary stopping times and incur no penalties for "peeking" at the data.
CSs are nonasymptotic, enjoying finite-sample guarantees but not the aforementioned broad applicability of confidence intervals.
Asymptotic CSs forgo nonasymptotic validity for CLT-like versatility and (asymptotic) time-uniform guarantees.
arXiv Detail & Related papers (2021-03-11T05:45:35Z)
- Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance [14.06947898164194]
Heavy tails emerge in stochastic gradient descent (SGD) in various scenarios.
We provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance.
Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum.
arXiv Detail & Related papers (2021-02-20T13:45:11Z)
- The Convergence Indicator: Improved and completely characterized parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to determine whether the particles will finally converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z)
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.