A lower confidence sequence for the changing mean of non-negative right
  heavy-tailed observations with bounded mean
        - URL: http://arxiv.org/abs/2210.11133v1
 - Date: Thu, 20 Oct 2022 09:50:05 GMT
 - Title: A lower confidence sequence for the changing mean of non-negative right
  heavy-tailed observations with bounded mean
 - Authors: Paul Mineiro
 - Abstract summary: A confidence sequence produces an adapted sequence of sets for a predictable parameter sequence with a time-uniform coverage guarantee.
This work constructs a non-asymptotic lower CS for the running average conditional expectation whose slack converges to zero.
 - Score: 9.289846887298854
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   A confidence sequence (CS) is an anytime-valid sequential inference primitive
which produces an adapted sequence of sets for a predictable parameter sequence
with a time-uniform coverage guarantee. This work constructs a non-parametric
non-asymptotic lower CS for the running average conditional expectation whose
slack converges to zero given non-negative right heavy-tailed observations with
bounded mean. Specifically, when the variance is finite the approach dominates
the empirical Bernstein supermartingale of Howard et al.; with infinite
variance, it can adapt to a known or unknown $(1 + \delta)$-th moment bound; and
can be efficiently approximated using a sublinear number of sufficient
statistics. In certain cases this lower CS can be converted into a
closed-interval CS whose width converges to zero, e.g., any bounded
realization, or post contextual-bandit inference with bounded rewards and
unbounded importance weights. A reference implementation and example
simulations demonstrate the technique.
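
As a rough, hedged illustration of the kind of object the abstract describes (not the paper's construction or its reference implementation), the sketch below builds a betting-style lower confidence sequence for i.i.d. non-negative observations whose mean is assumed to lie in $(0, 1]$. For each candidate mean $m$, the wealth $W_t(m) = \prod_{i \le t} (1 + \lambda (X_i - m))$ is a non-negative supermartingale whenever the true mean is at most $m$, so by Ville's inequality it exceeds $1/\alpha$ with probability at most $\alpha$; the lower bound at time $t$ is the largest candidate rejected so far. The fixed bet $\lambda$, the candidate grid, and the simulated data are illustrative assumptions.

```python
# Hedged sketch: betting-style lower confidence sequence for i.i.d.
# non-negative observations with mean assumed to lie in (0, 1].
# Not the paper's construction; lam, the grid, and the data are arbitrary.
import numpy as np

def lower_cs(xs, alpha=0.05, lam=0.5, grid=None):
    """Return L with L[t] a lower confidence bound after t+1 observations."""
    if grid is None:
        grid = np.linspace(1e-3, 1.0, 200)   # candidate means m in (0, 1]
    log_wealth = np.zeros_like(grid)          # log W_t(m) for each candidate m
    lower = np.zeros(len(xs))
    for t, x in enumerate(xs):
        # each factor 1 + lam * (x - m) >= 1 - lam * m >= 0 since lam <= 1, m <= 1
        log_wealth += np.log1p(lam * (x - grid))
        rejected = log_wealth >= np.log(1.0 / alpha)   # "true mean <= m" ruled out
        best = grid[rejected].max() if rejected.any() else 0.0
        lower[t] = best if t == 0 else max(lower[t - 1], best)
    return lower

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # non-negative, right heavy-tailed data with mean 0.25 (shifted Pareto, rescaled)
    xs = 0.15 * (rng.pareto(2.5, size=5000) + 1.0)
    bounds = lower_cs(xs)
    print("final lower bound:", bounds[-1], "empirical mean:", xs.mean())
```

Because only non-negativity and the mean bound are used, this bound stays valid under heavy right tails, but it lacks the variance adaptivity and sublinear sufficient-statistics properties described in the abstract.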
 
       
      
        Related papers
        - Online Covariance Estimation in Nonsmooth Stochastic Approximation [14.818683408659764]
We consider applying stochastic approximation (SA) methods to solve nonsmooth variational inclusion problems.
Our analysis establishes the best-known convergence guarantees for such statistical estimation methods.
arXiv  Detail & Related papers  (2025-02-07T20:16:51Z) - Understanding Stochastic Natural Gradient Variational Inference [12.800664845601197]
We show a global convergence rate of $\mathcal{O}(\frac{1}{T})$, implying a non-asymptotic convergence rate for NGVI.
The rate is no worse than stochastic gradient descent (aka black-box variational inference) and has better constant dependency.
arXiv  Detail & Related papers  (2024-06-04T00:45:37Z) - Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Stochastic Gradient Descent (SGD) algorithm has triggered people's interest due to its good performance in practice but the lack of theoretical understanding.
It still remains unclear whether the last-iterate convergence can be provably extended to wider composite optimization and non-Euclidean norms.
arXiv  Detail & Related papers  (2023-12-13T21:41:06Z) - Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization   Problems [56.86067111855056]
We consider clipped stochastic optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance (a minimal gradient-clipping sketch appears after this list).
arXiv  Detail & Related papers  (2023-11-07T17:39:17Z) - Almost-sure convergence of iterates and multipliers in stochastic
  sequential quadratic optimization [21.022322975077653]
Stochastic methods for solving continuous optimization problems with equality constraints have attracted attention recently.
Existing convergence guarantees have been limited to the convergence in expectation of a stationarity measure to zero.
New almost-sure convergence guarantees for the primal iterates, Lagrange multipliers, and stationarity measures generated by a stochastic SQP algorithm are proved.
arXiv  Detail & Related papers  (2023-08-07T16:03:40Z) - High-Probability Bounds for Stochastic Optimization and Variational
  Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv  Detail & Related papers  (2023-02-02T10:37:23Z) - Kernel-based off-policy estimation without overlap: Instance optimality
  beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv  Detail & Related papers  (2023-01-16T02:57:37Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth
  Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv  Detail & Related papers  (2021-06-30T18:32:46Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
  Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, along with variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv  Detail & Related papers  (2021-06-30T17:51:36Z) - Time-uniform central limit theory and asymptotic confidence sequences [34.00292366598841]
Confidence sequences (CS) provide valid inference at arbitrary stopping times and incur no penalties for "peeking" at the data.
CSs are nonasymptotic, enjoying finite-sample guarantees but not the aforementioned broad applicability of confidence intervals.
Asymptotic CSs forgo nonasymptotic validity for CLT-like versatility and (asymptotic) time-uniform guarantees.
arXiv  Detail & Related papers  (2021-03-11T05:45:35Z) - Convergence Rates of Stochastic Gradient Descent under Infinite Noise
  Variance [14.06947898164194]
Heavy tails emerge in stochastic gradient descent (SGD) in various scenarios.
We provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance.
Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum.
arXiv  Detail & Related papers  (2021-02-20T13:45:11Z) - The Convergence Indicator: Improved and completely characterized
  parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to calculate whether the particles will finally converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv  Detail & Related papers  (2020-06-06T19:08:05Z) - On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and
  Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv  Detail & Related papers  (2020-04-09T17:54:18Z) 
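
As flagged in the heavy-tailed noise entry above, the following is a minimal, hedged sketch of the general gradient-clipping idea for SGD under heavy-tailed gradient noise. It is not the construction of that paper: the quadratic objective, Student-t noise, clipping threshold, and step size are all illustrative assumptions.

```python
# Hedged sketch: clipped SGD on a quadratic objective with heavy-tailed
# gradient noise.  Clipping the stochastic gradient norm keeps every update
# bounded even when the noise variance is infinite; all constants are arbitrary.
import numpy as np

def clipped_sgd(grad, x0, steps=2000, lr=0.05, clip=10.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        # Student-t noise with 1.5 degrees of freedom has infinite variance
        g = grad(x) + rng.standard_t(df=1.5, size=x.shape)
        norm = np.linalg.norm(g)
        if norm > clip:                      # rescale so the norm is at most `clip`
            g = g * (clip / norm)
        x -= lr * g
    return x

if __name__ == "__main__":
    # f(x) = 0.5 * ||x||^2 has gradient x and minimizer at the origin
    x_final = clipped_sgd(grad=lambda x: x, x0=np.ones(5))
    print("distance to optimum:", np.linalg.norm(x_final))
```

Clipping is the mechanism the clipped-methods literature builds on; the faster rates in the entry above additionally exploit structure in the noise density.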
        This list is automatically generated from the titles and abstracts of the papers on this site.
       
     
           This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.