A Note on High-Probability Analysis of Algorithms with Exponential,
Sub-Gaussian, and General Light Tails
- URL: http://arxiv.org/abs/2403.02873v1
- Date: Tue, 5 Mar 2024 11:38:20 GMT
- Title: A Note on High-Probability Analysis of Algorithms with Exponential,
Sub-Gaussian, and General Light Tails
- Authors: Amit Attia, Tomer Koren
- Abstract summary: We show that the analysis of such an algorithm can be reduced, in a black-box manner and with only a small loss in logarithmic factors, to the analysis of a simpler variant that uses bounded random variables.
This approach simultaneously applies to any light-tailed randomization, including exponential, sub-Gaussian, and more general fast-decaying distributions.
- Score: 34.465259431606405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This short note describes a simple technique for analyzing probabilistic
algorithms that rely on a light-tailed (but not necessarily bounded) source of
randomization. We show that the analysis of such an algorithm can be reduced,
in a black-box manner and with only a small loss in logarithmic factors, to an
analysis of a simpler variant of the same algorithm that uses bounded random
variables and is often easier to analyze. This approach simultaneously applies to
any light-tailed randomization, including exponential, sub-Gaussian, and more
general fast-decaying distributions, without needing to appeal to specialized
concentration inequalities. Analyses of a generalized Azuma inequality and
stochastic optimization with general light-tailed noise are provided to
illustrate the technique.
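To give the flavor of the reduction (a standard union-bound calculation, offered as an illustration rather than a reproduction of the note's argument): if X_1, ..., X_T are sigma-sub-Gaussian, then

    \Pr\Big(\max_{1\le t\le T}|X_t| > \sigma\sqrt{2\log(2T/\delta)}\Big)
      \;\le\; \sum_{t=1}^{T} 2\exp\big(-\log(2T/\delta)\big) \;=\; \delta,

so with probability at least 1 - delta every realization is bounded by B = sigma*sqrt(2 log(2T/delta)), and an analysis that assumes boundedness by B can be invoked on this event at the price of the logarithmic factor.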
Related papers
- Understanding the Generalization Error of Markov algorithms through Poissonization [25.946717717350047]
We introduce a generic framework for analyzing the generalization error of Markov algorithms.
We first develop a novel entropy flow, which directly leads to PAC-Bayesian generalization bounds.
We then draw novel links to modified versions of the celebrated logarithmic Sobolev inequalities (LSI).
arXiv Detail & Related papers (2025-02-11T14:31:32Z)
- Bregman-divergence-based Arimoto-Blahut algorithm [53.64687146666141]
We generalize the Arimoto-Blahut algorithm to a general function defined over a Bregman-divergence system.
We propose a convex-optimization-free algorithm that can be applied to classical and quantum rate-distortion theory.
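For orientation, the classical channel-capacity iteration that the paper generalizes can be sketched in a few lines of NumPy; this is the textbook Blahut-Arimoto method, not the paper's Bregman-divergence variant:

    import numpy as np

    def kl_rows(P, q):
        """KL(P(.|x) || q) for each row x, with the convention 0 log 0 = 0."""
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(P > 0, P * np.log(P / q), 0.0)
        return terms.sum(axis=1)

    def blahut_arimoto(P, iters=500, tol=1e-12):
        """Capacity (in nats) of a discrete memoryless channel P[x, y] = P(y|x)."""
        p = np.full(P.shape[0], 1.0 / P.shape[0])   # start from the uniform input law
        for _ in range(iters):
            D = kl_rows(P, p @ P)                   # information density per input
            p_new = p * np.exp(D)                   # exponential reweighting step
            p_new /= p_new.sum()
            if np.abs(p_new - p).max() < tol:
                p = p_new
                break
            p = p_new
        return float(p @ kl_rows(P, p @ P)), p      # mutual information at p

    # Binary symmetric channel with crossover 0.1: C = ln 2 - H(0.1) ~ 0.368 nats
    C, p_star = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))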
arXiv Detail & Related papers (2024-08-10T06:16:24Z)
- Fast convergence of the Expectation Maximization algorithm under a logarithmic Sobolev inequality [1.1510009152620668]
We extend an analysis technique commonly employed to understand alternating minimization algorithms on Euclidean space to the Expectation Maximization (EM) algorithm.
We obtain finite sample error bounds and exponential convergence of the EM algorithm under a natural generalisation of a log-Sobolev inequality.
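The alternating-minimization view invoked here is the standard free-energy decomposition of EM (stated as a sketch in common notation; the log-Sobolev machinery itself is not reproduced):

    F(q, \theta) = \mathbb{E}_{q(z)}\big[-\log p(x, z; \theta)\big] - H(q)
                 = -\log p(x; \theta) + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x; \theta)\big),

so the E-step q^{k+1} = p(z | x; theta^k) minimizes F over q, the M-step minimizes F over theta, and EM decreases F exactly as an alternating-minimization scheme.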
arXiv Detail & Related papers (2024-07-25T11:08:53Z)
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
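As a concrete instance of the method under analysis, here is a minimal SPPM sketch for least-squares components f_i(x) = (a_i^T x - b_i)^2 / 2, where the proximal step has a closed form (an illustration only, not the paper's general non-smooth setting):

    import numpy as np

    def sppm_least_squares(A, b, gamma=1.0, steps=2000, seed=0):
        """Stochastic proximal point: x <- argmin_z f_i(z)
        + ||z - x||^2 / (2 * gamma) for a uniformly random row i."""
        gen = np.random.default_rng(seed)
        x = np.zeros(A.shape[1])
        for _ in range(steps):
            i = gen.integers(A.shape[0])
            a, r = A[i], A[i] @ x - b[i]                        # sampled residual
            x = x - (gamma * r / (1.0 + gamma * (a @ a))) * a   # closed-form prox
        return x

    # On a consistent system the iterates approach the interpolating solution
    gen = np.random.default_rng(1)
    A = gen.standard_normal((200, 5))
    x_true = gen.standard_normal(5)
    x_hat = sppm_least_squares(A, A @ x_true)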
arXiv Detail & Related papers (2024-05-24T21:09:19Z)
- A general error analysis for randomized low-rank approximation with application to data assimilation [42.57210316104905]
We propose a framework for the analysis of the low-rank approximation error in Frobenius norm for centered and non-standard matrices.
Under minimal assumptions, we derive accurate bounds in expectation and probability.
Our bounds have clear interpretations that enable us to derive properties and motivate practical choices.
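A minimal example of the kind of randomized low-rank approximation being analyzed is the Gaussian-sketch rangefinder (a standard construction, shown here only to make the object of the error analysis concrete):

    import numpy as np

    def randomized_low_rank(A, rank, oversample=10, seed=0):
        """Return Q, B with A ~= Q @ B, where Q has rank + oversample
        orthonormal columns spanning the range of a Gaussian sketch of A."""
        gen = np.random.default_rng(seed)
        Omega = gen.standard_normal((A.shape[1], rank + oversample))
        Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis for range(A @ Omega)
        return Q, Q.T @ A

    # The Frobenius-norm error ||A - Q B||_F is the quantity bounded in the paper
    gen = np.random.default_rng(2)
    A = gen.standard_normal((300, 15)) @ gen.standard_normal((15, 200))
    Q, B = randomized_low_rank(A, rank=15)
    err = np.linalg.norm(A - Q @ B)         # ~ 0 for this exactly rank-15 matrix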
arXiv Detail & Related papers (2024-05-08T04:51:56Z)
- Connection between single-layer Quantum Approximate Optimization Algorithm interferometry and thermal distributions sampling [0.0]
We extend the theoretical derivation of the amplitudes of the eigenstates and of the Boltzmann distributions generated by single-layer QAOA.
We also review the implications that this behavior has from both a practical and fundamental perspective.
arXiv Detail & Related papers (2023-10-13T15:06:58Z)
- Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z)
- Quantization-Based Optimization: Alternative Stochastic Approximation of Global Optimization [0.0]
We propose a global optimization algorithm based on quantizing the energy level of an objective function in an NP-hard problem.
Numerical experiments show that the proposed algorithm outperforms conventional learning methods in solving NP-hard optimization problems.
arXiv Detail & Related papers (2022-11-08T03:01:45Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
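The local Lipschitz regularity that these bounds depend on is easy to probe numerically; a toy finite-sample estimator around a single input (illustrative only, not the paper's construction):

    import numpy as np

    def local_lipschitz(f, x, radius=0.1, n_pairs=200, seed=0):
        """Empirical local Lipschitz estimate of a scalar function f near x:
        max over sampled pairs of |f(u) - f(v)| / ||u - v||."""
        gen = np.random.default_rng(seed)
        best = 0.0
        for _ in range(n_pairs):
            u = x + radius * gen.standard_normal(x.shape)
            v = x + radius * gen.standard_normal(x.shape)
            best = max(best, abs(f(u) - f(v)) / (np.linalg.norm(u - v) + 1e-12))
        return best

    # f(z) = sin(w . z) is ||w||-Lipschitz, so the estimate stays below 5.0
    w = np.array([3.0, -4.0])
    est = local_lipschitz(lambda z: np.sin(w @ z), np.zeros(2))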
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Understanding the Generalization Performance of Spectral Clustering Algorithms [11.025579607812167]
We study the excess risk bounds of the popular spectral clustering algorithms: relaxed RatioCut and relaxed NCut.
We propose two novel algorithms that can not only penalize this quantity, but also cluster the out-of-sample data without re-eigendecomposition on the overall sample.
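For reference, the relaxed-NCut pipeline this entry refers to is the standard normalized spectral clustering recipe (a textbook sketch, not the paper's new penalized or out-of-sample algorithms):

    import numpy as np
    from scipy.cluster.vq import kmeans2

    def spectral_clustering(X, k, sigma=1.0):
        """Normalized spectral clustering (relaxed NCut): Gaussian affinity,
        normalized Laplacian, k smallest eigenvectors, then k-means."""
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        W = np.exp(-sq / (2 * sigma ** 2))           # Gaussian affinity matrix
        np.fill_diagonal(W, 0.0)
        d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
        L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
        _, vecs = np.linalg.eigh(L_sym)              # eigenvalues in ascending order
        U = vecs[:, :k]
        U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
        _, labels = kmeans2(U, k, minit="++", seed=0)
        return labels

    # Two well-separated blobs are recovered cleanly
    gen = np.random.default_rng(0)
    X = np.vstack([gen.normal(0, 0.3, (30, 2)), gen.normal(3, 0.3, (30, 2))])
    labels = spectral_clustering(X, k=2)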
arXiv Detail & Related papers (2022-04-30T14:21:56Z)
- First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems [88.58409977434269]
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs).
Our contribution is to provide two simple first-order algorithmic frameworks based on the quadratic penalty method and the augmented Lagrangian method.
We provide nonasymptotic theoretical guarantees for these algorithms.
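The first framework can be illustrated on a generic equality-constrained problem; a minimal quadratic-penalty sketch (gradient descent on the penalized objective under an increasing penalty schedule, purely illustrative and unrelated to the NGNEP structure):

    import numpy as np

    def quadratic_penalty(grad_f, c, grad_c, x0,
                          rhos=(1.0, 10.0, 100.0, 1000.0), lr=1e-4, inner=5000):
        """Approximately solve min f(x) s.t. c(x) = 0 by descending
        f(x) + (rho / 2) * c(x)**2 for increasing rho.  The step size
        must stay small relative to rho for stability."""
        x = np.asarray(x0, dtype=float)
        for rho in rhos:
            for _ in range(inner):
                x = x - lr * (grad_f(x) + rho * c(x) * grad_c(x))
        return x

    # min x1^2 + x2^2  s.t.  x1 + x2 = 1   has solution (0.5, 0.5)
    x = quadratic_penalty(grad_f=lambda x: 2 * x,
                          c=lambda x: x[0] + x[1] - 1.0,
                          grad_c=lambda x: np.ones(2),
                          x0=np.zeros(2))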
arXiv Detail & Related papers (2022-04-07T00:11:05Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization [44.88645911638269]
Independence-driven importance weighting algorithms from the stable learning literature have shown empirical effectiveness.
In this paper, we theoretically prove the effectiveness of such algorithms by explaining them as feature selection processes.
We prove that, under ideal conditions, independence-driven importance weighting algorithms can identify the variables in the minimal stable variable set.
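For contrast, classical importance weighting for covariate shift uses an explicit density ratio; a toy one-dimensional sketch with known train/test densities (the independence-driven algorithms analyzed in the paper avoid this ratio):

    import numpy as np
    from scipy.stats import norm

    # Covariate shift: train X ~ N(0, 1), test X ~ N(1, 1); target E_test[X^2] = 2
    gen = np.random.default_rng(0)
    x_train = gen.standard_normal(100_000)
    w = norm.pdf(x_train, loc=1.0) / norm.pdf(x_train, loc=0.0)  # p_test / p_train
    est = np.mean(w * x_train ** 2)     # importance-weighted estimate, close to 2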
arXiv Detail & Related papers (2021-11-03T17:18:49Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis [14.863027146736696]
Compared to worst-case analysis, average-case analysis is more representative of the typical behavior of an algorithm.
We show that, for a class of large-scale problems trained with first-order methods, the halting time is predictable and essentially independent of the input distribution.
Numerical simulations suggest this universality property holds for a more general class of algorithms and problems.
arXiv Detail & Related papers (2020-06-08T00:47:24Z)
- Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts.
Several works have pursued rigorous analytical justification for these observations in nonconvex optimization problems.
A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms to escape saddle points.
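The role of perturbations is easy to demonstrate: at an exact strict saddle the gradient vanishes and plain descent is stuck, while a small random kick lets the iterate escape (a toy sketch, not any specific algorithm from these works):

    import numpy as np

    def perturbed_gd(grad, x0, lr=0.1, steps=100, eps=1e-3, noise=1e-2, seed=0):
        """Gradient descent that injects a small random perturbation
        whenever the gradient norm is nearly zero."""
        gen = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(steps):
            g = grad(x)
            if np.linalg.norm(g) < eps:
                x = x + noise * gen.standard_normal(x.shape)   # random kick
            else:
                x = x - lr * g
        return x

    # f(x, y) = x^2 - y^2 has a strict saddle at the origin; plain GD started
    # there never moves, but the perturbed variant drifts away along the y-axis
    x = perturbed_gd(lambda z: np.array([2 * z[0], -2 * z[1]]), np.zeros(2))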
arXiv Detail & Related papers (2020-03-31T16:54:22Z)
- Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives [97.16266088683061]
The article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms.
It provides a characterization of algorithms that exhibit accelerated convergence.
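The value of symplectic schemes is visible on the simplest Hamiltonian system: on a harmonic oscillator, explicit Euler inflates the energy geometrically while symplectic Euler keeps it bounded (a toy numerical sketch, independent of the article's optimizers):

    def oscillator_energy(symplectic, h=0.1, steps=1000):
        """Integrate H(q, p) = (q^2 + p^2) / 2 and return the final energy."""
        q, p = 1.0, 0.0
        for _ in range(steps):
            if symplectic:            # symplectic Euler: update p, then q with new p
                p = p - h * q
                q = q + h * p
            else:                     # explicit Euler: both updates use old values
                q, p = q + h * p, p - h * q
        return 0.5 * (q * q + p * p)

    # Explicit Euler multiplies the energy by (1 + h^2) per step; symplectic
    # Euler oscillates near the true value 0.5
    bad, good = oscillator_energy(False), oscillator_energy(True)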
arXiv Detail & Related papers (2020-02-28T00:32:47Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
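The simplest member of this family is sketch-and-solve with a Gaussian projection; a minimal sketch (illustrative; the paper's optimal methods are iterative refinements of such embeddings):

    import numpy as np

    def sketched_lstsq(A, b, sketch_rows, seed=0):
        """Compress an n x d least-squares problem with a Gaussian projection
        S (sketch_rows x n), then solve the small sketched problem."""
        gen = np.random.default_rng(seed)
        S = gen.standard_normal((sketch_rows, A.shape[0])) / np.sqrt(sketch_rows)
        x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
        return x

    gen = np.random.default_rng(1)
    A = gen.standard_normal((10_000, 50))
    b = A @ gen.standard_normal(50) + 0.01 * gen.standard_normal(10_000)
    x_hat = sketched_lstsq(A, b, sketch_rows=500)   # near the exact LS solution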
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.