A Symplectic Analysis of Alternating Mirror Descent
- URL: http://arxiv.org/abs/2405.03472v2
- Date: Tue, 28 May 2024 18:39:53 GMT
- Title: A Symplectic Analysis of Alternating Mirror Descent
- Authors: Jonas Katona, Xiuyuan Wang, Andre Wibisono
- Abstract summary: We study the discretization of continuous-time Hamiltonian flow via the symplectic Euler method.
We derive new error bounds on the modified Hamiltonian when its expansion is truncated at a given order in the stepsize, in terms of the number of iterations $K$.
We propose a conjecture which, if true, would imply that the total regret for AMD scales as $\mathcal{O}\left(K^{\varepsilon}\right)$ and the duality gap of the average iterates as $\mathcal{O}\left(K^{-1+\varepsilon}\right)$ for any $\varepsilon>0$.
- Score: 11.117320730408272
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by understanding the behavior of the Alternating Mirror Descent (AMD) algorithm for bilinear zero-sum games, we study the discretization of continuous-time Hamiltonian flow via the symplectic Euler method. We provide a framework for analysis using results from Hamiltonian dynamics, Lie algebra, and symplectic numerical integrators, with an emphasis on the existence and properties of a conserved quantity, the modified Hamiltonian (MH), for the symplectic Euler method. We compute the MH in closed-form when the original Hamiltonian is a quadratic function, and show that it generally differs from the other conserved quantity known previously in that case. We derive new error bounds on the MH when truncated at orders in the stepsize in terms of the number of iterations, $K$, and use these bounds to show an improved $\mathcal{O}(K^{1/5})$ total regret bound and an $\mathcal{O}(K^{-4/5})$ duality gap of the average iterates for AMD. Finally, we propose a conjecture which, if true, would imply that the total regret for AMD scales as $\mathcal{O}\left(K^{\varepsilon}\right)$ and the duality gap of the average iterates as $\mathcal{O}\left(K^{-1+\varepsilon}\right)$ for any $\varepsilon>0$, and we can take $\varepsilon=0$ upon certain convergence conditions for the MH.
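To make the setting concrete, here is a minimal NumPy sketch of AMD with entropic mirror maps for a bilinear game on probability simplices; the step size, payoff matrix, and function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def softmax(z):
    # Entropic mirror map: sends a dual vector to a point on the simplex.
    e = np.exp(z - z.max())
    return e / e.sum()

def alternating_mirror_descent(A, K, eta):
    # AMD for the bilinear zero-sum game min_x max_y x^T A y on simplices.
    # The x-player steps first and the y-player responds to the *updated* x;
    # this alternation is what makes the dual dynamics a symplectic Euler
    # discretization of a Hamiltonian flow.
    m, n = A.shape
    u, v = np.zeros(m), np.zeros(n)   # dual iterates
    y = softmax(v)
    x_avg, y_avg = np.zeros(m), np.zeros(n)
    for _ in range(K):
        u -= eta * (A @ y)            # minimizing player's dual step
        x = softmax(u)
        v += eta * (A.T @ x)          # maximizing player's dual step vs. new x
        y = softmax(v)
        x_avg += x / K
        y_avg += y / K
    return x_avg, y_avg

# Duality gap of the average iterates for a random payoff matrix:
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
x_bar, y_bar = alternating_mirror_descent(A, K=10_000, eta=0.1)
print("duality gap:", (A.T @ x_bar).max() - (A @ y_bar).min())
```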
Related papers
- Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $\pi^*$ can be non-exponential and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Unbiased constrained sampling with Self-Concordant Barrier Hamiltonian Monte Carlo [18.14591309607824]
Barrier Hamiltonian Monte Carlo (BHMC) is a version of the HMC algorithm which aims at sampling from a Gibbs distribution $\pi$ on a manifold $\mathrm{M}$.
We propose a new filter step, called "involution checking step", to address this problem.
Our main results establish that these two new algorithms generate reversible Markov chains with respect to $pi$ and do not suffer from any bias in comparison to previous implementations.
arXiv Detail & Related papers (2022-10-21T12:56:07Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Hamiltonian simulation with random inputs [74.82351543483588]
We develop a theory of the average-case performance of Hamiltonian simulation with random initial states.
Numerical evidence suggests that this theory accurately characterizes the average error for concrete models.
arXiv Detail & Related papers (2021-11-08T19:08:42Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Mixing Time Guarantees for Unadjusted Hamiltonian Monte Carlo [1.14219428942199]
We provide quantitative upper bounds on the total variation mixing time of the Markov chain corresponding to the unadjusted Hamiltonian Monte Carlo (uHMC) algorithm.
For two general classes of models and fixed time discretization step size $h$, the mixing time is shown to depend only logarithmically on the dimension.
We show that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in total variation distance can be achieved by uHMC.
arXiv Detail & Related papers (2021-05-03T14:13:47Z) - Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals [4.416484585765028]
We study the convergence of estimating the 2-Sinkhorn divergence between Gaussian processes (GPs) using their finite-dimensional marginal distributions.
We show almost sure convergence of the divergence when the marginals are sampled according to some base measure.
arXiv Detail & Related papers (2021-02-05T16:17:55Z) - Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics (a minimal sketch of the unadjusted Langevin update analyzed in these works appears after this list).
arXiv Detail & Related papers (2020-07-22T18:18:28Z)
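Several of the entries above concern Langevin Monte Carlo. For reference, here is a minimal sketch of the generic unadjusted Langevin update; it is not any one paper's algorithm, and the function and parameter names are illustrative.

```python
import numpy as np

def ula(grad_U, x0, h, n_iters, rng):
    # Unadjusted Langevin algorithm targeting pi proportional to exp(-U):
    #   x_{k+1} = x_k - h * grad U(x_k) + sqrt(2h) * N(0, I).
    # No Metropolis correction is applied, so the chain carries a stepsize-
    # dependent discretization bias -- the quantity such analyses bound.
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_iters, x.size))
    for k in range(n_iters):
        x = x - h * grad_U(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# Example: standard Gaussian target, U(x) = ||x||^2 / 2, so grad U(x) = x.
rng = np.random.default_rng(0)
chain = ula(grad_U=lambda x: x, x0=np.zeros(2), h=0.05, n_iters=5_000, rng=rng)
print(chain.mean(axis=0), chain.var(axis=0))  # roughly [0, 0] and [1, 1]
```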