Stability and Generalization for Markov Chain Stochastic Gradient
Methods
- URL: http://arxiv.org/abs/2209.08005v1
- Date: Fri, 16 Sep 2022 15:42:51 GMT
- Title: Stability and Generalization for Markov Chain Stochastic Gradient
Methods
- Authors: Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou
- Abstract summary: We provide a comprehensive generalization analysis of MC-SGMs for both minimization and minimax problems.
We establish the optimal excess population risk bounds for both smooth and non-smooth cases.
We develop the first nearly optimal convergence rates for convex-concave problems both in expectation and with high probability.
- Score: 49.981789906200035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, there has been a large amount of work devoted to the study of Markov
chain stochastic gradient methods (MC-SGMs), which mainly focuses on their convergence
analysis for solving minimization problems. In this paper, we provide a
comprehensive generalization analysis of MC-SGMs for both minimization and
minimax problems through the lens of algorithmic stability in the framework of
statistical learning theory. For empirical risk minimization (ERM) problems, we
establish the optimal excess population risk bounds for both smooth and
non-smooth cases by introducing on-average argument stability. For minimax
problems, we develop a quantitative connection between on-average argument
stability and generalization error which extends the existing results for
uniform stability \cite{lei2021stability}. We further develop the first nearly
optimal convergence rates for convex-concave problems both in expectation and
with high probability, which, combined with our stability results, show that
the optimal generalization bounds can be attained for both smooth and
non-smooth cases. To the best of our knowledge, this is the first
generalization analysis of SGMs when the gradients are sampled from a Markov
process.
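To make the abstract's two key ingredients concrete, the following sketches are illustrative only and are not taken from the paper. First, one standard form of $\ell_2$ on-average argument stability (in the sense of Lei and Ying, 2020; the paper's exact variant may differ): an algorithm $A$ is on-average argument stable with parameter $\epsilon$ if, for a training set $S$ of $n$ examples and the neighboring sets $S^{(i)}$ obtained by replacing the $i$-th example,
$$\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}\big[\|A(S)-A(S^{(i)})\|_2^2\big]\le \epsilon^2 .$$
Second, a minimal Python sketch of a Markov chain SGM for an ERM-style minimization problem: instead of drawing a fresh uniform index at every step, the index follows a Markov chain over the training set. The lazy random walk on a ring used here is chosen purely for illustration (its stationary distribution is uniform); the paper's analysis covers general ergodic chains and does not prescribe this particular one, and the function names (`mc_sgd`, `grad`) are hypothetical.

```python
import numpy as np

def mc_sgd(grad, n, dim, steps, lr=0.1, p_stay=0.5, seed=0):
    """Markov chain SGD: gradients are evaluated along a Markov chain over
    the n training indices instead of i.i.d. uniform sampling. The chain
    here is a lazy random walk on a ring over {0, ..., n-1}, whose
    stationary distribution is uniform (illustrative assumption only)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)              # model parameters
    avg = np.zeros(dim)            # averaged iterate, a common output for convex problems
    i = rng.integers(n)            # initial state of the chain
    for t in range(1, steps + 1):
        g = grad(w, i)             # gradient of the loss on example i at w
        w = w - lr / np.sqrt(t) * g
        avg += (w - avg) / t
        # one Markov transition: stay put, or move to a neighbour on the ring
        u = rng.random()
        if u < p_stay:
            pass
        elif u < p_stay + (1.0 - p_stay) / 2:
            i = (i + 1) % n
        else:
            i = (i - 1) % n
    return avg

# Toy usage: least squares on synthetic data.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X, y = rng.normal(size=(200, 5)), rng.normal(size=200)
    grad = lambda w, i: (X[i] @ w - y[i]) * X[i]
    print(mc_sgd(grad, n=200, dim=5, steps=5000))
```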
Related papers
- Multiple Greedy Quasi-Newton Methods for Saddle Point Problems [0.0]
We introduce the MGSR1-SP method, a multiple greedy quasi-Newton method for solving saddle point problems.
We show that our method significantly improves both stability and efficiency.
Results affirm the performance of MGSR1-SP across a broad spectrum of machine learning applications.
arXiv Detail & Related papers (2024-08-01T02:40:37Z) - Stability and Generalization of the Decentralized Stochastic Gradient
Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze how different factors, including the network topology, affect the generalization of D-SGDA.
We also balance the optimization error with the generalization error to obtain optimal population risk bounds in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z) - Stability and Generalization of Stochastic Compositional Gradient
Descent Algorithms [61.59448949684493]
We provide a stability and generalization analysis of stochastic compositional gradient descent algorithms built from training examples.
We establish the uniform stability results for two popular compositional gradient descent algorithms, namely SCGD and SCSC.
We derive dimension-independent excess risk bounds for SCGD and SCSC by trading off their stability results and optimization errors.
arXiv Detail & Related papers (2023-07-07T02:40:09Z) - On the Global Solution of Soft k-Means [159.23423824953412]
This paper presents an algorithm to solve the Soft k-Means problem globally.
A new model, named Minimal Volume Soft kMeans (MVSkM), is proposed to address the non-uniqueness issue of solutions.
arXiv Detail & Related papers (2022-12-07T12:06:55Z) - Stability and Generalization of Stochastic Optimization with Nonconvex
and Nonsmooth Problems [34.68590236021379]
This paper introduces systematic algorithmic stability measures and relates them to the gap between empirical and population gradients.
We show how these results apply to gradient methods with implicit regularization through the choice of iteration numbers and step sizes, as well as to adaptive gradient descent.
arXiv Detail & Related papers (2022-06-14T18:14:30Z) - Stability and Generalization of Stochastic Gradient Methods for Minimax
Problems [71.60601421935844]
Many machine learning problems can be formulated as minimax problems, such as Generative Adversarial Networks (GANs).
We provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems.
arXiv Detail & Related papers (2021-05-08T22:38:00Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)