What is a Good Metric to Study Generalization of Minimax Learners?
- URL: http://arxiv.org/abs/2206.04502v1
- Date: Thu, 9 Jun 2022 13:39:06 GMT
- Title: What is a Good Metric to Study Generalization of Minimax Learners?
- Authors: Asuman Ozdaglar, Sarath Pattathil, Jiawei Zhang, Kaiqing Zhang
- Abstract summary: Minimax optimization has served as the backbone of many machine learning (ML) problems.
How the solution trained on empirical data performs on unseen testing data has been relatively underexplored.
We propose a new metric to study generalization of minimax learners, the primal gap, to address these issues.
- Score: 24.577243536475233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimax optimization has served as the backbone of many machine learning (ML)
problems. Although the convergence behavior of optimization algorithms has been
extensively studied in minimax settings, their generalization guarantees in the
stochastic setting, i.e., how the solution trained on empirical data performs
on the unseen testing data, have been relatively underexplored. A fundamental
question remains elusive: What is a good metric to study generalization of
minimax learners? In this paper, we aim to answer this question by first
showing that primal risk, a universal metric to study generalization in
minimization, fails in simple examples of minimax problems. Furthermore,
another popular metric, the primal-dual risk, also fails to characterize the
generalization behavior for minimax problems with nonconvexity, due to
non-existence of saddle points. We thus propose a new metric to study
generalization of minimax learners: the primal gap, to circumvent these issues.
Next, we derive generalization bounds for the primal gap in nonconvex-concave
settings. As byproducts of our analysis, we also solve two open questions:
establishing generalization bounds for primal risk and primal-dual risk in the
strong sense, i.e., without strong concavity or assuming that the maximization
and expectation can be interchanged, while either of these assumptions was
needed in the literature. Finally, we leverage this new metric to compare the
generalization behavior of two popular algorithms -- gradient descent-ascent
(GDA) and gradient descent-max (GDMax) in stochastic minimax optimization.
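To make the GDA vs. GDMax comparison concrete, below is a minimal, hypothetical sketch (not the paper's experiments) on a toy stochastic minimax problem min_x max_y E[f(x, y; a)]. GDA takes simultaneous stochastic gradient steps in x and y, while GDMax best-responds in y on the training data before each descent step in x; the learned x is then scored by its primal value on held-out data, in the spirit of the primal-gap metric. The objective, step sizes, and sample sizes are illustrative assumptions.
```python
# Hypothetical toy comparison of GDA vs. GDMax (illustrative assumptions only;
# the objective, step sizes, and sample sizes are NOT taken from the paper).
# Stochastic objective: f(x, y; a) = y * (a - x) - 0.5 * y**2, so the
# population primal function is Phi(x) = max_y E[f] = 0.5 * (E[a] - x)**2.
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=1.0, scale=1.0, size=200)    # empirical data S
test = rng.normal(loc=1.0, scale=1.0, size=10_000)  # proxy for the population

def primal(x, data):
    # Empirical primal function: max_y (1/n) sum_i f(x, y; a_i) = 0.5 * (mean(a) - x)^2
    return 0.5 * (data.mean() - x) ** 2

def gda(steps=2000, lr_x=0.01, lr_y=0.01):
    # Gradient descent-ascent: simultaneous stochastic updates on x and y.
    x, y = 0.0, 0.0
    for _ in range(steps):
        a = rng.choice(train)
        gx = -y           # d f / d x
        gy = (a - x) - y  # d f / d y
        x, y = x - lr_x * gx, y + lr_y * gy
    return x

def gdmax(steps=2000, lr_x=0.01):
    # Gradient descent-max: best-respond in y on the training set, then step in x.
    x = 0.0
    for _ in range(steps):
        y_star = train.mean() - x     # exact inner maximizer for this toy objective
        x = x - lr_x * (-y_star)      # descent step in x at (x, y_star)
    return x

for name, x_out in [("GDA", gda()), ("GDMax", gdmax())]:
    # Primal-gap-style evaluation: excess primal value on held-out data
    # (for this toy problem, min_x primal(x, test) = 0).
    print(f"{name}: x = {x_out:.3f}, test primal gap = {primal(x_out, test):.4f}")
```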
Related papers
- Towards Sharper Risk Bounds for Minimax Problems [23.380477456114118]
Minimax problems have achieved success in machine learning applications such as adversarial training, robust optimization, and reinforcement learning.
For theoretical analysis, current optimal excess risk bounds are composed of the generalization error and present 1/n-rates in strongly-convex-strongly-concave (SC-SC) settings.
We analyze some popular algorithms such as empirical saddle point (ESP), gradient descent ascent (GDA), and stochastic gradient descent ascent (SGDA).
We derive results that are n times faster than existing results for minimax problems.
arXiv Detail & Related papers (2024-10-11T03:50:23Z)
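For context, a standard decomposition that excess-risk analyses of this kind rest on is sketched below; it is stated here for plain minimization (the minimax analogue replaces R with the primal function), using that the expected empirical minimum is at most the population minimum.
```latex
% Excess risk of the learned \hat{w} splits into generalization and optimization
% terms, since E[min_w R_S(w)] <= min_w R(w):
\[
\mathbb{E}\big[R(\hat{w})\big] - \min_{w} R(w)
\;\le\;
\underbrace{\mathbb{E}\big[R(\hat{w}) - R_S(\hat{w})\big]}_{\text{generalization error}}
\;+\;
\underbrace{\mathbb{E}\big[R_S(\hat{w}) - \min_{w} R_S(w)\big]}_{\text{optimization error}}
\]
```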
- Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check, as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z)
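For reference, a strict saddle point of a smooth function f, the kind of point such avoidance results are about, is a critical point with at least one direction of strictly negative curvature; a sketch in standard notation:
```latex
% x* is a strict saddle point of f if it is a critical point whose Hessian
% has a strictly negative eigenvalue (so an escape direction exists):
\[
\nabla f(x^{*}) = 0
\quad\text{and}\quad
\lambda_{\min}\!\big(\nabla^{2} f(x^{*})\big) < 0
\]
```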
- Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems [46.71914490521821]
This paper introduces a new extragradient-type algorithm for a class of nonconvex-nonconcave minimax problems.
The proposed algorithm is applicable to constrained and regularized problems.
arXiv Detail & Related papers (2023-02-20T08:37:08Z)
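For orientation, the classical projected extragradient template that extragradient-type methods for min_x max_y f(x, y) build on is sketched below; the paper's specific modification for nonconvex-nonconcave problems may differ.
```latex
% z = (x, y), V(z) = (\nabla_x f(x,y), -\nabla_y f(x,y)), Z the constraint set,
% \eta the step size: an extrapolation step followed by the actual update.
\[
z_{t+1/2} = \Pi_{Z}\big(z_{t} - \eta\, V(z_{t})\big),
\qquad
z_{t+1} = \Pi_{Z}\big(z_{t} - \eta\, V(z_{t+1/2})\big)
\]
```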
- Minimax Optimization: The Case of Convex-Submodular [50.03984152441271]
Minimax problems extend beyond the continuous domain to mixed continuous-discrete domains or even fully discrete domains.
We introduce the class of convex-submodular minimax problems, where the objective is convex with respect to the continuous variable and submodular with respect to the discrete variable.
Our proposed algorithms are iterative and combine tools from both discrete and continuous optimization.
arXiv Detail & Related papers (2021-11-01T21:06:35Z)
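A hedged sketch of this problem class (the exact constraint sets are in the paper): the objective is convex in the continuous variable x and submodular, i.e. exhibits diminishing returns, in the discrete set variable S.
```latex
% Convex-submodular minimax problem over a convex set X and a ground set V;
% the second line is the standard diminishing-returns definition of submodularity.
\[
\min_{x \in \mathcal{X}} \; \max_{S \subseteq V} \; f(x, S),
\qquad
f(x, A \cup \{e\}) - f(x, A) \;\ge\; f(x, B \cup \{e\}) - f(x, B)
\;\;\text{for all } A \subseteq B \subseteq V,\; e \in V \setminus B
\]
```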
- Stability and Generalization of Stochastic Gradient Methods for Minimax Problems [71.60601421935844]
Many machine learning problems can be formulated as minimax problems, such as Generative Adversarial Networks (GANs).
We provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems.
arXiv Detail & Related papers (2021-05-08T22:38:00Z)
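The stability argument underlying such analyses, sketched here in its standard minimization form (the minimax versions bound analogous quantities for the primal or primal-dual risk): if the algorithm A is epsilon-uniformly stable, its expected generalization gap is at most epsilon.
```latex
% epsilon-uniform stability: swapping one training sample changes the loss on any
% test point z by at most epsilon; this controls the expected generalization gap.
\[
\sup_{z}\,\big|\ell(A(S); z) - \ell(A(S'); z)\big| \le \varepsilon
\;\;\text{for all neighboring } S, S'
\quad\Longrightarrow\quad
\big|\mathbb{E}\big[R(A(S)) - R_S(A(S))\big]\big| \le \varepsilon
\]
```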
- Train simultaneously, generalize better: Stability of gradient-based minimax learners [12.691047660244331]
We show that the optimization algorithm plays a key role in the generalization performance of the trained minimax model under both convex-concave and nonconvex-nonconcave minimax settings.
We discuss several numerical results indicating the role of optimization algorithms in the generalization of learned minimax models.
arXiv Detail & Related papers (2020-10-23T17:44:43Z)
- Learning Minimax Estimators via Online Learning [55.92459567732491]
We consider the problem of designing minimax estimators for estimating parameters of a probability distribution.
We construct an algorithm for finding a mixed-strategy Nash equilibrium.
arXiv Detail & Related papers (2020-06-19T22:49:42Z)
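The estimation problem referenced above can be viewed, in its standard decision-theoretic form, as a zero-sum game between the statistician and nature; a mixed-strategy equilibrium pairs a randomized estimator with a least-favorable prior. A sketch under the usual notation:
```latex
% Minimax risk: the statistician picks an estimator \hat{\theta}, nature picks
% the worst-case parameter; L is the loss and P_theta the data distribution.
\[
\inf_{\hat{\theta}} \; \sup_{\theta \in \Theta} \;
\mathbb{E}_{X \sim P_{\theta}}\big[ L\big(\theta, \hat{\theta}(X)\big) \big]
\]
```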
- Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances [58.54078318403909]
The min-max problem, also known as the saddle point problem, is a classical adversarial problem that is also studied in the context of zero-sum games.
arXiv Detail & Related papers (2020-06-15T05:33:42Z)