Related papers: Single-loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions

Single-loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions

URL: http://arxiv.org/abs/2405.18577v2
Date: Thu, 30 May 2024 03:46:44 GMT
Title: Single-loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions
Authors: Quanqi Hu, Qi Qi, Zhaosong Lu, Tianbao Yang,
Abstract summary: We show a class of non-smooth non-asymptotic fairness problems in the form of $min_x[yin Yphi(x, y) - max_zin Zpsix(x, z)]$. We propose an envelope approximate gradient SMAG, the first method for solving these problems, provide a state-of-the-art non-asymptotic convergence rate.
Score: 41.43895948769255
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we study a class of non-smooth non-convex problems in the form of $\min_{x}[\max_{y\in Y}\phi(x, y) - \max_{z\in Z}\psi(x, z)]$, where both $\Phi(x) = \max_{y\in Y}\phi(x, y)$ and $\Psi(x)=\max_{z\in Z}\psi(x, z)$ are weakly convex functions, and $\phi(x, y), \psi(x, z)$ are strongly concave functions in terms of $y$ and $z$, respectively. It covers two families of problems that have been studied but are missing single-loop stochastic algorithms, i.e., difference of weakly convex functions and weakly convex strongly-concave min-max problems. We propose a stochastic Moreau envelope approximate gradient method dubbed SMAG, the first single-loop algorithm for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate. The key idea of the design is to compute an approximate gradient of the Moreau envelopes of $\Phi, \Psi$ using only one step of stochastic gradient update of the primal and dual variables. Empirically, we conduct experiments on positive-unlabeled (PU) learning and partial area under ROC curve (pAUC) optimization with an adversarial fairness regularizer to validate the effectiveness of our proposed algorithms.

Related papers

Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient ascent (TTGDA) for solving structured non minimax optimization problems. Our contribution is to design TTGDA algorithms are effective beyond the setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z)
Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem [16.9187409976238]
We study a class of convex bilevel optimization problems, also known as simple bilevel optimization. We introduce novel bilevel optimization methods that approximate the solution set of the lower-level problem.
arXiv Detail & Related papers (2023-08-15T02:37:11Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$. We present a emphaccelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragrad
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
Black-Box Min--Max Continuous Optimization Using CMA-ES with Worst-case Ranking Approximation [22.576922942465142]
A popular approach updates $x$ and $y$ simultaneously or alternatingly. Existing approaches fail if $f$ is not Lipschitz smooth and strongly convex-concave around the optimal solution. We propose minimizing the worst-case objective function $F(x)=max_yf(x,y)$ directly using the covariance matrix adaptation evolution strategy.
arXiv Detail & Related papers (2022-04-06T08:03:39Z)
Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling [84.47780064014262]
We study a linear convex-concave saddle-point problem $min_xmax_y f(x) ytopmathbfA x - g(y)
arXiv Detail & Related papers (2021-12-30T20:31:46Z)
A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $min_x max_y f(x) + langle Ax, yrangle - g*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth. We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
arXiv Detail & Related papers (2021-10-28T14:19:30Z)
Escaping Saddle Points for Nonsmooth Weakly Convex Functions via Perturbed Proximal Algorithms [0.0]
Main results are based on a novel characterization of $epsilon$-approximate local minimum for nonsmooth functions. We show that under standard assumptions, the perturbed proximal point, perturbed proximal gradient and perturbed proximal linear algorithms find $epsilon$-approximate local minimum for nonsmooth weakly convex functions.
arXiv Detail & Related papers (2021-02-04T19:17:13Z)
The Complexity of Constrained Min-Max Optimization [29.57458485068705]
We show that an approximate local point large enough min-max is guaranteed to exist. More importantly, we show an approximate fixed gradient Descent/Ascent approximation complete. Our result is the first to show an exponential approximation of two fundamental optimization problems.
arXiv Detail & Related papers (2020-09-21T05:54:12Z)
A Unified Single-loop Alternating Gradient Projection Algorithm for Nonconvex-Concave and Convex-Nonconcave Minimax Problems [8.797831153231664]
We develop an efficient algorithm for solving minimax problems with theoretical general convexnon objective guarantees. We show that the proposed algorithm can be used to solve both noncaveepsilon concave (strongly) and (strongly) convexnonconcave minimax problems.
arXiv Detail & Related papers (2020-06-03T04:00:52Z)
Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonsmooth functions. In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.