Provably Efficient Convergence of Primal-Dual Actor-Critic with
Nonlinear Function Approximation
- URL: http://arxiv.org/abs/2202.13863v1
- Date: Mon, 28 Feb 2022 15:16:23 GMT
- Title: Provably Efficient Convergence of Primal-Dual Actor-Critic with
Nonlinear Function Approximation
- Authors: Jing Dong, Li Shen, Yinggan Xu, Baoxiang Wang
- Abstract summary: We show the first efficient convergence result for primal-dual actor-critic, with a convergence rate of $\mathcal{O}\left(\sqrt{\frac{\ln(N d G^2)}{N}}\right)$ under Markovian sampling.
Results on OpenAI Gym continuous control tasks.
- Score: 15.319335698574932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the convergence of the actor-critic algorithm with nonlinear
function approximation under a nonconvex-nonconcave primal-dual formulation.
Stochastic gradient descent ascent is applied with an adaptive proximal term
for robust learning rates. We show the first efficient convergence result for
primal-dual actor-critic, with a convergence rate of
$\mathcal{O}\left(\sqrt{\frac{\ln \left(N d G^2 \right)}{N}}\right)$ under
Markovian sampling, where $G$ is the element-wise maximum of the gradient, $N$
is the number of iterations, and $d$ is the dimension of the gradient. Our
result is presented with only the Polyak-\L{}ojasiewicz condition for the dual
variables, which is easy to verify and applicable to a wide range of
reinforcement learning (RL) scenarios. The algorithm and analysis are general
enough to be applied to other RL settings, like multi-agent RL. Empirical
results on OpenAI Gym continuous control tasks corroborate our theoretical
findings.
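The scheme described in the abstract can be sketched as a stochastic gradient descent-ascent loop whose steps are damped by an adaptive, AdaGrad-style proximal weight. The toy saddle objective, step sizes, and noise model below are illustrative assumptions for a minimal sketch, not the paper's exact algorithm.

```python
import numpy as np

def sgda_adaptive_prox(grad_x, grad_y, x0, y0, steps=2000, eta=0.05, rho=1.0):
    """Stochastic gradient descent-ascent with an adaptive proximal term.

    The proximal weight grows with the accumulated gradient magnitude
    (an AdaGrad-style accumulator), damping updates when gradients are large.
    """
    x, y = np.asarray(x0, float), np.asarray(y0, float)
    gx_acc, gy_acc = 1e-8, 1e-8  # squared-gradient accumulators
    rng = np.random.default_rng(0)
    for _ in range(steps):
        # noisy gradient estimates stand in for samples from the MDP
        gx = grad_x(x, y) + 0.01 * rng.standard_normal(x.shape)
        gy = grad_y(x, y) + 0.01 * rng.standard_normal(y.shape)
        gx_acc += float(gx @ gx)
        gy_acc += float(gy @ gy)
        x = x - eta / (rho + np.sqrt(gx_acc)) * gx  # descent on the primal
        y = y + eta / (rho + np.sqrt(gy_acc)) * gy  # ascent on the dual
    return x, y

# toy saddle problem: f(x, y) = 0.5||x||^2 + x.y - 0.5||y||^2, saddle at (0, 0)
gx = lambda x, y: x + y
gy = lambda x, y: x - y
x_star, y_star = sgda_adaptive_prox(gx, gy, np.ones(2), np.ones(2))
```

On this strongly convex-concave toy problem the iterates contract toward the saddle point; the adaptive damping slows the steps as gradient history accumulates, which is what yields robustness to the learning-rate choice.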
Related papers
- Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions [18.47705532817026]
Adaptive gradient methods are arguably the most successful optimization algorithms for training neural networks.
We show that adaptive gradient methods such as AdaGrad can potentially shave a dimension-dependent factor over SGD by exploiting the $\ell_\infty$ geometry of the problem.
arXiv Detail & Related papers (2024-06-07T02:55:57Z)
- On the Convergence of Multi-objective Optimization under Generalized Smoothness [27.87166415148172]
We study a more general and realistic class of $\ell$-smooth loss functions, where $\ell$ is a general non-decreasing function of the gradient norm.
We develop two novel algorithms for $\ell$-smooth multi-objective optimization (MOO): a generalized smooth multi-objective gradient descent method and its stochastic variant.
Our algorithms can also guarantee a tighter $\mathcal{O}(\epsilon^{-2})$ sample complexity using more samples in each iteration.
arXiv Detail & Related papers (2024-05-29T18:36:59Z)
- Analysis of the expected $L_2$ error of an over-parametrized deep neural network estimate learned by gradient descent without regularization [7.977229957867868]
Recent results show that estimates defined by over-parametrized deep neural networks learned by applying gradient descent to a regularized empirical $L_2$ risk are universally consistent.
In this paper, we show that the regularization term is not necessary to obtain similar results.
arXiv Detail & Related papers (2023-11-24T17:04:21Z)
- Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z)
- Convergence of Adam Under Relaxed Assumptions [72.24779199744954]
We show that Adam converges to $\epsilon$-stationary points with $O(\epsilon^{-4})$ gradient complexity under far more realistic conditions.
We also propose a variance-reduced version of Adam with an accelerated gradient complexity of $O(\epsilon^{-3})$.
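For reference, the base iteration analyzed above is the textbook Adam update with bias correction; the step size, toy objective, and iteration count below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def adam(grad, x0, steps=500, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Textbook Adam iteration with bias-corrected moment estimates."""
    x = np.asarray(x0, float)
    m = np.zeros_like(x)  # first-moment (mean) estimate
    v = np.zeros_like(x)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)  # bias correction for the moments
        v_hat = v / (1 - b2 ** t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# smooth nonconvex test function f(x) = sum(x^2 / (1 + x^2));
# its gradient is 2x / (1 + x^2)^2, and 0 is the unique stationary point
x = adam(lambda x: 2 * x / (1 + x * x) ** 2, np.array([3.0, -2.0]))
```

Because the update normalizes by the square root of the second-moment estimate, the effective step is roughly `lr` whenever the gradient sign is consistent, which is why Adam makes progress even where raw gradients are tiny, as at the flat tails of this objective.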
arXiv Detail & Related papers (2023-04-27T06:27:37Z)
- Restricted Strong Convexity of Deep Learning Models with Smooth Activations [31.003601717265006]
We study the problem of optimization of deep learning models with smooth activation functions.
We introduce a new analysis of optimization based on Restricted Strong Convexity (RSC).
Ours is the first result on establishing geometric convergence of GD based on RSC for deep learning models.
arXiv Detail & Related papers (2022-09-29T21:24:26Z)
- A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y\rangle - g^*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
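A minimal Condat-Vu-style iteration for this objective alternates a primal gradient step with a dual proximal step on an extrapolated primal point. The sketch below uses fixed step sizes and a toy quadratic instance as illustrative assumptions; the paper's contribution is precisely to choose the step sizes adaptively from local smoothness, which is not reproduced here.

```python
import numpy as np

def primal_dual(grad_f, prox_gstar, A, x0, y0, tau=0.2, sigma=0.2, steps=500):
    """Fixed-step Condat-Vu-style iteration for
    min_x max_y f(x) + <Ax, y> - g*(y)."""
    x, y = x0.copy(), y0.copy()
    for _ in range(steps):
        x_new = x - tau * (grad_f(x) + A.T @ y)  # primal gradient step
        # dual proximal step at the extrapolated primal point 2*x_new - x
        y = prox_gstar(y + sigma * (A @ (2 * x_new - x)), sigma)
        x = x_new
    return x, y

# toy instance: f(x) = 0.5||x - b||^2 and g*(y) = 0.5||y||^2,
# whose saddle point solves (I + A^T A) x = b and y = A x
A = np.array([[1.0, 0.5], [0.0, 1.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: x - b
prox_gstar = lambda v, s: v / (1.0 + s)  # prox of s * 0.5||.||^2
x, y = primal_dual(grad_f, prox_gstar, A, np.zeros(2), np.zeros(2))
```

With fixed steps, convergence requires a step-size condition of the form $1/\tau - \sigma\|A\|^2 \ge L/2$, which the small values above satisfy for this instance; an adaptive scheme removes the need to know $L$ and $\|A\|$ in advance.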
arXiv Detail & Related papers (2021-10-28T14:19:30Z)
- Randomized Exploration for Reinforcement Learning with General Value Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
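The perturbation idea can be sketched as a regression on noise-injected targets: each draw of the noise yields a different plausible value estimate, and acting greedily with respect to it drives exploration. The ridge regression, feature matrix, and noise scale below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def perturbed_lsq_values(features, targets, noise_std=0.5, reg=1.0, seed=0):
    """RLSVI-flavored randomized value estimate: solve a ridge regression
    on targets perturbed with i.i.d. Gaussian noise."""
    rng = np.random.default_rng(seed)
    perturbed = targets + noise_std * rng.standard_normal(targets.shape)
    X = features
    # ridge solution: (X^T X + reg * I)^{-1} X^T y_perturbed
    w = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ perturbed)
    return X @ w  # randomized value estimates at the observed states

# tiny synthetic dataset; two independent draws give two distinct estimates
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 0.0, 1.0])
v1 = perturbed_lsq_values(X, y, seed=1)
v2 = perturbed_lsq_values(X, y, seed=2)
```

The spread across draws plays the role that posterior sampling plays in Thompson-style exploration: states whose values are uncertain under the data receive occasionally optimistic estimates and therefore get visited.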
arXiv Detail & Related papers (2021-06-15T02:23:07Z)
- Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$, where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z)
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze an optimistic variant of adaptive gradient algorithms for nonconvex-nonconcave min-max problems.
Our experiments show that the benefit of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.