Related papers: Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

URL: http://arxiv.org/abs/2206.00833v1
Date: Thu, 2 Jun 2022 02:13:29 GMT
Title: Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm
Authors: Semih Cayci, Niao He, R. Srikant
Abstract summary: We present a finite-time analysis of Natural actor-critic (NAC) with neural network approximation. We identify the roles of neural networks, regularization and optimization techniques to achieve provably good performance.
Score: 29.978816372127085
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Natural actor-critic (NAC) and its variants, equipped with the representation power of neural networks, have demonstrated impressive empirical success in solving Markov decision problems with large state spaces. In this paper, we present a finite-time analysis of NAC with neural network approximation, and identify the roles of neural networks, regularization and optimization techniques (e.g., gradient clipping and averaging) to achieve provably good performance in terms of sample complexity, iteration complexity and overparametrization bounds for the actor and the critic. In particular, we prove that (i) entropy regularization and averaging ensure stability by providing sufficient exploration to avoid near-deterministic and strictly suboptimal policies and (ii) regularization leads to sharp sample complexity and network width bounds in the regularized MDPs, yielding a favorable bias-variance tradeoff in policy optimization. In the process, we identify the importance of uniform approximation power of the actor neural network to achieve global optimality in policy optimization due to distributional shift.

Related papers

Topological Adaptive Least Mean Squares Algorithms over Simplicial Complexes [13.291627429657416]
This paper introduces a novel adaptive framework for processing dynamic flow signals over simplicial complexes.<n>We present a topological LMS algorithm that efficiently processes streaming signals observed over time-varying edge subsets.
arXiv Detail & Related papers (2025-05-29T06:55:19Z)
Stochastic Optimization with Optimal Importance Sampling [49.484190237840714]
We propose an iterative-based algorithm that jointly updates the decision and the IS distribution without requiring time-scale separation between the two. Our method achieves the lowest possible variable variance and guarantees global convergence under convexity of the objective and mild assumptions on the IS distribution family.
arXiv Detail & Related papers (2025-04-04T16:10:18Z)
Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality [52.906438147288256]
We show that our algorithm can identify the globally optimal reward and policy under certain neural network structures. This is the first IRL algorithm with a non-asymptotic convergence guarantee that provably achieves global optimality.
arXiv Detail & Related papers (2025-03-22T21:16:08Z)
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization [38.32265770020665]
We study a natural actor-critic algorithm that utilizes neural networks to represent the critic. Our aim is to establish sample complexity guarantees for this algorithm, achieving a deeper understanding of its performance characteristics.
arXiv Detail & Related papers (2023-06-18T06:22:04Z)
Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks [2.2713084727838115]
We introduce a novel approach for analyzing the training dynamics of ReLU networks by examining the characteristic activation boundaries of individual neurons. Our proposed analysis reveals a critical instability in common neural network parameterizations and normalizations during convergence optimization, which impedes fast convergence and hurts performance.
arXiv Detail & Related papers (2023-05-25T10:19:13Z)
Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning. Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolleds and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability. We consider gradient descent (GD) and gradient descent (SGD) to train SNNs, for both of which we develop consistent excess bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z)
Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks. We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution. We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z)
Iterative Surrogate Model Optimization (ISMO): An active learning algorithm for PDE constrained optimization with deep neural networks [14.380314061763508]
We present a novel active learning algorithm, termed as iterative surrogate model optimization (ISMO) This algorithm is based on deep neural networks and its key feature is the iterative selection of training data through a feedback loop between deep neural networks and any underlying standard optimization algorithm.
arXiv Detail & Related papers (2020-08-13T07:31:07Z)
The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions [51.60996023961886]
We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints. Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces.
arXiv Detail & Related papers (2020-06-10T15:38:30Z)
Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework. We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy [119.12515258771302]
We show that a variant of PPOO equipped with over-parametrization converges to globally optimal networks. The key to our analysis is the iterate of infinite gradient under a notion of one-dimensional monotonicity, where the gradient and are instant by networks.
arXiv Detail & Related papers (2019-06-25T03:20:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.