On the Global Convergence of Fitted Q-Iteration with Two-layer Neural
Network Parametrization
- URL: http://arxiv.org/abs/2211.07675v1
- Date: Mon, 14 Nov 2022 19:00:24 GMT
- Title: On the Global Convergence of Fitted Q-Iteration with Two-layer Neural
Network Parametrization
- Authors: Mudit Gaur, Vaneet Aggarwal, Mridul Aggarwal
- Abstract summary: We study a Fitted Q-Iteration with two-layer ReLU neural network parametrization, and find the sample complexity guarantees for the algorithm.
We show that this approach achieves a sample complexity of $tildemathcalO (1/epsilon2)$, which is order-optimal.
- Score: 33.12181620473604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Q-learning based algorithms have been applied successfully in many
decision making problems, while their theoretical foundations are not as well
understood. In this paper, we study a Fitted Q-Iteration with two-layer ReLU
neural network parametrization, and find the sample complexity guarantees for
the algorithm. The approach estimates the Q-function in each iteration using a
convex optimization problem. We show that this approach achieves a sample
complexity of $\tilde{\mathcal{O}}(1/\epsilon^{2})$, which is order-optimal.
This result holds for a countable state-space and does not require any
assumptions such as a linear or low rank structure on the MDP.
Related papers
- Depth scaling of unstructured search via quantum approximate optimization [0.0]
Variational quantum algorithms have become the de facto model for current quantum computations.
One such problem is unstructured search which consists on finding a particular bit of string.
We trotterize a CTQW to recover a QAOA sequence, and employ recent advances on the theory of Trotter formulas to bound the query complexity.
arXiv Detail & Related papers (2024-03-22T18:00:03Z) - Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets)
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - On the Global Convergence of Natural Actor-Critic with Two-layer Neural
Network Parametrization [38.32265770020665]
We study a natural actor-critic algorithm that utilizes neural networks to represent the critic.
Our aim is to establish sample complexity guarantees for this algorithm, achieving a deeper understanding of its performance characteristics.
arXiv Detail & Related papers (2023-06-18T06:22:04Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - Global Optimality Beyond Two Layers: Training Deep ReLU Networks via
Convex Programs [39.799125462526234]
We develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization.
We numerically validate our theoretical results via experiments involving both synthetic and real datasets.
arXiv Detail & Related papers (2021-10-11T18:00:30Z) - Q-Match: Iterative Shape Matching via Quantum Annealing [64.74942589569596]
Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP)
This paper proposes Q-Match, a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm.
Q-Match can be applied for shape matching problems iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems.
arXiv Detail & Related papers (2021-05-06T17:59:38Z) - Tightening the Dependence on Horizon in the Sample Complexity of
Q-Learning [59.71676469100807]
This work sharpens the sample complexity of synchronous Q-learning to an order of $frac|mathcalS|| (1-gamma)4varepsilon2$ for any $0varepsilon 1$.
Our finding unveils the effectiveness of vanilla Q-learning, which matches that of speedy Q-learning without requiring extra computation and storage.
arXiv Detail & Related papers (2021-02-12T14:22:05Z) - Vector-output ReLU Neural Network Problems are Copositive Programs:
Convex Analysis of Two Layer Networks and Polynomial-time Algorithms [29.975118690758126]
We describe the semi-output global dual of the two-layer vector-infinite ReLU neural network training problem.
We provide a solution which is guaranteed to be exact for certain classes of problems.
arXiv Detail & Related papers (2020-12-24T17:03:30Z) - Distributed Stochastic Consensus Optimization with Momentum for
Nonconvex Nonsmooth Problems [45.88640334610308]
This paper presents a new distributed optimization algorithm for non-smooth problems.
We show that the proposed algorithm can achieve an overcal communication.
Experiments are presented to illustrate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-10T13:12:21Z) - Finite-Time Analysis for Double Q-learning [50.50058000948908]
We provide the first non-asymptotic, finite-time analysis for double Q-learning.
We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $epsilon$-accurate neighborhood of the global optimum.
arXiv Detail & Related papers (2020-09-29T18:48:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.