Related papers: On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

URL: http://arxiv.org/abs/2211.07675v1
Date: Mon, 14 Nov 2022 19:00:24 GMT
Title: On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization
Authors: Mudit Gaur, Vaneet Aggarwal, Mridul Aggarwal
Abstract summary: We study a Fitted Q-Iteration with two-layer ReLU neural network parametrization, and find the sample complexity guarantees for the algorithm. We show that this approach achieves a sample complexity of $tildemathcalO (1/epsilon2)$, which is order-optimal.
Score: 33.12181620473604
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep Q-learning based algorithms have been applied successfully in many decision making problems, while their theoretical foundations are not as well understood. In this paper, we study a Fitted Q-Iteration with two-layer ReLU neural network parametrization, and find the sample complexity guarantees for the algorithm. The approach estimates the Q-function in each iteration using a convex optimization problem. We show that this approach achieves a sample complexity of $\tilde{\mathcal{O}}(1/\epsilon^{2})$, which is order-optimal. This result holds for a countable state-space and does not require any assumptions such as a linear or low rank structure on the MDP.

Related papers

Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization [64.99236464953032]
We propose a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.<n>By applying our hinge-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution, we achieve a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.
arXiv Detail & Related papers (2025-06-03T06:31:59Z)
Minimisation of Quasar-Convex Functions Using Random Zeroth-Order Oracles [1.712317731332484]
We show the performance of a random Gaussian smoothing zeroth-order (ZO) scheme for minimising a quasarsar Theoretical (QC) functions in both unconstrained and constrained settings.<n>Findings are illustrated through investigating the performance of the algorithm applied to a range of problems in machine learning.
arXiv Detail & Related papers (2025-05-04T22:43:57Z)
Depth scaling of unstructured search via quantum approximate optimization [0.0]
Variational quantum algorithms have become the de facto model for current quantum computations. One such problem is unstructured search which consists on finding a particular bit of string. We trotterize a CTQW to recover a QAOA sequence, and employ recent advances on the theory of Trotter formulas to bound the query complexity.
arXiv Detail & Related papers (2024-03-22T18:00:03Z)
Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets) Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z)
On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization [38.32265770020665]
We study a natural actor-critic algorithm that utilizes neural networks to represent the critic. Our aim is to establish sample complexity guarantees for this algorithm, achieving a deeper understanding of its performance characteristics.
arXiv Detail & Related papers (2023-06-18T06:22:04Z)
An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem. A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network. The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape. We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z)
Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs [39.799125462526234]
We develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. We numerically validate our theoretical results via experiments involving both synthetic and real datasets.
arXiv Detail & Related papers (2021-10-11T18:00:30Z)
Q-Match: Iterative Shape Matching via Quantum Annealing [64.74942589569596]
Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP) This paper proposes Q-Match, a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm. Q-Match can be applied for shape matching problems iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems.
arXiv Detail & Related papers (2021-05-06T17:59:38Z)
Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel Optimization [62.87181271021217]
We consider non-SBO problems that have many applications in machine learning. This paper proposes fast randomized algorithms for non-SBO problems.
arXiv Detail & Related papers (2021-05-05T18:28:42Z)
Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning [59.71676469100807]
This work sharpens the sample complexity of synchronous Q-learning to an order of $frac|mathcalS|| (1-gamma)4varepsilon2$ for any $0varepsilon 1$. Our finding unveils the effectiveness of vanilla Q-learning, which matches that of speedy Q-learning without requiring extra computation and storage.
arXiv Detail & Related papers (2021-02-12T14:22:05Z)
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms [29.975118690758126]
We describe the semi-output global dual of the two-layer vector-infinite ReLU neural network training problem. We provide a solution which is guaranteed to be exact for certain classes of problems.
arXiv Detail & Related papers (2020-12-24T17:03:30Z)
Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth Problems [45.88640334610308]
This paper presents a new distributed optimization algorithm for non-smooth problems. We show that the proposed algorithm can achieve an overcal communication. Experiments are presented to illustrate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-10T13:12:21Z)
Finite-Time Analysis for Double Q-learning [50.50058000948908]
We provide the first non-asymptotic, finite-time analysis for double Q-learning. We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $epsilon$-accurate neighborhood of the global optimum.
arXiv Detail & Related papers (2020-09-29T18:48:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.