Related papers: Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality

URL: http://arxiv.org/abs/2503.17865v1
Date: Sat, 22 Mar 2025 21:16:08 GMT
Title: Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
Authors: Ruijia Zhang, Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong
Abstract summary: We show that our algorithm can identify the globally optimal reward and policy under certain neural network structures. This is the first IRL algorithm with a non-asymptotic convergence guarantee that provably achieves global optimality.
Score: 52.906438147288256
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The goal of the inverse reinforcement learning (IRL) task is to identify the underlying reward function and the corresponding optimal policy from a set of expert demonstrations. While most IRL algorithms' theoretical guarantees rely on a linear reward structure, we aim to extend the theoretical understanding of IRL to scenarios where the reward function is parameterized by neural networks. Meanwhile, conventional IRL algorithms usually adopt a nested structure, leading to computational inefficiency, especially in high-dimensional settings. To address this problem, we propose the first two-timescale single-loop IRL algorithm under neural network parameterized reward and provide a non-asymptotic convergence analysis under overparameterization. Although prior optimality results for linear rewards do not apply, we show that our algorithm can identify the globally optimal reward and policy under certain neural network structures. This is the first IRL algorithm with a non-asymptotic convergence guarantee that provably achieves global optimality in neural network settings.
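Illustration: the abstract contrasts a nested structure (solve an inner problem to completion before each outer update) with a two-timescale single-loop structure (one cheap update of each variable per iteration, with a larger step size for the inner variable). The sketch below shows that pattern on a toy scalar bilevel problem. It is not the paper's algorithm and involves no reward network or policy; the objective, the function name `two_timescale_single_loop`, and the step sizes `fast_lr` and `slow_lr` are all assumptions made for illustration.

```python
def two_timescale_single_loop(steps=2000, fast_lr=0.5, slow_lr=0.01):
    """Toy bilevel problem (hypothetical, not taken from the paper).

    Lower level: y*(x) = argmin_y 0.5 * (y - 2x)^2, so y*(x) = 2x.
    Upper level: min_x 0.5 * (y*(x) - 4)^2, so x* = 2 and y* = 4.
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Fast timescale: a single gradient step on the lower-level
        # objective, instead of solving for y*(x) in an inner loop.
        y -= fast_lr * (y - 2.0 * x)
        # Slow timescale: a single gradient step on the upper-level
        # objective, using the current y in place of y*(x).
        # The factor 2.0 is dy*/dx for this toy problem.
        x -= slow_lr * 2.0 * (y - 4.0)
    return x, y


x, y = two_timescale_single_loop()
print(x, y)
```

Because the inner variable moves much faster than the outer one, y stays close to y*(x) while x drifts slowly toward its optimum, which is the intuition behind replacing the nested loop with a single loop.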

Related papers

Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks. We transform deep net training into a linear empirical risk minimization problem. We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
arXiv Detail & Related papers (2024-09-21T15:30:43Z)
Parallel-in-Time Solutions with Random Projection Neural Networks [0.07282584715927627]
This paper considers one of the fundamental parallel-in-time methods for the solution of ordinary differential equations, Parareal, and extends it by adopting a neural network as a coarse propagator. We provide a theoretical analysis of the convergence properties of the proposed algorithm and show its effectiveness for several examples, including Lorenz and Burgers' equations.
arXiv Detail & Related papers (2024-08-19T07:32:41Z)
Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [3.680127959836384]
Implicit gradient descent (IGD) outperforms the common gradient descent (GD) in handling certain multi-scale problems. We show that IGD converges to a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z)
Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning. Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy. Many algorithms for IRL have an inherently nested structure. We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z)
Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm [29.978816372127085]
We present a finite-time analysis of natural actor-critic (NAC) with neural network approximation. We identify the roles of neural networks, regularization, and optimization techniques in achieving provably good performance.
arXiv Detail & Related papers (2022-06-02T02:13:29Z)
Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its generalization measure. We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis [40.762447301225926]
We propose the particle dual averaging (PDA) method, which generalizes the dual averaging method in convex optimization. An important application of the proposed method is the optimization of two-layer neural networks in the mean field regime. We show that neural networks in the mean field limit can be globally optimized by PDA.
arXiv Detail & Related papers (2020-12-31T07:07:32Z)
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks [23.038631072178735]
We consider a broad class of optimization algorithms that are commonly used in practice. As a consequence, we can leverage the convergence behavior of neural networks. We believe our approach can also be extended to other optimization algorithms and network theory.
arXiv Detail & Related papers (2020-10-25T17:10:22Z)
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy [119.12515258771302]
We show that a variant of PPO equipped with over-parametrized neural networks converges to the globally optimal policy. The key to our analysis is the global convergence of infinite-dimensional mirror descent under a notion of one-point monotonicity, where the gradient and iterate are instantiated by neural networks.
arXiv Detail & Related papers (2019-06-25T03:20:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.