Approximation to Deep Q-Network by Stochastic Delay Differential Equations
- URL: http://arxiv.org/abs/2505.00382v1
- Date: Thu, 01 May 2025 08:19:24 GMT
- Title: Approximation to Deep Q-Network by Stochastic Delay Differential Equations
- Authors: Jianya Lu, Yingjun Mo
- Abstract summary: We construct a stochastic differential delay equation (SDDE) based on the Deep Q-Network algorithm and estimate the Wasserstein-1 distance between them. We prove that the distance between the two converges to zero as the step size approaches zero. Specifically, the delay term in the equation, corresponding to the target network, contributes to the stability of the system.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the significant breakthroughs that the Deep Q-Network (DQN) has brought to reinforcement learning, its theoretical analysis remains limited. In this paper, we construct a stochastic differential delay equation (SDDE) based on the DQN algorithm and estimate the Wasserstein-1 distance between them. We provide an upper bound for the distance and prove that the distance between the two converges to zero as the step size approaches zero. This result allows us to understand DQN's two key techniques, the experience replay and the target network, from the perspective of continuous systems. Specifically, the delay term in the equation, corresponding to the target network, contributes to the stability of the system. Our approach leverages a refined Lindeberg principle and an operator comparison to establish these results.
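The paper's equations are not reproduced in this summary, but the construction it describes can be sketched schematically. Below, $\eta$ is the step size, the lagged argument stands in for the target network, and the drift $b$, diffusion $\sigma$, delay $\tau$, and rate $\alpha$ are illustrative placeholders rather than the paper's exact notation:

```latex
% Schematic only: \eta, \tau, b, \sigma, and \alpha are illustrative
% placeholders, not the paper's exact notation or constants.
\begin{align*}
  \theta_{k+1} &= \theta_k + \eta\, g\bigl(\theta_k,\ \theta_k^{-}\bigr)
    && \text{DQN step; } \theta_k^{-} \text{ is the lagged target network} \\
  d\Theta_t &= b\bigl(\Theta_t,\ \Theta_{t-\tau}\bigr)\,dt
    + \sqrt{\eta}\;\sigma\bigl(\Theta_t,\ \Theta_{t-\tau}\bigr)\,dW_t
    && \text{approximating SDDE; the delay } \tau \text{ plays the target-network role} \\
  \mathcal{W}_1&\bigl(\mathrm{Law}(\theta_{\lfloor t/\eta \rfloor}),\,\mathrm{Law}(\Theta_t)\bigr)
    \le C\,\eta^{\alpha} \longrightarrow 0 \quad (\eta \to 0)
    && \text{the paper's upper bound, schematically}
\end{align*}
```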
Related papers
- Physics-informed reduced order model with conditional neural fields [4.5355909674008865]
This study presents the conditional neural fields for reduced-order modeling (CNF-ROM) framework to approximate solutions of parametrized partial differential equations (PDEs).
The approach combines a parametric neural ODE for modeling latent dynamics over time with a decoder that reconstructs PDE solutions from the corresponding latent states.
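A minimal sketch of this two-part design, with all module names and sizes chosen for illustration (this is not the authors' code):

```python
# Sketch of the CNF-ROM idea: a parametric neural ODE evolves a latent
# state z(t); a coordinate-conditioned decoder maps (z, x) back to the
# PDE solution u(x, t). All names and sizes are illustrative.
import torch
import torch.nn as nn

class LatentODE(nn.Module):
    """dz/dt = f(z, mu): latent dynamics conditioned on PDE parameters mu."""
    def __init__(self, latent_dim=16, param_dim=2, width=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(latent_dim + param_dim, width), nn.Tanh(),
            nn.Linear(width, latent_dim))

    def step(self, z, mu, dt):
        # One explicit Euler step; a real ROM would use an ODE solver.
        return z + dt * self.f(torch.cat([z, mu], dim=-1))

class Decoder(nn.Module):
    """Conditional neural field: u(x, t) ~ g(x; z(t))."""
    def __init__(self, latent_dim=16, width=64):
        super().__init__()
        self.g = nn.Sequential(
            nn.Linear(1 + latent_dim, width), nn.Tanh(),
            nn.Linear(width, 1))

    def forward(self, x, z):
        # x: (N, 1) query coordinates; z: (latent_dim,) current latent state.
        zrep = z.expand(x.shape[0], -1)
        return self.g(torch.cat([x, zrep], dim=-1))
```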
arXiv Detail & Related papers (2024-12-06T18:04:33Z)
- Correctness Verification of Neural Networks Approximating Differential Equations [0.0]
Neural Networks (NNs) approximate the solutions of Partial Differential Equations (PDEs).
NNs can become integral parts of simulation software tools, which can accelerate the simulation of complex dynamic systems by a factor of more than 100.
This work addresses the verification of these functions by defining the NN derivative as a finite difference approximation.
For the first time, we tackle the problem of bounding an NN function without a priori knowledge of the output domain.
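A toy version of the finite-difference idea, assuming a trained network `net` that maps a tensor of inputs to outputs; the step size `h` and the residual check are illustrative, not the paper's verification procedure:

```python
# Approximate the derivative of a trained network by central finite
# differences and compare it to the ODE right-hand side.
import torch

def fd_derivative(net, x, h=1e-4):
    """Central-difference approximation of d(net)/dx at points x."""
    return (net(x + h) - net(x - h)) / (2.0 * h)

def residual_bound(net, x, rhs, h=1e-4):
    """max |u'(x) - f(x)| for the ODE u' = f; small values support correctness."""
    return (fd_derivative(net, x, h) - rhs(x)).abs().max().item()
```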
arXiv Detail & Related papers (2024-02-12T12:55:35Z)
- On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration [86.71396285956044]
This paper provides a theoretical understanding of Deep Q-Network (DQN) with $\varepsilon$-greedy exploration in deep reinforcement learning.
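For reference, the $\varepsilon$-greedy rule the analysis concerns, in its textbook form (the paper's exact exploration schedule may differ):

```python
# Standard epsilon-greedy action selection for a Q-network.
import random
import torch

def epsilon_greedy(q_net, state, epsilon, num_actions):
    """With probability epsilon explore uniformly; otherwise act greedily on Q."""
    if random.random() < epsilon:
        return random.randrange(num_actions)
    with torch.no_grad():
        return int(q_net(state).argmax().item())
```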
arXiv Detail & Related papers (2023-10-24T20:37:02Z)
- Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations [64.78260098263489]
In this work, we assess the ability of physics-informed neural networks (PINNs) to solve increasingly complex coupled ordinary differential equations (ODEs).
We show that PINNs eventually fail to produce correct solutions to these benchmarks as their complexity increases.
We identify several reasons why this may be the case, including insufficient network capacity, poor conditioning of the ODEs, and high local curvature, as measured by the Laplacian of the PINN loss.
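A minimal PINN residual loss for a coupled ODE system $dy/dt = f(y)$, the kind of benchmark the paper studies; the loss weighting and names below are illustrative, not the benchmark suite's own:

```python
# PINN loss: penalize the ODE residual dy/dt - f(y) at collocation
# times t, plus the initial-condition mismatch at t = 0.
import torch

def pinn_loss(net, t, f, y0):
    """net maps t (N, 1) -> y(t) in R^d; f maps y -> dy/dt; y0 is y(0)."""
    t = t.requires_grad_(True)
    y = net(t)                                     # (N, d)
    dydt = torch.stack([
        torch.autograd.grad(y[:, i].sum(), t, create_graph=True)[0].squeeze(-1)
        for i in range(y.shape[1])], dim=1)        # (N, d)
    residual = ((dydt - f(y)) ** 2).mean()
    ic = ((net(torch.zeros(1, 1)) - y0) ** 2).mean()
    return residual + ic
```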
arXiv Detail & Related papers (2022-10-14T15:01:32Z)
- Neural Basis Functions for Accelerating Solutions to High Mach Euler Equations [63.8376359764052]
We propose an approach to solving partial differential equations (PDEs) using a set of neural networks.
We regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis.
These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced order approximation to the PDE.
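The POD step can be sketched with a plain SVD; the regression of neural networks onto the resulting coefficients is omitted, and all names are illustrative:

```python
# Extract a POD basis from snapshot data and project states onto it.
import numpy as np

def pod_basis(snapshots, r):
    """snapshots: (n_dof, n_snapshots) matrix; return the leading r POD modes."""
    U, S, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r]                     # (n_dof, r) orthonormal basis

def reduce(snapshots, basis):
    """Project full-order states onto the POD basis (coefficients to regress)."""
    return basis.T @ snapshots          # (r, n_snapshots)
```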
arXiv Detail & Related papers (2022-08-02T18:27:13Z)
- Unified Field Theory for Deep and Recurrent Neural Networks [56.735884560668985]
We present a unified and systematic derivation of the mean-field theory for both recurrent and deep networks.
We find that convergence towards the mean-field theory is typically slower for recurrent networks than for deep networks.
Our method exposes that Gaussian processes are but the lowest order of a systematic expansion in $1/n$.
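Schematically, and in illustrative notation rather than the paper's, the claim is that the network's kernel expands in the inverse layer width $n$, with the Gaussian-process (NNGP) kernel as the leading term:

```latex
% Illustrative notation, not the paper's: K is the network's kernel,
% n the layer width; the leading order is the Gaussian-process kernel.
\[
  K \;=\; K^{(0)}_{\mathrm{GP}} \;+\; \frac{1}{n}\,K^{(1)} \;+\; \mathcal{O}\!\left(\frac{1}{n^{2}}\right)
\]
```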
arXiv Detail & Related papers (2021-12-10T15:06:11Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
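The setting can be reproduced in a few lines; hyperparameters below are illustrative, not the paper's:

```python
# A two-layer ReLU network fit to 1-D data with SGD, the setting analyzed above.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 20).unsqueeze(1)   # data points
y = torch.sin(3 * x)                          # toy targets

for _ in range(2000):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
# The learned function is piecewise linear; its knots need not coincide
# with the data points, matching the empirical observation above.
```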
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- Finite-Time Analysis for Double Q-learning [50.50058000948908]
We provide the first non-asymptotic, finite-time analysis for double Q-learning.
We show that both synchronous and asynchronous double Q-learning are guaranteed to converge to an $\epsilon$-accurate neighborhood of the global optimum.
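The update rule under analysis is standard tabular double Q-learning (van Hasselt, 2010); the step size and discount below are illustrative:

```python
# One double Q-learning step: randomly update one table, evaluating
# the other table's greedy action to reduce overestimation bias.
import numpy as np

def double_q_step(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """QA, QB: (n_states, n_actions) arrays; (s, a, r, s_next): one transition."""
    if np.random.rand() < 0.5:
        a_star = int(np.argmax(QA[s_next]))
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])
    else:
        b_star = int(np.argmax(QB[s_next]))
        QB[s, a] += alpha * (r + gamma * QA[s_next, b_star] - QB[s, a])
```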
arXiv Detail & Related papers (2020-09-29T18:48:21Z)
- FiniteNet: A Fully Convolutional LSTM Network Architecture for Time-Dependent Partial Differential Equations [0.0]
We use a fully convolutional LSTM network to exploit the dynamics of PDEs.
We show that our network can reduce error by a factor of 2 to 3 compared to the baseline algorithms.
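A generic convolutional LSTM cell of the kind such architectures build on (a sketch, not the FiniteNet architecture itself); sizes are illustrative:

```python
# Minimal 1-D convolutional LSTM cell: gates are computed by convolution
# so the hidden state keeps the spatial structure of the PDE field.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, channels, hidden, k=3):
        super().__init__()
        # One convolution produces all four gate pre-activations.
        self.gates = nn.Conv1d(channels + hidden, 4 * hidden, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c
```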
arXiv Detail & Related papers (2020-02-07T21:18:46Z)