Reinforcement Learning with Function Approximation for Non-Markov Processes
- URL: http://arxiv.org/abs/2601.00151v1
- Date: Thu, 01 Jan 2026 00:56:18 GMT
- Title: Reinforcement Learning with Function Approximation for Non-Markov Processes
- Authors: Ali Devran Kara
- Abstract summary: We study reinforcement learning methods with linear function approximation under non-Markov state and cost processes. We show that the algorithm converges under suitable ergodicity conditions on the underlying non-Markov processes. We derive explicit error bounds for the limits of the resulting learning algorithms.
- Score: 2.0136462287587675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study reinforcement learning methods with linear function approximation under non-Markov state and cost processes. We first consider the policy evaluation method and show that the algorithm converges under suitable ergodicity conditions on the underlying non-Markov processes. Furthermore, we show that the limit corresponds to the fixed point of a joint operator composed of an orthogonal projection and the Bellman operator of an auxiliary \emph{Markov} decision process. For Q-learning with linear function approximation, as in the Markov setting, convergence is not guaranteed in general. We show, however, that for the special case where the basis functions are chosen based on quantization maps, the convergence can be shown under similar ergodicity conditions. Finally, we apply our results to partially observed Markov decision processes, where finite-memory variables are used as state representations, and we derive explicit error bounds for the limits of the resulting learning algorithms.
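As a concrete illustration, here is a minimal sketch (not the paper's exact construction) of the kind of algorithm analyzed: semi-gradient TD(0) policy evaluation with linear function approximation, using one-hot features built from a quantization map, the special choice of basis for which the paper also shows Q-learning converges. The function names, scalar state space, and step sizes are illustrative assumptions.

```python
import numpy as np

def quantization_features(x, bins, low=0.0, high=1.0):
    """One-hot feature vector marking the quantization cell containing x.

    Basis functions built from a quantization map are the special case for
    which the paper shows Q-learning with linear features also converges.
    """
    cell = int(np.clip((x - low) / (high - low) * bins, 0, bins - 1))
    phi = np.zeros(bins)
    phi[cell] = 1.0
    return phi

def td0_linear(trajectory, bins, gamma=0.95, alpha=0.05):
    """Semi-gradient TD(0) policy evaluation with linear function approximation.

    `trajectory` is a list of (state, cost) pairs generated under a fixed
    policy; the state/cost process need not be Markov. Under the paper's
    ergodicity conditions, the weights converge to the fixed point of a
    projected Bellman operator of an auxiliary Markov decision process.
    """
    w = np.zeros(bins)
    for (x, c), (x_next, _) in zip(trajectory[:-1], trajectory[1:]):
        phi = quantization_features(x, bins)
        phi_next = quantization_features(x_next, bins)
        td_error = c + gamma * phi_next @ w - phi @ w
        w += alpha * td_error * phi  # move the value estimate toward the TD target
    return w
```

With one-hot quantization features, the orthogonal projection in the projected Bellman fixed point is onto piecewise-constant functions over the quantization cells, which is what makes the Q-learning special case tractable.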
Related papers
- Q-Measure-Learning for Continuous State RL: Efficient Implementation and Convergence [10.189658648290257]
We study reinforcement learning in infinite-horizon discounted Markov decision processes with continuous state spaces. We propose the novel Q-Measure-Learning algorithm, which learns a signed empirical measure supported on visited state-action pairs.
arXiv Detail & Related papers (2026-03-03T21:15:22Z)
- Uncertainty Quantification with Bayesian Higher Order ReLU KANs [0.0]
We introduce the first method of uncertainty quantification in the domain of Kolmogorov-Arnold Networks, specifically focusing on (Higher Order) ReLUKANs.
We validate our method through a series of closure tests, including simple one-dimensional functions.
We demonstrate the method's ability to correctly identify functional dependencies introduced through the inclusion of a term.
arXiv Detail & Related papers (2024-10-02T15:57:18Z)
- Decomposable Transformer Point Processes [2.1756081703276]
We propose a framework where the advantages of the attention-based architecture are maintained and the limitation of the thinning algorithm is circumvented.
The proposed method attains state-of-the-art performance in predicting the next event of a sequence given its history.
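For context, the thinning algorithm whose limitation is being circumvented is the classical Lewis-Shedler/Ogata rejection sampler for point processes. A minimal sketch (the intensity function and its bound `lam_max` are illustrative assumptions) makes the limitation visible: one must know an upper bound on the intensity, and a loose bound wastes most proposals.

```python
import numpy as np

def sample_by_thinning(intensity, lam_max, t_end, seed=0):
    """Sample event times on [0, t_end] from a point process whose intensity
    is bounded above by lam_max, via thinning (Lewis-Shedler / Ogata).

    Candidates arrive as a homogeneous Poisson process of rate lam_max and
    are kept with probability intensity(t) / lam_max, so a loose bound
    lam_max means most candidates are rejected.
    """
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)  # next candidate arrival time
        if t > t_end:
            return events
        if rng.uniform() < intensity(t) / lam_max:
            events.append(t)  # candidate accepted as a real event
```

For example, `sample_by_thinning(lambda t: 1.0 + np.sin(t) ** 2, lam_max=2.0, t_end=10.0)` draws one realization on [0, 10].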
arXiv Detail & Related papers (2024-09-26T13:22:58Z)
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing an approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
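A rough numpy sketch of these two step types, under the simplifying assumption that the exact gradient `g` and (reduced) Hessian `H` are available and the trust region is a Euclidean ball; the paper's stochastic, constrained setting is richer than this.

```python
import numpy as np

def trust_region_steps(g, H, radius):
    """Illustrative versions of the two step types described above.

    Gradient step: minimize the linear model g @ s over the ball
    ||s|| <= radius, giving a Cauchy-type step along -g. Eigen step: if H
    has negative curvature, step along the eigenvector of its most negative
    eigenvalue, scaled to the trust-region radius.
    """
    grad_step = -radius * g / np.linalg.norm(g)

    eigvals, eigvecs = np.linalg.eigh(H)  # H assumed symmetric
    eigen_step = None
    if eigvals[0] < 0:                    # negative curvature detected
        v = eigvecs[:, 0]
        v = -v if g @ v > 0 else v        # orient as a descent direction
        eigen_step = radius * v
    return grad_step, eigen_step
```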
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Regularized Q-Learning with Linear Function Approximation [2.765106384328772]
We consider a bi-level optimization formulation of regularized Q-learning with linear function approximation. We show that, under certain assumptions, the proposed algorithm converges to a stationary point in the presence of Markovian noise.
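As a generic sketch only: one regularized Q-learning update with linear features, where a simple ridge penalty `reg * w` stands in for the paper's bi-level formulation, and the feature map and step sizes are illustrative.

```python
import numpy as np

def regularized_q_update(w, phi_sa, cost, phi_next_all,
                         gamma=0.95, alpha=0.05, reg=1e-2):
    """One regularized Q-learning step with linear function approximation.

    phi_sa: features of the current state-action pair.
    phi_next_all: one row of features per action at the next state.
    The ridge term `reg * w` is a simple stand-in regularizer; the paper
    instead analyzes a bi-level optimization formulation.
    """
    target = cost + gamma * np.min(phi_next_all @ w)  # greedy (min-cost) target
    td_error = target - phi_sa @ w
    return w + alpha * (td_error * phi_sa - reg * w)
```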
arXiv Detail & Related papers (2024-01-26T20:45:40Z)
- Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast the inner optimization in the Lagrangian as a learning problem for surrogate approximation, bypassing the apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
- Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency [111.83670279016599]
We study reinforcement learning for partially observed Markov decision processes (POMDPs) with infinite observation and state spaces.
We make the first attempt at partial observability and function approximation for a class of POMDPs with a linear structure.
arXiv Detail & Related papers (2022-04-20T21:15:38Z)
- Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
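The baseline scheme being optimized is classical stochastic approximation of a fixed point; below is a minimal sketch with Polyak-Ruppert iterate averaging as the variance-reduction device (the operator `F`, Euclidean setting, and noise model are simplifying assumptions relative to the paper's Banach-space analysis).

```python
import numpy as np

def fixed_point_sa(F, x0, n_steps, noise_scale=0.1, seed=0):
    """Estimate the fixed point x* = F(x*) from noisy evaluations of F.

    Runs the Robbins-Monro recursion x_{k+1} = x_k + a_k (F(x_k) + noise - x_k)
    with a_k ~ 1/k and returns the Polyak-Ruppert average of the iterates,
    a standard variance-reduction device.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    avg = np.zeros_like(x)
    for k in range(1, n_steps + 1):
        noisy = F(x) + noise_scale * rng.standard_normal(x.shape)
        x += (noisy - x) / (k + 1)  # Robbins-Monro step toward the operator output
        avg += (x - avg) / k        # running average of the iterates
    return avg
```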
arXiv Detail & Related papers (2022-01-21T02:46:57Z)
- A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes, with complexity independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z)
- Parallel Stochastic Mirror Descent for MDPs [72.75921150912556]
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs).
A variant of Stochastic Mirror Descent is proposed for convex programming problems with Lipschitz-continuous functionals.
We analyze this algorithm in a general setting and obtain a convergence-rate estimate in which errors do not accumulate over the run of the method.
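For concreteness, here is a single mirror descent step with the entropy mirror map over a probability simplex, the form in which it is typically applied to MDP policies; the gradient oracle and step size are left abstract.

```python
import numpy as np

def mirror_descent_step(p, grad, step_size):
    """One mirror descent step with the entropy mirror map (exponentiated
    gradient) on the probability simplex.

    The multiplicative update keeps p nonnegative and renormalizes it, so no
    explicit projection onto the simplex is needed.
    """
    q = p * np.exp(-step_size * grad)
    return q / q.sum()
```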
arXiv Detail & Related papers (2021-02-27T19:28:39Z)
- Approximating Euclidean by Imprecise Markov Decision Processes [3.0017241250121383]
We investigate what approximation guarantees are obtained when the Euclidean process is approximated by finite-state processes.
We show that for cost functions over finite time horizons the approximations become arbitrarily precise.
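The finite-state approximations in question amount to quantizing the Euclidean state space, the same device the main paper above uses to choose basis functions. A minimal sketch of a uniform grid quantizer (bounds and resolution are illustrative assumptions):

```python
import numpy as np

def grid_quantizer(x, low, high, bins_per_dim):
    """Map a Euclidean state x in [low, high]^d to a finite state index.

    Each coordinate is binned uniformly, and the tuple of bin indices is
    flattened, so the finite MDP has bins_per_dim ** d states.
    """
    x = np.asarray(x, dtype=float)
    cells = np.clip(((x - low) / (high - low) * bins_per_dim).astype(int),
                    0, bins_per_dim - 1)
    return int(np.ravel_multi_index(cells, (bins_per_dim,) * x.size))
```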
arXiv Detail & Related papers (2020-06-26T11:58:04Z)