Constructing Non-Markovian Decision Process via History Aggregator
- URL: http://arxiv.org/abs/2506.24026v1
- Date: Mon, 30 Jun 2025 16:32:31 GMT
- Title: Constructing Non-Markovian Decision Process via History Aggregator
- Authors: Yongyi Wang, Wenxin Li
- Abstract summary: We establish the category of Markov Decision Processes (MDP) and the category of non-Markovian Decision Processes (NMDP). We introduce non-Markovianity into decision-making problem settings via the History Aggregator for State (HAS). Our analysis demonstrates the effectiveness of our method in representing a broad range of non-Markovian dynamics.
- Score: 0.9918339315515408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In algorithmic decision-making, non-Markovian dynamics pose a significant obstacle, especially for paradigms such as Reinforcement Learning (RL), with far-reaching consequences for the development and effectiveness of the associated systems. Existing benchmarks, however, do not comprehensively assess the capacity of decision algorithms to handle non-Markovian dynamics. To address this deficiency, we devise a generalized methodology grounded in category theory. Notably, we establish the category of Markov Decision Processes (MDP) and the category of non-Markovian Decision Processes (NMDP), and prove the equivalence relationship between them. This theoretical foundation provides a novel perspective for understanding and addressing non-Markovian dynamics. We further introduce non-Markovianity into decision-making problem settings via the History Aggregator for State (HAS). With HAS, we can precisely control the state dependency structure of decision-making problems in the time series. Our analysis demonstrates the effectiveness of our method in representing a broad range of non-Markovian dynamics. This approach facilitates a more rigorous and flexible evaluation of decision algorithms by testing them in problem settings where non-Markovian dynamics are explicitly constructed.
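The abstract gives no implementation details, so the following is only a minimal sketch of the idea behind a History Aggregator for State: a wrapper that makes the observed state depend on an aggregate of past states, so the emitted process is no longer Markovian in that state alone. The class names, the `window` and `mix` parameters, the mean-based aggregation, and the toy base environment are hypothetical choices for illustration, not the paper's actual HAS construction.

```python
import numpy as np


class RandomWalkEnv:
    """Toy Markovian base environment (only to make the sketch runnable):
    a 1-D random walk whose next state depends only on the current state."""

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(1)

    def reset(self):
        self.state = np.zeros(1)
        return self.state.copy()

    def step(self, action):
        # action in {-1, 0, +1}; the transition depends only on the current state.
        self.state = self.state + action + self.rng.normal(scale=0.1, size=1)
        reward = -abs(float(self.state[0]))  # reward for staying near the origin
        return self.state.copy(), reward, False, {}


class HistoryAggregatorWrapper:
    """Hypothetical HAS-style wrapper: the emitted observation mixes the current
    state with an aggregate of the last `window` true states, introducing a
    controlled dependency on the history."""

    def __init__(self, base_env, window=4, mix=0.5):
        self.base_env = base_env
        self.window = window  # how far back the aggregator looks
        self.mix = mix        # weight placed on the historical aggregate
        self.history = []

    def reset(self):
        s = self.base_env.reset()
        self.history = [np.asarray(s, dtype=float)]
        return s

    def step(self, action):
        s, r, done, info = self.base_env.step(action)
        s = np.asarray(s, dtype=float)
        # Aggregate the recent history (here: a simple mean over the window).
        agg = np.mean(self.history[-self.window:], axis=0)
        obs = (1.0 - self.mix) * s + self.mix * agg
        self.history.append(s)
        return obs, r, done, info


if __name__ == "__main__":
    env = HistoryAggregatorWrapper(RandomWalkEnv(), window=4, mix=0.5)
    obs = env.reset()
    for t in range(5):
        obs, reward, done, _ = env.step(+1)
        print(t, obs, reward)
```

In this sketch the base random walk stays Markovian in its true state; only the observation handed to the learning algorithm carries the history dependence, which is the kind of explicitly controlled dependency structure the abstract describes.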
Related papers
- A New Approach for Multicriteria Assessment in the Ranking of Alternatives Using Cardinal and Ordinal Data [0.0]
We propose a novel MCA approach that combines two Virtual Gap Analysis (VGA) models. The VGA framework, rooted in linear programming, is pivotal in the MCA methodology.
arXiv Detail & Related papers (2025-07-10T04:00:48Z) - Reinforcement Learning in Switching Non-Stationary Markov Decision Processes: Algorithms and Convergence Analysis [6.399565088857091]
We introduce Switching Non-Stationary Markov Decision Processes (SNS-MDP), where environments switch over time based on an underlying Markov chain. Under a fixed policy, the value function of an SNS-MDP admits a closed-form solution determined by the Markov chain's statistical properties. We show how this framework can effectively guide decision-making in complex, time-varying contexts.
arXiv Detail & Related papers (2025-03-24T12:05:30Z) - Deep Belief Markov Models for POMDP Inference [0.40498500266986387]
This work introduces a novel deep learning-based architecture, termed the Deep Belief Markov Model (DBMM). DBMM provides efficient, model-formulation-agnostic inference in Partially Observable Markov Decision Process (POMDP) problems. We evaluate the efficacy of the proposed methodology by assessing the model-formulation-agnostic inference capability of DBMMs on benchmark problems.
arXiv Detail & Related papers (2025-03-17T17:58:45Z) - On the Foundation of Distributionally Robust Reinforcement Learning [19.621038847810198]
We contribute to the theoretical foundation of distributionally robust reinforcement learning (DRRL).
This framework obliges the decision maker to choose an optimal policy under the worst-case distributional shift orchestrated by an adversary.
Within this DRMDP framework, we investigate conditions for the existence or absence of the dynamic programming principle (DPP).
arXiv Detail & Related papers (2023-11-15T15:02:23Z) - $\lambda$-models: Effective Decision-Aware Reinforcement Learning with Latent Models [11.826471893069805]
We present a study on the necessary components for decision-aware reinforcement learning models.
We highlight that empirical design decisions are vital to achieving good performance for related algorithms.
We show that the use of the MuZero loss function is biased in certain environments and establish that this bias has practical consequences.
arXiv Detail & Related papers (2023-06-30T02:06:45Z) - Inference and dynamic decision-making for deteriorating systems with probabilistic dependencies through Bayesian networks and deep reinforcement learning [0.0]
We propose an efficient algorithmic framework for inference and decision-making under uncertainty for engineering systems exposed to deteriorating environments.
In terms of policy optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC) reinforcement learning approach.
Results demonstrate that DDMAC policies offer substantial benefits when compared to state-of-the-art approaches.
arXiv Detail & Related papers (2022-09-02T14:45:40Z) - On the Complexity of Adversarial Decision Making [101.14158787665252]
We show that the Decision-Estimation Coefficient is necessary and sufficient to obtain low regret for adversarial decision making.
We provide new structural results that connect the Decision-Estimation Coefficient to variants of other well-known complexity measures.
arXiv Detail & Related papers (2022-06-27T06:20:37Z) - Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z) - Markov Abstractions for PAC Reinforcement Learning in Non-Markov Decision Processes [90.53326983143644]
We show that Markov abstractions can be learned during reinforcement learning.
We show that our approach has PAC guarantees when the employed algorithms have PAC guarantees.
arXiv Detail & Related papers (2022-04-29T16:53:00Z) - Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment [79.5678820246642]
We show that certain action-value methods are more sample efficient than policy-gradient methods on transfer problems that require only sparse changes to a sequence of previously optimal decisions.
We generalize the recently proposed societal decision-making framework as a more granular formalism than the Markov decision process.
arXiv Detail & Related papers (2021-06-28T21:29:13Z) - Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, nonconvex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)