Entropy-Regularized Partially Observed Markov Decision Processes
- URL: http://arxiv.org/abs/2112.12255v1
- Date: Wed, 22 Dec 2021 22:44:44 GMT
- Title: Entropy-Regularized Partially Observed Markov Decision Processes
- Authors: Timothy L. Molloy, Girish N. Nair
- Abstract summary: We investigate partially observed Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty.
Standard POMDP techniques are shown to offer bounded-error solutions to entropy-regularized POMDPs.
Our joint-entropy result is particularly surprising since it constitutes a novel, tractable formulation of active state estimation.
- Score: 3.42658286826597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate partially observed Markov decision processes (POMDPs) with
cost functions regularized by entropy terms describing state, observation, and
control uncertainty. Standard POMDP techniques are shown to offer bounded-error
solutions to these entropy-regularized POMDPs, with exact solutions when the
regularization involves the joint entropy of the state, observation, and
control trajectories. Our joint-entropy result is particularly surprising since
it constitutes a novel, tractable formulation of active state estimation.
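As a rough illustration of the idea (a sketch, not the paper's exact formulation), an entropy-regularized stage cost on the belief simplex can add a weighted entropy term to the expected cost. The function names and the weight `beta` are assumptions for illustration:

```python
import numpy as np

def belief_entropy(belief):
    """Shannon entropy (in nats) of a belief distribution over states."""
    p = belief[belief > 0]
    return -np.sum(p * np.log(p))

def regularized_stage_cost(belief, cost_per_state, beta=0.1):
    """Expected stage cost plus a state-uncertainty penalty.

    A positive beta penalizes uncertain beliefs (active estimation);
    a negative beta would instead reward uncertainty (obfuscation).
    """
    expected_cost = float(belief @ cost_per_state)
    return expected_cost + beta * belief_entropy(belief)

# Example: uniform belief over 4 states, unit per-state costs
b = np.array([0.25, 0.25, 0.25, 0.25])
c = np.array([1.0, 1.0, 1.0, 1.0])
print(regularized_stage_cost(b, c, beta=0.1))  # 1 + 0.1*ln(4) ≈ 1.1386
```

Costs of this form are what make the regularized problem nonstandard: the entropy term is concave in the belief, so the usual piecewise-linear value-function machinery only applies approximately, which is where the paper's bounded-error results come in.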
Related papers
- Efficient optimization and conceptual barriers in variational finite Projected Entangled-Pair States [0.0]
Projected entangled pair states (PEPS) on finite two-dimensional lattices are a natural ansatz for representing ground states of local many-body Hamiltonians.
We propose the optimization of PEPS via an improved formulation of the time-dependent variational principle (TDVP).

We demonstrate our approach's capability to naturally handle long-range interactions by exploring the phase diagram of Rydberg atom arrays with long-range interactions.
arXiv Detail & Related papers (2025-03-16T16:06:44Z) - Entropic Matching for Expectation Propagation of Markov Jump Processes [38.60042579423602]
We propose a new tractable inference scheme based on an entropic matching framework.
We demonstrate the effectiveness of our method by providing closed-form results for a simple family of approximate distributions.
We derive expressions for point estimation of the underlying parameters using an approximate expectation procedure.
arXiv Detail & Related papers (2023-09-27T12:07:21Z) - Conditional fluctuation theorems and entropy production for monitored quantum systems under imperfect detection [0.7864304771129751]
We find a universal fluctuation relation that links thermodynamic entropy production and information-theoretical irreversibility along single trajectories in inefficient monitoring setups.
We illustrate our findings with a driven-dissipative two-level system following quantum jump trajectories and discuss the experimental applicability of our results for thermodynamic inference.
arXiv Detail & Related papers (2023-08-16T16:47:21Z) - Learning non-Markovian Decision-Making from State-only Sequences [57.20193609153983]
We develop model-based imitation learning from state-only sequences using a non-Markovian Decision Process (nMDP).
We demonstrate the efficacy of the proposed method in a path planning task with non-Markovian constraints.
arXiv Detail & Related papers (2023-06-27T02:26:01Z) - Analysis of the Relative Entropy Asymmetry in the Regularization of Empirical Risk Minimization [70.540936204654]
The effect of the relative entropy asymmetry is analyzed in the empirical risk minimization with relative entropy regularization (ERM-RER) problem.
A novel regularization is introduced, coined Type-II regularization, that allows for solutions to the ERM-RER problem with a support that extends outside the support of the reference measure.
arXiv Detail & Related papers (2023-06-12T13:56:28Z) - Sequential Stochastic Optimization in Separable Learning Environments [0.0]
We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts.
These problems have a completely observed state process and a partially observed modulation process, where the state process is affected by the modulation process only through an observation process.
We model this broad class of problems as a partially observed Markov decision process (POMDP).
arXiv Detail & Related papers (2021-08-21T21:29:04Z) - Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs [3.42658286826597]
Optimisation of the smoother entropy leads to superior trajectory estimation and obfuscation compared with alternative approaches.
We identify belief-state MDP reformulations of both active estimation and obfuscation with concave cost and cost-to-go functions.
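For context, belief-state MDP reformulations like these build on the standard POMDP Bayes filter, which maps a belief, action, and observation to the next belief. A minimal sketch with illustrative (assumed) transition and observation matrices:

```python
import numpy as np

def belief_update(belief, T, O, action, obs):
    """Standard POMDP Bayes filter: predict with T[action], correct with O.

    T[a][x, x'] = P(x' | x, a) is the transition matrix for action a;
    O[x', z]    = P(z | x')    is the observation likelihood matrix.
    """
    predicted = belief @ T[action]   # prediction step
    unnorm = predicted * O[:, obs]   # correction step (Bayes' rule numerator)
    return unnorm / unnorm.sum()     # normalize to a probability vector

# Two-state toy model (all numbers illustrative)
T = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]])}
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])
b = np.array([0.5, 0.5])
b_next = belief_update(b, T, O, action=0, obs=0)
```

In the belief-state view, entropy-based objectives such as the smoother entropy become (concave) functions of beliefs evolved by exactly this recursion, which is what makes the reformulations amenable to POMDP solvers.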
arXiv Detail & Related papers (2021-08-19T00:05:55Z) - Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability [92.95794652625496]
Generalization is a central challenge for the deployment of reinforcement learning systems.
We show that generalization to unseen test conditions from a limited number of training conditions induces implicit partial observability.
We recast the problem of generalization in RL as solving the induced partially observed Markov decision process.
arXiv Detail & Related papers (2021-07-13T17:59:25Z) - Controller Synthesis for Omega-Regular and Steady-State Specifications [9.901800502055929]
We present an algorithm to find a deterministic policy satisfying $\omega$-regular and steady-state constraints.
We experimentally evaluate our approach.
arXiv Detail & Related papers (2021-06-05T19:34:22Z) - Catalytic Transformations of Pure Entangled States [62.997667081978825]
Entanglement entropy, the von Neumann entropy of a subsystem's reduced state, quantifies the entanglement of pure states.
The relation between entanglement entropy and entanglement distillation has been known only in the asymptotic (many-copy) setting, and the meaning of entanglement entropy in the single-copy regime has so far remained open.
Our results imply that entanglement entropy quantifies the amount of entanglement available in a bipartite pure state for quantum information processing, giving it an operational meaning also in the single-copy setup.
arXiv Detail & Related papers (2021-02-22T16:05:01Z) - Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach [78.05638156687343]
We propose a methodology for analyzing POMCP policies by inspecting their traces.
The proposed method explores local properties of policy behavior to identify unexpected decisions.
We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to mobile robot navigation.
arXiv Detail & Related papers (2020-12-23T15:09:28Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.