The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
- URL: http://arxiv.org/abs/2406.12795v1
- Date: Tue, 18 Jun 2024 17:00:13 GMT
- Title: The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
- Authors: Riccardo Zamboni, Duilio Cirino, Marcello Restelli, Mirco Mutti
- Abstract summary: We study the simple approach of maximizing the entropy over observations in place of true latent states.
We show how knowledge of the observation function can be exploited to compute a principled regularization of the observation entropy to improve performance.
- Score: 40.82741665804367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of pure exploration in Markov decision processes has been cast as maximizing the entropy over the state distribution induced by the agent's policy, an objective that has been extensively studied. However, little attention has been dedicated to state entropy maximization under partial observability, despite the latter being ubiquitous in applications, e.g., finance and robotics, in which the agent only receives noisy observations of the true state governing the system's dynamics. How can we address state entropy maximization in those domains? In this paper, we study the simple approach of maximizing the entropy over observations in place of true latent states. First, we provide lower and upper bounds to the approximation of the true state entropy that only depend on some properties of the observation function. Then, we show how knowledge of the latter can be exploited to compute a principled regularization of the observation entropy to improve performance. With this work, we provide both a flexible approach to bring advances in state entropy maximization to the POMDP setting and a theoretical characterization of its intrinsic limits.
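As a rough illustration of the gap the abstract refers to, the sketch below compares the entropy of a state distribution with the entropy of the observation distribution it induces. It is not the paper's algorithm or its bounds: the tabular setting, the function names `entropy` and `observation_distribution`, and the 3-state numbers are all assumptions made for this example.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(nonzero * np.log(nonzero)))


def observation_distribution(state_dist, obs_matrix):
    """Push a state distribution through obs_matrix[s, o] = P(o | s)."""
    state_dist = np.asarray(state_dist, dtype=float)
    obs_matrix = np.asarray(obs_matrix, dtype=float)
    return state_dist @ obs_matrix


# Hypothetical state distribution induced by some policy (assumed values).
state_dist = np.array([0.7, 0.2, 0.1])

# Hypothetical noisy observation function: each row is P(o | s).
noisy_obs = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

# Noise-free observation function: observations reveal the state exactly.
exact_obs = np.eye(3)

h_state = entropy(state_dist)
h_obs_noisy = entropy(observation_distribution(state_dist, noisy_obs))
h_obs_exact = entropy(observation_distribution(state_dist, exact_obs))

print(f"state entropy:               {h_state:.4f}")
print(f"observation entropy (noisy): {h_obs_noisy:.4f}")
print(f"observation entropy (exact): {h_obs_exact:.4f}")
```

With exact observations the two entropies coincide. With the noisy matrix above, the observation entropy is larger than the state entropy, so an agent maximizing it may overestimate how uniformly it covers the latent states.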
Related papers
- How to Explore with Belief: State Entropy Maximization in POMDPs [40.82741665804367]
We develop a memory and computationally efficient policy gradient method to address a first-order relaxation of the objective defined on belief states.
This paper aims to generalize state entropy maximization to more realistic domains that meet the challenges of applications.
arXiv Detail & Related papers (2024-06-04T13:16:34Z) - Heat and Work in Quantum Thermodynamics: a Cybernetic Approach [0.0]
We present a new proposal for distinguishing heat from work based on a control-theoretic observability decomposition.
We derive a Hermitian operator representing instantaneous dissipation of observable energy, and suggest a generalization of the von Neumann entropy.
arXiv Detail & Related papers (2024-03-04T13:26:48Z) - Observational entropic study of Anderson localization [0.0]
We study the behaviour of the observational entropy in the context of the localization-delocalization transition for the one-dimensional Aubry-André model.
For a given coarse-graining, it increases logarithmically with system size in the delocalized phase, and obeys area law in the localized phase.
We also find that the increase of the observational entropy following a quantum quench is logarithmic in time in the delocalized phase as well as at the transition point, while in the localized phase it oscillates.
arXiv Detail & Related papers (2022-09-21T11:26:43Z) - Observational entropy, coarse quantum states, and Petz recovery:
information-theoretic properties and bounds [1.7205106391379026]
We study the mathematical properties of observational entropy from an information-theoretic viewpoint.
We present new bounds on observational entropy applying in general, as well as bounds and identities related to sequential and post-processed measurements.
arXiv Detail & Related papers (2022-09-08T13:22:15Z) - IRL with Partial Observations using the Principle of Uncertain Maximum
Entropy [8.296684637620553]
We introduce the principle of uncertain maximum entropy and present an expectation-maximization based solution.
We experimentally demonstrate the improved robustness to noisy data offered by our technique in a maximum causal entropy inverse reinforcement learning domain.
arXiv Detail & Related papers (2022-08-15T03:22:46Z) - Computationally Efficient PAC RL in POMDPs with Latent Determinism and
Conditional Embeddings [97.12538243736705]
We study reinforcement learning with function approximation for large-scale Partially Observable Markov Decision Processes (POMDPs).
Our algorithm provably scales to large-scale POMDPs.
arXiv Detail & Related papers (2022-06-24T05:13:35Z) - Maximum entropy quantum state distributions [58.720142291102135]
We go beyond traditional thermodynamics and condition on the full distribution of the conserved quantities.
The result is a set of quantum state distributions whose deviations from thermal states get more pronounced in the limit of wide input distributions.
arXiv Detail & Related papers (2022-03-23T17:42:34Z) - Tight Exponential Analysis for Smoothing the Max-Relative Entropy and
for Quantum Privacy Amplification [56.61325554836984]
The max-relative entropy together with its smoothed version is a basic tool in quantum information theory.
We derive the exact exponent for the decay of the small modification of the quantum state in smoothing the max-relative entropy based on purified distance.
arXiv Detail & Related papers (2021-11-01T16:35:41Z) - Action Redundancy in Reinforcement Learning [54.291331971813364]
We show that transition entropy can be described by two terms; namely, model-dependent transition entropy and action redundancy.
Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
arXiv Detail & Related papers (2021-02-22T19:47:26Z) - Catalytic Transformations of Pure Entangled States [62.997667081978825]
Entanglement entropy is the von Neumann entropy of quantum entanglement of pure states.
The relation between entanglement entropy and entanglement distillation has been known only for the asymptotic setting, and the meaning of entanglement entropy in the single-copy regime has so far remained open.
Our results imply that entanglement entropy quantifies the amount of entanglement available in a bipartite pure state to be used for quantum information processing, giving asymptotic results an operational meaning also in the single-copy setup.
arXiv Detail & Related papers (2021-02-22T16:05:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.