Action Redundancy in Reinforcement Learning
- URL: http://arxiv.org/abs/2102.11329v1
- Date: Mon, 22 Feb 2021 19:47:26 GMT
- Title: Action Redundancy in Reinforcement Learning
- Authors: Nir Baram, Guy Tennenholtz, Shie Mannor
- Abstract summary: We show that transition entropy can be described by two terms: model-dependent transition entropy and action redundancy.
Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
- Score: 54.291331971813364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning
paradigm which seeks to maximize return under entropy regularization. However,
action entropy does not necessarily coincide with state entropy, e.g., when
multiple actions produce the same transition. Instead, we propose to maximize
the transition entropy, i.e., the entropy of next states. We show that
transition entropy can be described by two terms: model-dependent
transition entropy and action redundancy. Particularly, we explore the latter
in both deterministic and stochastic settings and develop tractable
approximation methods in a near model-free setup. We construct algorithms to
minimize action redundancy and demonstrate their effectiveness on a synthetic
environment with multiple redundant actions as well as contemporary benchmarks
in Atari and MuJoCo. Our results suggest that action redundancy is a
fundamental problem in reinforcement learning.
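A worked sketch of the decomposition the abstract refers to, read through the chain rule of entropy (our interpretation; the paper's exact definitions may differ). For a state $s$, action $A \sim \pi(\cdot \mid s)$, and next state $S'$, applying the chain rule to $H(A, S' \mid s)$ in both orders gives
$$H(S' \mid s) \;=\; \underbrace{H(A \mid s)}_{\text{action entropy}} \;+\; \underbrace{\mathbb{E}_{a \sim \pi}\!\left[H(S' \mid s, a)\right]}_{\text{model-dependent transition entropy}} \;-\; \underbrace{H(A \mid s, S')}_{\text{action redundancy}}.$$
In a deterministic environment the middle term vanishes, and the redundancy term $H(A \mid s, S')$ is strictly positive exactly when several actions with positive probability induce the same next state, so maximizing action entropy alone overestimates the induced state entropy by that amount.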
Related papers
- The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough [40.82741665804367]
We study a simple approach of maximizing the entropy over observations in place of the true latent states.
We show how knowledge of the latter can be exploited to compute a principled regularization of the observation entropy that improves performance.
arXiv Detail & Related papers (2024-06-18T17:00:13Z) - Entropy Production from Maximum Entropy Principle: a Unifying Approach [0.0]
Entropy production is the crucial quantity characterizing irreversible phenomena and the second law of thermodynamics.
We use Jaynes' maximum entropy principle to establish a framework that brings together prominent and apparently conflicting definitions of entropy production.
arXiv Detail & Related papers (2024-01-18T12:32:45Z) - A general Markov decision process formalism for action-state entropy-regularized reward maximization [0.0]
Previous work has addressed different forms of action, state, and action-state entropy regularization, as well as pure exploration and state-space occupation.
These problems have become extremely relevant for regularization, generalization, and learning.
arXiv Detail & Related papers (2023-02-02T13:40:12Z) - Quantum Rényi entropy by optimal thermodynamic integration paths [0.0]
We introduce here a theoretical framework based on an optimal thermodynamic integration scheme, where the Rényi entropy can be efficiently evaluated.
We demonstrate it in the one-dimensional quantum Ising model and perform the evaluation of entanglement entropy in the formic acid dimer.
arXiv Detail & Related papers (2021-12-28T15:59:15Z) - Model based Multi-agent Reinforcement Learning with Tensor Decompositions [52.575433758866936]
This paper investigates generalisation in state-action space over unexplored state-action pairs by modelling the transition and reward functions as tensors of low CP-rank.
Experiments on synthetic MDPs show that using tensor decompositions in a model-based reinforcement learning algorithm can lead to much faster convergence if the true transition and reward functions are indeed of low rank.
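As a concrete illustration of the low CP-rank idea, here is a minimal sketch of a rank-r transition tensor (our own toy construction; the factor shapes, normalization, and rank are hypothetical and not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, r = 5, 3, 2  # number of states, actions, and an assumed CP-rank

# CP form: P[s, a, s'] = sum_k U[s, k] * V[a, k] * W[s', k]
U, V, W = rng.random((S, r)), rng.random((A, r)), rng.random((S, r))
P = np.einsum('sk,ak,tk->sat', U, V, W)
# Normalize each (s, a) fiber into a distribution for illustration;
# an exact low-rank kernel would build normalization into the factors.
P /= P.sum(axis=-1, keepdims=True)

# The factors hold r*(2S + A) numbers instead of S*A*S dense entries,
# which is where the gains on genuinely low-rank MDPs come from.
print(P.shape, r * (2 * S + A), S * A * S)
```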
arXiv Detail & Related papers (2021-10-27T15:36:25Z) - Open-system approach to nonequilibrium quantum thermodynamics at arbitrary coupling [77.34726150561087]
We develop a general theory describing the thermodynamic behavior of open quantum systems coupled to thermal baths.
Our approach is based on the exact time-local quantum master equation for the reduced open system states.
arXiv Detail & Related papers (2021-09-24T11:19:22Z) - Aspects of Pseudo Entropy in Field Theories [0.0]
We numerically analyze a class of free scalar field theories and the XY spin model.
This reveals the basic properties of pseudo entropy in many-body systems.
We find that the non-positivity of the difference between the pseudo entropy and the averaged entanglement entropy can be violated only if the initial and final states belong to different quantum phases.
arXiv Detail & Related papers (2021-06-06T13:25:35Z) - Maximum Entropy Reinforcement Learning with Mixture Policies [54.291331971813364]
We construct a tractable approximation of the mixture entropy for use in MaxEnt algorithms.
We show that it is closely related to the sum of marginal entropies.
We derive an algorithmic variant of Soft Actor-Critic (SAC) for the mixture-policy case and evaluate it on a series of continuous control tasks.
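For context, the entropy of a mixture obeys a standard sandwich bound: the weighted sum of component entropies is a lower bound, and adding the entropy of the weights gives an upper bound. Whether this is the precise relation to the "sum of marginal entropies" that the paper exploits is our assumption; the check below is only a generic numerical sketch:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability outcomes."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

w = np.array([0.3, 0.7])                 # mixture weights (hypothetical)
components = np.array([[0.9, 0.1, 0.0],  # component distributions
                       [0.1, 0.2, 0.7]])
mixture = w @ components                 # the mixed distribution

lower = w @ np.array([entropy(c) for c in components])
upper = lower + entropy(w)  # H(mix) <= sum_i w_i H(p_i) + H(w)
assert lower - 1e-12 <= entropy(mixture) <= upper + 1e-12
```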
arXiv Detail & Related papers (2021-03-18T11:23:39Z) - Catalytic Transformations of Pure Entangled States [62.997667081978825]
Entanglement entropy is the von Neumann entropy of quantum entanglement of pure states.
The relation between entanglement entropy and entanglement distillation has been known only for the asymptotic setting, and the meaning of entanglement entropy in the single-copy regime has so far remained open.
Our results imply that entanglement entropy quantifies the amount of entanglement available in a bipartite pure state for quantum information processing, giving it an operational meaning also in the single-copy setup.
arXiv Detail & Related papers (2021-02-22T16:05:01Z) - Entropy production in the quantum walk [62.997667081978825]
We focus on the study of the discrete-time quantum walk on the line, from the entropy production perspective.
We argue that the evolution of the coin can be modeled as an open two-level system that exchanges energy with the lattice at some effective temperature.
arXiv Detail & Related papers (2020-04-09T23:18:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.