k-Means Maximum Entropy Exploration
- URL: http://arxiv.org/abs/2205.15623v4
- Date: Tue, 7 Nov 2023 10:35:45 GMT
- Title: k-Means Maximum Entropy Exploration
- Authors: Alexander Nedergaard, Matthew Cook
- Abstract summary: Exploration in continuous spaces with sparse rewards is an open problem in reinforcement learning.
We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution.
We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces.
- Score: 55.81894038654918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploration in high-dimensional, continuous spaces with sparse rewards is an
open problem in reinforcement learning. Artificial curiosity algorithms address
this by creating rewards that lead to exploration. Given a reinforcement
learning algorithm capable of maximizing rewards, the problem reduces to
finding an optimization objective consistent with exploration. Maximum entropy
exploration uses the entropy of the state visitation distribution as such an
objective. However, efficiently estimating the entropy of the state visitation
distribution is challenging in high-dimensional, continuous spaces. We
introduce an artificial curiosity algorithm based on lower bounding an
approximation to the entropy of the state visitation distribution. The bound
relies on a result we prove for non-parametric density estimation in arbitrary
dimensions using k-means. We show that our approach is both computationally
efficient and competitive on benchmarks for exploration in high-dimensional,
continuous spaces, especially on tasks where reinforcement learning algorithms
are unable to find rewards.
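The abstract states the recipe (use the entropy of the state visitation distribution as an exploration objective, lower bounded via a k-means density estimate) without giving the bound itself. As a hedged sketch of the general idea rather than the paper's actual algorithm, an intrinsic-reward module of this flavour might look as follows; the clustering library, buffer size, cluster count, and the log-distance reward shape are all assumptions made for illustration.
```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

class KMeansCuriosity:
    """Illustrative intrinsic-reward module: rewards states that are far from
    the centroids of a k-means fit to recently visited states, a crude proxy
    for increasing the entropy of the state visitation distribution.
    (Hyperparameters and the exact reward form are assumptions, not the paper's.)"""

    def __init__(self, n_clusters=32, buffer_size=10_000, seed=0):
        self.kmeans = MiniBatchKMeans(n_clusters=n_clusters, random_state=seed)
        self.buffer = []
        self.buffer_size = buffer_size
        self.fitted = False

    def update(self, states):
        """Add a batch of visited states (array of shape [n, d]) and refit."""
        self.buffer.extend(np.asarray(states))
        self.buffer = self.buffer[-self.buffer_size:]
        if len(self.buffer) >= self.kmeans.n_clusters:
            self.kmeans.fit(np.stack(self.buffer))
            self.fitted = True

    def intrinsic_reward(self, state):
        """Reward grows with the distance to the nearest centroid, pushing the
        agent toward under-visited regions of the state space."""
        if not self.fitted:
            return 0.0
        dists = np.linalg.norm(self.kmeans.cluster_centers_ - np.asarray(state), axis=1)
        return float(np.log(1.0 + dists.min()))
```
A typical way to use such a module is to mix it with the environment reward, e.g. r = r_env + beta * curiosity.intrinsic_reward(s) for a small beta, calling update() periodically on a batch of recently visited states.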
Related papers
- Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms [64.10576998630981]
We show the first tight characterization of the optimal Hessian-dependent sample complexity.
A Hessian-independent algorithm universally achieves the optimal sample complexities for all Hessian instances.
The optimal sample complexities achieved by our algorithm remain valid for heavy-tailed noise distributions.
arXiv Detail & Related papers (2023-06-21T17:03:22Z)
- Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees [39.287388288477096]
We consider the task of estimating a structural model of dynamic decisions by a human agent based upon the observable history of implemented actions and visited states.
This problem has an inherent nested structure: in the inner problem, an optimal policy for a given reward function is identified, while in the outer problem a measure of fit is maximized.
We propose a single-loop estimation algorithm with finite time guarantees that is equipped to deal with high-dimensional state spaces.
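To make the nested structure concrete, here is a toy two-loop sketch (the naive form that the paper's single-loop algorithm is designed to avoid): the inner step computes a soft-optimal policy for a candidate reward via value iteration, and the outer step scores that reward by the likelihood of the observed actions. The tiny MDP, the linear reward model, and the grid search are made up for illustration only.
```python
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transition kernel [S, A, S]

def inner_solve(reward, iters=200):
    """Inner problem: soft value iteration for a given per-state reward.
    Returns the log of the induced softmax policy, shape [S, A]."""
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = reward[:, None] + gamma * P @ V        # shape [S, A]
        V = np.log(np.exp(Q).sum(axis=1))          # soft Bellman backup
    return Q - V[:, None]

def outer_objective(theta, demo_states, demo_actions):
    """Outer problem: log-likelihood of demonstrated actions under the policy
    induced by the candidate reward parameter theta (toy linear reward model)."""
    reward = theta * np.arange(n_states)
    log_pi = inner_solve(reward)
    return log_pi[demo_states, demo_actions].sum()

# Made-up demonstrations and a brute-force search over the reward parameter.
demo_states = np.array([0, 1, 2, 3, 4])
demo_actions = np.array([1, 1, 0, 1, 0])
best_theta = max(np.linspace(-1, 1, 21),
                 key=lambda t: outer_objective(t, demo_states, demo_actions))
```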
arXiv Detail & Related papers (2022-10-04T00:11:38Z)
- Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning [64.8463574294237]
We propose Rewarding Episodic Visitation Discrepancy (REVD) as an efficient and quantified exploration method.
REVD provides intrinsic rewards by evaluating the Rényi divergence-based visitation discrepancy between episodes.
It is tested on PyBullet Robotics Environments and Atari games.
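The summary describes the reward only at a high level. A generic k-nearest-neighbor sketch of the underlying idea, comparing where the current episode visits relative to the previous one, is given below; the estimator, the value of k, and the log shaping are illustrative assumptions, not REVD's published formulation.
```python
import numpy as np
from scipy.spatial import cKDTree

def episodic_discrepancy_bonus(curr_states, prev_states, k=3, eps=1e-6):
    """Illustrative per-state bonus based on k-NN density-ratio reasoning:
    a state gets a larger bonus when it is far from the previous episode's
    states relative to the current episode's states, i.e. where the two
    episodes' visitation distributions disagree. Assumes both episodes
    contain at least k+1 states."""
    curr = np.asarray(curr_states)
    prev = np.asarray(prev_states)
    tree_prev = cKDTree(prev)
    tree_curr = cKDTree(curr)
    # Distance to the k-th neighbor among the previous episode's states.
    d_prev, _ = tree_prev.query(curr, k=k)
    # k+1 because each state is its own nearest neighbor in the current episode.
    d_curr, _ = tree_curr.query(curr, k=k + 1)
    ratio = (d_prev[:, -1] + eps) / (d_curr[:, -1] + eps)
    return np.log(1.0 + ratio)  # one bonus per state in curr_states
```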
arXiv Detail & Related papers (2022-09-19T08:42:46Z)
- On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game [140.19656665344917]
We study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function.
We tackle this problem under the context of function approximation, leveraging powerful function approximators.
We establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.
arXiv Detail & Related papers (2021-10-19T07:26:33Z)
- State Entropy Maximization with Random Encoders for Efficient Exploration [162.39202927681484]
Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL).
This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward.
In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner by utilizing a randomly initialized encoder.
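The summary names the ingredients (a fixed random encoder and a state-entropy intrinsic reward) but not the estimator. A minimal sketch in that spirit, with the encoder architecture, latent size, and log-distance reward shape taken as assumptions rather than RE3's exact formulation, could look like this.
```python
import numpy as np
import torch
import torch.nn as nn

class RandomEncoderEntropyBonus:
    """Illustrative state-entropy bonus: embed states with a fixed, randomly
    initialized encoder and reward states whose k-th nearest neighbor in
    latent space is far away (a particle-based entropy proxy)."""

    def __init__(self, state_dim, latent_dim=16, k=3, seed=0):
        torch.manual_seed(seed)
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        for p in self.encoder.parameters():
            p.requires_grad_(False)  # the encoder is never trained
        self.k = k

    def bonus(self, states):
        """states: array of shape [n, state_dim]; returns one bonus per state."""
        with torch.no_grad():
            z = self.encoder(torch.as_tensor(np.asarray(states), dtype=torch.float32))
        dists = torch.cdist(z, z)                                   # pairwise latent distances
        knn = dists.topk(self.k + 1, largest=False).values[:, -1]   # k-th non-self neighbor
        return torch.log(knn + 1.0).numpy()
```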
arXiv Detail & Related papers (2021-02-18T15:45:17Z)
- Leveraging Reinforcement Learning for evaluating Robustness of KNN Search Algorithms [0.0]
The problem of finding the K nearest neighbors of a given query point in a dataset has been studied for many years.
In this paper, we survey several novel K-Nearest Neighbor Search (KNNS) approaches that tackle the search problem from a computational perspective.
In order to evaluate the robustness of a KNNS approach against adversarial points, we propose a generic Reinforcement Learning based framework.
arXiv Detail & Related papers (2021-02-10T16:10:58Z)
- Geometric Entropic Exploration [52.67987687712534]
We introduce a new algorithm that maximises the geometry-aware Shannon entropy of state visits in both discrete and continuous domains.
Our key theoretical contribution is casting geometry-aware maximum state visitation entropy (MSVE) exploration as a tractable problem of optimising a simple and novel noise-contrastive objective function.
In our experiments, we show the efficiency of GEM, the proposed algorithm, in solving several RL problems with sparse rewards, compared against other deep RL exploration approaches.
arXiv Detail & Related papers (2021-01-06T14:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.