Geometric Entropic Exploration
- URL: http://arxiv.org/abs/2101.02055v2
- Date: Thu, 7 Jan 2021 12:57:27 GMT
- Title: Geometric Entropic Exploration
- Authors: Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu
Thakoor, Bilal Piot, Bernardo Avila Pires, Michal Valko, Thomas Mesnard, Tor
Lattimore, Rémi Munos
- Abstract summary: We introduce Geometric Entropy Maximisation (GEM), a new algorithm that maximises the geometry-aware Shannon entropy of state-visits in both discrete and continuous domains.
Our key theoretical contribution is casting geometry-aware MSVE exploration as a tractable problem of optimising a simple and novel noise-contrastive objective function.
In our experiments, we show the efficiency of GEM in solving several RL problems with sparse rewards, compared against other deep RL exploration approaches.
- Score: 52.67987687712534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploration is essential for solving complex Reinforcement Learning (RL)
tasks. Maximum State-Visitation Entropy (MSVE) formulates the exploration
problem as a well-defined policy optimization problem whose solution aims at
visiting all states as uniformly as possible. This is in contrast to standard
uncertainty-based approaches where exploration is transient and eventually
vanishes. However, existing approaches to MSVE are theoretically justified only
for discrete state-spaces as they are oblivious to the geometry of continuous
domains. We address this challenge by introducing Geometric Entropy
Maximisation (GEM), a new algorithm that maximises the geometry-aware Shannon
entropy of state-visits in both discrete and continuous domains. Our key
theoretical contribution is casting geometry-aware MSVE exploration as a
tractable problem of optimising a simple and novel noise-contrastive objective
function. In our experiments, we show the efficiency of GEM in solving several
RL problems with sparse rewards, compared against other deep RL exploration
approaches.
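As a point of reference (this formulation is standard in the MSVE literature and is not quoted from the paper itself), the MSVE objective can be written as the Shannon (differential) entropy of the state-visitation distribution $d^{\pi}$ induced by a policy $\pi$ over the state space $\mathcal{S}$:
$$\max_{\pi}\; H(d^{\pi}) \;=\; -\int_{\mathcal{S}} d^{\pi}(s)\,\log d^{\pi}(s)\,\mathrm{d}s .$$
The maximiser spreads visits as uniformly as the dynamics allow; in continuous domains the integral depends on a reference measure over $\mathcal{S}$, which is where the geometry of the state space enters.

To make "noise-contrastive objective" concrete, the sketch below shows a generic noise-contrastive estimation (NCE) loss that fits an unnormalised density to visited states by contrasting them against samples from a known noise distribution. This is an illustrative sketch of the general technique only, not the GEM objective; the Gaussian model, the uniform noise distribution, and all function names are assumptions made for the example.

```python
import numpy as np

def log_model(x, theta):
    """Unnormalised log-density log p_theta(x): diagonal Gaussian with a learned log-normaliser."""
    mu, log_sigma, log_c = theta
    return -0.5 * np.sum(((x - mu) / np.exp(log_sigma)) ** 2, axis=-1) - log_c

def nce_loss(theta, states, noise, log_noise_density):
    """Binary logistic loss contrasting visited states (label 1) against noise samples (label 0)."""
    def log_sigmoid(z):
        return -np.logaddexp(0.0, -z)  # numerically stable log(sigmoid(z))
    # Classifier logit is the log-density ratio log p_theta(x) - log p_noise(x).
    logit_states = log_model(states, theta) - log_noise_density(states)
    logit_noise = log_model(noise, theta) - log_noise_density(noise)
    # Maximise the likelihood of classifying states as states and noise as noise.
    return -(np.mean(log_sigmoid(logit_states)) + np.mean(log_sigmoid(-logit_noise)))

# Toy usage: 2-D "visited states" near the origin vs. uniform noise on [-3, 3]^2.
rng = np.random.default_rng(0)
states = rng.normal(0.0, 0.5, size=(256, 2))
noise = rng.uniform(-3.0, 3.0, size=(256, 2))
log_noise_density = lambda x: np.full(x.shape[0], -np.log(36.0))  # uniform density 1/36
theta = (np.zeros(2), np.zeros(2), 0.0)
print(nce_loss(theta, states, noise, log_noise_density))
```

Minimising such a loss in $\theta$ yields an estimate of the state-visit density, whose log can in turn be used to push the policy towards under-visited regions; GEM's actual objective differs in its details.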
Related papers
- Combinatorial Optimization with Policy Adaptation using Latent Space Search [44.12073954093942]
We present a novel approach for designing performant algorithms to solve complex, typically NP-hard, problems.
We show that our search strategy outperforms state-of-the-art approaches on 11 standard benchmarking tasks.
arXiv Detail & Related papers (2023-11-13T12:24:54Z)
- Exploring the topological sector optimization on quantum computers [5.458469081464264]
The topological sector optimization (TSO) problem attracts particular interest in the quantum many-body physics community.
We demonstrate that the optimization difficulties of the TSO problem are not restricted to gaplessness, but are also due to its topological nature.
To solve TSO problems, we utilize quantum imaginary time evolution (QITE) with a possible realization on quantum computers.
arXiv Detail & Related papers (2023-10-06T14:51:07Z)
- Energy-Guided Continuous Entropic Barycenter Estimation for General Costs [95.33926437521046]
We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions.
Our approach is built upon the dual reformulation of the EOT problem based on weak OT.
arXiv Detail & Related papers (2023-10-02T11:24:36Z)
- Near-optimal Policy Identification in Active Reinforcement Learning [84.27592560211909]
AE-LSVI is a novel variant of the kernelized least-squares value iteration (LSVI) algorithm that combines optimism with pessimism for active exploration.
We show that AE-LSVI outperforms other algorithms in a variety of environments when robustness to the initial state is required.
arXiv Detail & Related papers (2022-12-19T14:46:57Z)
- k-Means Maximum Entropy Exploration [55.81894038654918]
Exploration in continuous spaces with sparse rewards is an open problem in reinforcement learning.
We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution.
We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces.
arXiv Detail & Related papers (2022-05-31T09:05:58Z)
- Minimax Optimization: The Case of Convex-Submodular [50.03984152441271]
Minimax problems extend beyond the continuous domain to mixed continuous-discrete domains or even fully discrete domains.
We introduce the class of convex-submodular minimax problems, where the objective is convex with respect to the continuous variable and submodular with respect to the discrete variable.
Our proposed algorithms are iterative and combine tools from both discrete and continuous optimization.
arXiv Detail & Related papers (2021-11-01T21:06:35Z)
- Result Diversification by Multi-objective Evolutionary Algorithms with Theoretical Guarantees [94.72461292387146]
We propose to reformulate the result diversification problem as a bi-objective search problem, and solve it by a multi-objective evolutionary algorithm (EA).
We theoretically prove that the GSEMO can achieve the optimal polynomial-time approximation ratio of $1/2$.
When the objective function changes dynamically, the GSEMO can maintain this approximation ratio in polynomial running time, addressing the open question proposed by Borodin et al.
arXiv Detail & Related papers (2021-10-18T14:00:22Z)
- Exploitation and Exploration Analysis of Elitist Evolutionary Algorithms: A Case Study [6.48717002317456]
This paper proposes to evaluate exploitation and exploration via the success probability and the one-step improvement rate computed in different domains of integration.
Case studies are performed by analyzing performances of (1+1) random uni search and (1+1) evolutionary programming on the sphere function and the cheating problem.
arXiv Detail & Related papers (2020-01-29T16:21:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.