Geometric Entropic Exploration
- URL: http://arxiv.org/abs/2101.02055v2
- Date: Thu, 7 Jan 2021 12:57:27 GMT
- Title: Geometric Entropic Exploration
- Authors: Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Alaa Saade, Shantanu
Thakoor, Bilal Piot, Bernardo Avila Pires, Michal Valko, Thomas Mesnard, Tor
Lattimore, Rémi Munos
- Abstract summary: We introduce Geometric Entropy Maximisation (GEM), a new algorithm that maximises the geometry-aware Shannon entropy of state-visits in both discrete and continuous domains.
Our key theoretical contribution is casting geometry-aware MSVE exploration as a tractable problem of optimising a simple and novel noise-contrastive objective function.
In our experiments, we show the efficiency of GEM in solving several RL problems with sparse rewards, compared against other deep RL exploration approaches.
- Score: 52.67987687712534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploration is essential for solving complex Reinforcement Learning (RL)
tasks. Maximum State-Visitation Entropy (MSVE) formulates the exploration
problem as a well-defined policy optimization problem whose solution aims at
visiting all states as uniformly as possible. This is in contrast to standard
uncertainty-based approaches where exploration is transient and eventually
vanishes. However, existing approaches to MSVE are theoretically justified only
for discrete state-spaces as they are oblivious to the geometry of continuous
domains. We address this challenge by introducing Geometric Entropy
Maximisation (GEM), a new algorithm that maximises the geometry-aware Shannon
entropy of state-visits in both discrete and continuous domains. Our key
theoretical contribution is casting geometry-aware MSVE exploration as a
tractable problem of optimising a simple and novel noise-contrastive objective
function. In our experiments, we show the efficiency of GEM in solving several
RL problems with sparse rewards, compared against other deep RL exploration
approaches.
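As a point of reference (this formulation is standard in the MSVE literature and is not quoted from the paper itself), the MSVE objective can be written as the Shannon (differential) entropy of the state-visitation distribution $d^{\pi}$ induced by a policy $\pi$ over the state space $\mathcal{S}$:
$$\max_{\pi}\; H(d^{\pi}) \;=\; -\int_{\mathcal{S}} d^{\pi}(s)\,\log d^{\pi}(s)\,\mathrm{d}s .$$
The maximiser spreads visits as uniformly as the dynamics allow; in continuous domains the integral depends on a reference measure over $\mathcal{S}$, which is where the geometry of the state space enters.

To make "noise-contrastive objective" concrete, the sketch below shows a generic noise-contrastive estimation (NCE) loss that fits an unnormalised density to visited states by contrasting them against samples from a known noise distribution. This is an illustrative sketch of the general technique only, not the GEM objective; the Gaussian model, the uniform noise distribution, and all function names are assumptions made for the example.

```python
import numpy as np

def log_model(x, theta):
    """Unnormalised log-density log p_theta(x): diagonal Gaussian with a learned log-normaliser."""
    mu, log_sigma, log_c = theta
    return -0.5 * np.sum(((x - mu) / np.exp(log_sigma)) ** 2, axis=-1) - log_c

def nce_loss(theta, states, noise, log_noise_density):
    """Binary logistic loss contrasting visited states (label 1) against noise samples (label 0)."""
    def log_sigmoid(z):
        return -np.logaddexp(0.0, -z)  # numerically stable log(sigmoid(z))
    # Classifier logit is the log-density ratio log p_theta(x) - log p_noise(x).
    logit_states = log_model(states, theta) - log_noise_density(states)
    logit_noise = log_model(noise, theta) - log_noise_density(noise)
    # Maximise the likelihood of classifying states as states and noise as noise.
    return -(np.mean(log_sigmoid(logit_states)) + np.mean(log_sigmoid(-logit_noise)))

# Toy usage: 2-D "visited states" near the origin vs. uniform noise on [-3, 3]^2.
rng = np.random.default_rng(0)
states = rng.normal(0.0, 0.5, size=(256, 2))
noise = rng.uniform(-3.0, 3.0, size=(256, 2))
log_noise_density = lambda x: np.full(x.shape[0], -np.log(36.0))  # uniform density 1/36
theta = (np.zeros(2), np.zeros(2), 0.0)
print(nce_loss(theta, states, noise, log_noise_density))
```

Minimising such a loss in $\theta$ yields an estimate of the state-visit density, whose log can in turn be used to push the policy towards under-visited regions; GEM's actual objective differs in its details.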
Related papers
- Combinatorial Optimization with Policy Adaptation using Latent Space Search [44.12073954093942]
We present a novel approach for designing performant algorithms to solve complex, typically NP-hard, problems.
We show that our search strategy outperforms state-of-the-art approaches on 11 standard benchmarking tasks.
arXiv Detail & Related papers (2023-11-13T12:24:54Z)
- Exploring the topological sector optimization on quantum computers [5.458469081464264]
The topological sector optimization (TSO) problem attracts particular interest in the quantum many-body physics community.
We demonstrate that the optimization difficulties of the TSO problem are not restricted to gaplessness, but are also due to its topological nature.
To solve TSO problems, we utilize quantum imaginary time evolution (QITE) with a possible realization on quantum computers.
arXiv Detail & Related papers (2023-10-06T14:51:07Z)
- Energy-Guided Continuous Entropic Barycenter Estimation for General Costs [95.33926437521046]
We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions.
Our approach is built upon the dual reformulation of the EOT problem based on weak OT.
arXiv Detail & Related papers (2023-10-02T11:24:36Z)
- Near-optimal Policy Identification in Active Reinforcement Learning [84.27592560211909]
AE-LSVI is a novel variant of the kernelized least-squares value iteration (LSVI) algorithm that combines optimism with pessimism for active exploration.
We show that AE-LSVI outperforms other algorithms in a variety of environments when robustness to the initial state is required.
arXiv Detail & Related papers (2022-12-19T14:46:57Z)
- k-Means Maximum Entropy Exploration [55.81894038654918]
Exploration in continuous spaces with sparse rewards is an open problem in reinforcement learning.
We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution.
We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces.
arXiv Detail & Related papers (2022-05-31T09:05:58Z)
- Minimax Optimization: The Case of Convex-Submodular [50.03984152441271]
Minimax problems extend beyond the continuous domain to mixed continuous-discrete domains or even fully discrete domains.
We introduce the class of convex-submodular minimax problems, where the objective is convex with respect to the continuous variable and submodular with respect to the discrete variable.
Our proposed algorithms are iterative and combine tools from both discrete and continuous optimization.
arXiv Detail & Related papers (2021-11-01T21:06:35Z)
- Result Diversification by Multi-objective Evolutionary Algorithms with Theoretical Guarantees [94.72461292387146]
We propose to reformulate the result diversification problem as a bi-objective search problem, and solve it by a multi-objective evolutionary algorithm (EA).
We theoretically prove that the GSEMO can achieve the optimal polynomial-time approximation ratio of $1/2$.
When the objective function changes dynamically, the GSEMO can maintain this approximation ratio in polynomial running time, addressing the open question proposed by Borodin et al.
arXiv Detail & Related papers (2021-10-18T14:00:22Z)
- Exploitation and Exploration Analysis of Elitist Evolutionary Algorithms: A Case Study [6.48717002317456]
This paper proposes to evaluate exploitation and exploration via the success probability and the one-step improvement rate computed in different domains of integration.
Case studies are performed by analyzing performances of (1+1) random uni search and (1+1) evolutionary programming on the sphere function and the cheating problem.
arXiv Detail & Related papers (2020-01-29T16:21:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.