Sparse Reward Exploration via Novelty Search and Emitters
- URL: http://arxiv.org/abs/2102.03140v1
- Date: Fri, 5 Feb 2021 12:34:54 GMT
- Title: Sparse Reward Exploration via Novelty Search and Emitters
- Authors: Giuseppe Paolo (1 and 2), Alexandre Coninx (1), Stephane Doncieux (1),
Alban Laflaquière (2) ((1) ISIR, (2) SBRE)
- Abstract summary: We introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm.
SERENE separates the search space exploration and reward exploitation into two alternating processes.
A meta-scheduler allocates a global computational budget by alternating between the two processes.
- Score: 55.41644538483948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reward-based optimization algorithms require both exploration, to find
rewards, and exploitation, to maximize performance. The need for efficient
exploration is even more significant in sparse reward settings, in which
performance feedback is given sparingly, thus rendering it unsuitable for
guiding the search process. In this work, we introduce the SparsE Reward
Exploration via Novelty and Emitters (SERENE) algorithm, capable of efficiently
exploring a search space, as well as optimizing rewards found in potentially
disparate areas. Contrary to existing emitters-based approaches, SERENE
separates the search space exploration and reward exploitation into two
alternating processes. The first process performs exploration through Novelty
Search, a divergent search algorithm. The second one exploits discovered reward
areas through emitters, i.e. local instances of population-based optimization
algorithms. A meta-scheduler allocates a global computational budget by
alternating between the two processes, ensuring the discovery and efficient
exploitation of disjoint reward areas. SERENE returns both a collection of
diverse solutions covering the search space and a collection of high-performing
solutions for each distinct reward area. We evaluate SERENE on various sparse
reward environments and show it compares favorably to existing baselines.
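
The alternation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy 2D search space, the `reward` function, the greedy hill-climbing emitter, the fixed chunk size, and every function name are assumptions made for illustration. Only the novelty score (mean distance to the k nearest neighbours) follows the standard Novelty Search definition.

```python
import math
import random


def reward(x):
    """Hypothetical sparse reward: nonzero only in a small disk around (0.8, 0.8)."""
    d = math.dist(x, (0.8, 0.8))
    return 1.0 - d / 0.1 if d < 0.1 else 0.0


def novelty(b, others, k=3):
    """Mean distance from descriptor b to its k nearest neighbours in `others`."""
    nearest = sorted(math.dist(b, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else float("inf")


def mutate(x, sigma, rng):
    """Gaussian perturbation, clipped to the unit square."""
    return tuple(min(1.0, max(0.0, xi + rng.gauss(0.0, sigma))) for xi in x)


def explore(pop, archive, chunk, rng):
    """One exploration chunk: select by novelty only, never by reward.

    Assumption: the behaviour descriptor of a solution is the solution itself.
    """
    offspring = [mutate(rng.choice(pop), 0.1, rng) for _ in range(chunk)]
    found = [x for x in offspring if reward(x) > 0.0]  # reward areas that were hit
    candidates = pop + offspring

    def score(x):
        return novelty(x, [c for c in candidates if c is not x] + archive)

    ranked = sorted(candidates, key=score, reverse=True)
    archive.append(ranked[0])  # store the most novel individual
    return ranked[:len(pop)], found


def exploit(emitter, chunk, rng):
    """One exploitation chunk: greedy local search standing in for an emitter."""
    best, best_r = emitter
    for _ in range(chunk):
        child = mutate(best, 0.02, rng)
        r = reward(child)
        if r > best_r:
            best, best_r = child, r
    return best, best_r


def serene_like(budget=2000, chunk=20, seed=0):
    """Alternate exploration and exploitation until the evaluation budget is spent."""
    rng = random.Random(seed)
    pop = [(rng.random() * 0.2, rng.random() * 0.2) for _ in range(10)]
    archive, emitters, used = [], [], 0
    while used < budget:
        pop, found = explore(pop, archive, chunk, rng)
        used += chunk
        emitters.extend((x, reward(x)) for x in found)
        if emitters and used < budget:
            # Assumed scheduling rule: spend the next chunk on the best emitter.
            i = max(range(len(emitters)), key=lambda j: emitters[j][1])
            emitters[i] = exploit(emitters[i], chunk, rng)
            used += chunk
    return archive, emitters
```

Calling `serene_like(budget=400)` returns the novelty archive and a list of `(solution, reward)` pairs, one per emitter. These loosely mirror the two collections the abstract mentions: diverse solutions covering the search space, and high-performing solutions for each reward area.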