Cell-Free Latent Go-Explore
- URL: http://arxiv.org/abs/2208.14928v3
- Date: Thu, 27 Apr 2023 09:40:25 GMT
- Title: Cell-Free Latent Go-Explore
- Authors: Quentin Gallouédec and Emmanuel Dellandréa
- Abstract summary: Latent Go-Explore (LGE) is a simple and general approach based on the Go-Explore paradigm for exploration in reinforcement learning (RL).
We show that LGE can be flexibly combined with any strategy for learning a latent representation.
Our results indicate that LGE, although simpler than Go-Explore, is more robust and outperforms state-of-the-art algorithms in terms of pure exploration.
- Score: 3.1868913341776106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce Latent Go-Explore (LGE), a simple and general
approach based on the Go-Explore paradigm for exploration in reinforcement
learning (RL). Go-Explore was initially introduced with a strong domain
knowledge constraint for partitioning the state space into cells. However, in
most real-world scenarios, drawing domain knowledge from raw observations is
complex and tedious. If the cell partitioning is not informative enough,
Go-Explore can completely fail to explore the environment. We argue that the
Go-Explore approach can be generalized to any environment without domain
knowledge and without cells by exploiting a learned latent representation.
Thus, we show that LGE can be flexibly combined with any strategy for learning
a latent representation. Our results indicate that LGE, although simpler than
Go-Explore, is more robust and outperforms state-of-the-art algorithms in terms
of pure exploration on multiple hard-exploration environments including
Montezuma's Revenge. The LGE implementation is available as open-source at
https://github.com/qgallouedec/lge.
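The loop described in the abstract is easy to state concretely. Below is a minimal sketch, assuming a Gymnasium-style `env`, a user-supplied latent `encode` function, and a goal-conditioned `policy`; the nearest-neighbor density proxy and step budgets are illustrative choices, not the authors' implementation (see the repository above for that).

```python
import numpy as np

def latent_density(z, buffer_z, k=10):
    """Particle-based density proxy: inverse distance to the k-th nearest neighbor."""
    dists = np.linalg.norm(buffer_z - z, axis=1)
    kth = np.sort(dists)[min(k, len(dists) - 1)]
    return 1.0 / (kth + 1e-8)

def lge_iteration(env, encode, policy, buffer, go_steps=200, explore_steps=50):
    # 1. Embed every visited state with the learned latent representation.
    buffer_z = np.stack([encode(s) for s in buffer])
    # 2. Sample a goal from a low-density latent region (the exploration
    #    frontier), replacing Go-Explore's hand-crafted cell partitioning.
    densities = np.array([latent_density(z, buffer_z) for z in buffer_z])
    goal = buffer[int(np.argmin(densities))]
    # 3. "Go": return toward the goal with a goal-conditioned policy.
    obs, _ = env.reset()
    for _ in range(go_steps):
        obs, _, terminated, truncated, _ = env.step(policy(obs, goal))
        buffer.append(obs)
        if terminated or truncated:
            return
    # 4. "Explore": act randomly from the reached state.
    for _ in range(explore_steps):
        obs, _, terminated, truncated, _ = env.step(env.action_space.sample())
        buffer.append(obs)
        if terminated or truncated:
            return
```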
Related papers
- AI-native Memory: A Pathway from LLMs Towards AGI [25.19572633670963]
Large language models (LLMs) have shown the world sparks of artificial general intelligence (AGI).
We envision a pathway from LLMs to AGI through the integration of memory.
As an intermediate stage, the memory will likely be in the form of natural language descriptions.
arXiv Detail & Related papers (2024-06-26T12:51:37Z)
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models [5.404186221463082]
Go-Explore is a powerful family of algorithms designed to solve hard-exploration problems.
We propose Intelligent Go-Explore (IGE) which greatly extends the scope of the original Go-Explore.
IGE has a human-like ability to instinctively identify how interesting or promising any new state is.
arXiv Detail & Related papers (2024-05-24T01:45:27Z)
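A hedged sketch of the idea in the entry above: let a foundation model, rather than a hand-crafted heuristic, judge which archived state is most promising to explore from. The prompt format and the `llm` callable are assumptions, not the paper's harness.

```python
def pick_state_to_explore(archive, llm):
    """Ask a foundation model which archived state looks most promising."""
    listing = "\n".join(f"{i}: {desc}" for i, desc in enumerate(archive))
    prompt = (
        "You are guiding exploration in a sequential decision-making task.\n"
        "Here are textual descriptions of states reached so far:\n"
        f"{listing}\n"
        "Reply with only the index of the most promising state to explore from."
    )
    reply = llm(prompt)
    try:
        return int(reply.strip().split()[0]) % len(archive)
    except (ValueError, IndexError):
        return 0  # unparsable answer: fall back to the first archived state
```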
- Can large language models explore in-context? [87.49311128190143]
We deploy Large Language Models as agents in simple multi-armed bandit environments.
We find that the models do not robustly engage in exploration without substantial interventions.
arXiv Detail & Related papers (2024-03-22T17:50:43Z)
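The setup in the entry above can be illustrated with a toy harness: an LLM chooses arms given its interaction history in the prompt. Everything here (the `llm` callable, Bernoulli arms, prompt wording) is an assumption for illustration.

```python
import random

def llm_bandit_run(llm, n_arms=5, horizon=100):
    means = [random.random() for _ in range(n_arms)]  # hidden Bernoulli means
    history = []
    for _ in range(horizon):
        prompt = (
            f"You are playing a {n_arms}-armed bandit. "
            f"Past (arm, reward) pairs: {history}. "
            f"Reply with only one arm index between 0 and {n_arms - 1}."
        )
        try:
            arm = int(llm(prompt).strip()) % n_arms
        except ValueError:
            arm = random.randrange(n_arms)  # unparsable reply: pick at random
        reward = int(random.random() < means[arm])
        history.append((arm, reward))
    return sum(r for _, r in history)  # total reward over the horizon
```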
- METRA: Scalable Unsupervised RL with Metric-Aware Abstraction [69.90741082762646]
Metric-Aware Abstraction (METRA) is a novel unsupervised reinforcement learning objective.
By learning to move in every direction in the latent space, METRA obtains a tractable set of diverse behaviors.
We show that METRA can discover a variety of useful behaviors even in complex, pixel-based environments.
arXiv Detail & Related papers (2023-10-13T06:43:11Z)
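A rough sketch of the objective suggested by the summary above: train an encoder `phi` so that each skill `z` moves the agent in its own latent direction, while keeping per-transition latent steps bounded (the metric-aware part). The penalty form and coefficient are assumptions, not the paper's exact recipe.

```python
import torch

def metra_repr_loss(phi, s, s_next, z, lam=30.0):
    """One representation-learning step in the spirit of METRA."""
    delta = phi(s_next) - phi(s)                    # latent displacement
    alignment = (delta * z).sum(dim=-1).mean()      # move along skill direction z
    # Soft penalty keeping each latent step bounded, so distances stay metric-aware.
    violation = (delta.norm(dim=-1).pow(2) - 1.0).clamp(min=0).mean()
    return -alignment + lam * violation             # minimize this loss
```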
- Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs [59.74814230246034]
Large Language Models (LLMs) have been shown to possess extensive common knowledge and powerful semantic comprehension abilities.
We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
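The two pipelines named above can be sketched in a few lines; the `embed`, `gnn`, and `llm` callables are placeholders for whatever models are actually used, not the paper's interfaces.

```python
def llms_as_enhancers(node_texts, embed, gnn):
    """Pipeline 1: the LLM enriches node features; a GNN still predicts."""
    features = [embed(text) for text in node_texts]  # LLM text -> feature vectors
    return gnn(features)

def llms_as_predictors(node_text, neighbor_texts, llm):
    """Pipeline 2: the LLM predicts directly from serialized graph context."""
    prompt = (
        f"Node text: {node_text}\n"
        f"Neighbor texts: {neighbor_texts}\n"
        "Which category does this node belong to? Answer with the category name."
    )
    return llm(prompt)
```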
- Efficient GNN Explanation via Learning Removal-based Attribution [56.18049062940675]
We propose a GNN explanation framework named LeArn Removal-based Attribution (LARA).
The explainer in LARA learns to generate removal-based attributions, enabling explanations with high fidelity.
In particular, LARA is 3.5 times faster and achieves higher fidelity than the state-of-the-art method on the large dataset ogbn-arxiv.
arXiv Detail & Related papers (2023-06-09T08:54:20Z)
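For intuition, removal-based attribution scores an edge by how much the prediction changes when that edge is removed. The brute-force sketch below only illustrates the target quantity that LARA learns to produce in a single forward pass; the `gnn(x, edges)` interface is an assumption.

```python
import torch

def removal_attribution(gnn, x, edges):
    """Brute-force removal-based edge attribution for a GNN prediction."""
    base = gnn(x, edges)
    scores = []
    for i in range(len(edges)):
        reduced = edges[:i] + edges[i + 1:]            # drop one edge
        scores.append(torch.norm(base - gnn(x, reduced)).item())
    return scores                                      # higher = more important
```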
- Time-Myopic Go-Explore: Learning A State Representation for the Go-Explore Paradigm [0.5156484100374059]
We introduce a novel time-myopic state representation that clusters temporally close states together.
We demonstrate the first learned state representation that reliably estimates novelty, instead of relying on a hand-crafted representation.
Our approach is evaluated on the hard-exploration environments Montezuma's Revenge, Gravitar, and Frostbite (Atari).
arXiv Detail & Related papers (2023-01-13T16:13:44Z)
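A hedged sketch of what a time-myopic representation could look like: latent distance is trained to predict time separation, saturating beyond a short horizon, so novelty can be read off as distance to previously seen embeddings. The loss form and horizon are assumptions, not the paper's exact method.

```python
import torch
import torch.nn.functional as F

def time_myopic_loss(encoder, s_i, s_j, dt, horizon=10.0):
    """Latent distance should predict time separation, saturating at `horizon`."""
    target = torch.clamp(dt.float() / horizon, max=1.0)  # myopic: cap far-apart pairs
    dist = (encoder(s_i) - encoder(s_j)).norm(dim=-1)
    return F.mse_loss(dist, target)

def novelty(encoder, s, seen_z):
    """A state is novel if it is far, in time-myopic latent space, from all seen states."""
    return (seen_z - encoder(s)).norm(dim=-1).min()
```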
- First Go, then Post-Explore: the Benefits of Post-Exploration in Intrinsic Motivation [7.021281655855703]
Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards.
A key insight of Go-Explore was that successful exploration requires an agent to first return to an interesting state.
We refer to such exploration after a goal is reached as 'post-exploration'.
arXiv Detail & Related papers (2022-12-06T18:56:47Z)
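The central toggle studied above can be sketched as one flag: the same goal-reaching episode, with or without extra exploration after the goal is reached. The Gymnasium-style `env`, the `reached` predicate, and the step budget are illustrative assumptions.

```python
def rollout(env, policy, goal, reached, post_explore=False, post_steps=50):
    """Same goal-reaching episode, with or without post-exploration."""
    obs, _ = env.reset()
    visited = [obs]
    terminated = truncated = False
    while not (terminated or truncated or reached(obs, goal)):
        obs, _, terminated, truncated, _ = env.step(policy(obs, goal))
        visited.append(obs)
    if post_explore and not (terminated or truncated):
        for _ in range(post_steps):          # keep exploring past the goal
            obs, _, terminated, truncated, _ = env.step(env.action_space.sample())
            visited.append(obs)
            if terminated or truncated:
                break
    return visited                           # compare coverage with/without the flag
```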
- BYOL-Explore: Exploration by Bootstrapped Prediction [49.221173336814225]
BYOL-Explore is a conceptually simple yet general approach for curiosity-driven exploration in visually-complex environments.
We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark.
arXiv Detail & Related papers (2022-06-16T17:36:15Z)
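A sketch of a curiosity signal in the spirit of the summary above: an online predictor is trained to match a slowly-updated target network's embedding of the next observation, and its prediction error serves as intrinsic reward. Architectures and the EMA rate are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(target, online, tau=0.99):
    """Slowly track the online encoder, as in BYOL-style bootstrapping."""
    for p_t, p_o in zip(target.parameters(), online.parameters()):
        p_t.mul_(tau).add_((1.0 - tau) * p_o)

def intrinsic_reward(predictor, target_encoder, obs, action, next_obs):
    """Curiosity = error in predicting the target network's next-step embedding."""
    pred = F.normalize(predictor(obs, action), dim=-1)
    with torch.no_grad():
        tgt = F.normalize(target_encoder(next_obs), dim=-1)
    return (pred - tgt).pow(2).sum(dim=-1)  # high error => explore more here
```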
- Long-Term Exploration in Persistent MDPs [68.8204255655161]
We propose an exploration method called Rollback-Explore (RbExplore), which utilizes the concept of the persistent Markov decision process.
We test our algorithm in the hard-exploration Prince of Persia game, without rewards or domain knowledge.
arXiv Detail & Related papers (2021-09-21T13:47:04Z)
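Since a persistent MDP lets the simulator be saved and restored, the agent can revisit saved states directly instead of replaying trajectories. A minimal sketch, assuming hypothetical `env.save_state()`/`env.restore_state()` emulator hooks (not a standard Gym API):

```python
import random

def rbexplore_step(env, archive, explore_steps=50):
    """One rollback-and-explore iteration in a save/restore-capable emulator."""
    snapshot, obs = random.choice(archive)       # pick an archived state
    env.restore_state(snapshot)                  # roll back to it directly
    for _ in range(explore_steps):
        obs, _, terminated, truncated, _ = env.step(env.action_space.sample())
        archive.append((env.save_state(), obs))  # archive newly reached states
        if terminated or truncated:
            break
```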
This list is automatically generated from the titles and abstracts of the papers on this site.