Curious Replay for Model-based Adaptation
- URL: http://arxiv.org/abs/2306.15934v1
- Date: Wed, 28 Jun 2023 05:34:53 GMT
- Title: Curious Replay for Model-based Adaptation
- Authors: Isaac Kauvar, Chris Doyle, Linqi Zhou, Nick Haber
- Abstract summary: We present Curious Replay, a form of prioritized experience replay tailored to model-based agents.
Agents using Curious Replay exhibit improved performance in an exploration paradigm inspired by animal behavior.
DreamerV3 with Curious Replay surpasses state-of-the-art performance on the Crafter benchmark.
- Score: 3.9981390090442686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Agents must be able to adapt quickly as an environment changes. We find that
existing model-based reinforcement learning agents are unable to do this well,
in part because of how they use past experiences to train their world model.
Here, we present Curious Replay -- a form of prioritized experience replay
tailored to model-based agents through use of a curiosity-based priority
signal. Agents using Curious Replay exhibit improved performance in an
exploration paradigm inspired by animal behavior and on the Crafter benchmark.
DreamerV3 with Curious Replay surpasses state-of-the-art performance on
Crafter, achieving a mean score of 19.4 that substantially improves on the
previous high score of 14.5 by DreamerV3 with uniform replay, while also
maintaining similar performance on the DeepMind Control Suite. Code for Curious
Replay is available at https://github.com/AutonomousAgentsLab/curiousreplay
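
The abstract describes prioritized replay driven by a curiosity signal but gives no formula. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the curiosity signal is the magnitude of a world-model prediction error, and the class name `CuriosityPrioritizedBuffer`, its methods, and the `eps` parameter are hypothetical names chosen for illustration.

```python
import random


class CuriosityPrioritizedBuffer:
    """Toy replay buffer that samples items in proportion to a curiosity score.

    Assumption: the score is |model error| + eps. The paper's actual priority
    signal may differ; see the authors' repository for the real method.
    """

    def __init__(self, capacity, initial_priority=1.0, seed=None):
        self.capacity = capacity
        self.initial_priority = initial_priority
        self.items = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, transition):
        # Evict the oldest item when full (indices shift down by one).
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        # New, never-trained-on items start with a fixed priority.
        self.items.append(transition)
        self.priorities.append(self.initial_priority)

    def sample(self, batch_size):
        # Draw indices with replacement, weighted by current priority.
        indices = self.rng.choices(
            range(len(self.items)), weights=self.priorities, k=batch_size
        )
        return indices, [self.items[i] for i in indices]

    def update_priorities(self, indices, model_errors, eps=1e-3):
        # After a training step, items the model predicts poorly
        # become more likely to be replayed.
        for i, err in zip(indices, model_errors):
            self.priorities[i] = abs(err) + eps
```

In use, an agent would call `sample`, train its world model on the batch, and pass the per-item losses back through `update_priorities`. Uniform replay corresponds to never updating the priorities.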
 
      
        Related papers
        - Replay Can Provably Increase Forgetting [24.538643224479515]
 A critical challenge for continual learning is forgetting, where the performance on previously learned tasks decreases as new tasks are introduced.
One of the commonly used techniques to mitigate forgetting, sample replay, has been shown empirically to reduce forgetting.
We show that even in a noiseless setting, forgetting can be non-monotonic with respect to the number of replay samples.
 arXiv  Detail & Related papers  (2025-06-04T18:46:23Z)
- Prioritized Generative Replay [121.83947140497655]
 We propose a prioritized, parametric version of an agent's memory, using generative models to capture online experience.
This paradigm enables densification of past experience, with new generations that benefit from the generative model's generalization capacity.
We show this recipe can be instantiated using conditional diffusion models and simple relevance functions.
 arXiv  Detail & Related papers  (2024-10-23T17:59:52Z)
- Brain-Like Replay Naturally Emerges in Reinforcement Learning Agents [3.9276584971242303]
 We develop a modular reinforcement learning model that could generate replay.
We prove that replay generated in this way helps complete the task.
Our design avoids complex assumptions and enables replay to emerge naturally within a task-optimized paradigm.
 arXiv  Detail & Related papers  (2024-02-02T14:55:51Z)
- Adiabatic replay for continual learning [138.7878582237908]
 Generative replay spends an increasing amount of time just re-learning what is already known.
We propose a replay-based CL strategy that we term adiabatic replay (AR).
We verify experimentally that AR is superior to state-of-the-art deep generative replay using VAEs.
 arXiv  Detail & Related papers  (2023-03-23T10:18:06Z)
- Model-Free Generative Replay for Lifelong Reinforcement Learning: Application to Starcraft-2 [5.239932780277599]
 Generative replay (GR) is a biologically-inspired replay mechanism that augments learning experiences with self-labelled examples.
We present a version of GR for LRL that satisfies two desiderata: (a) introspective density modelling of the latent representations of policies learned using deep RL, and (b) model-free end-to-end learning.
 arXiv  Detail & Related papers  (2022-08-09T22:00:28Z)
- Replay For Safety [51.11953997546418]
 In experience replay, past transitions are stored in a memory buffer and re-used during learning.
We show that using an appropriate biased sampling scheme can allow us to achieve a safe policy.
 arXiv  Detail & Related papers  (2021-12-08T11:10:57Z)
- An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games [79.23847247132345]
 This work investigates how well an artificial agent can benefit from playing guessing games when later asked to perform on novel NLP downstream tasks such as Visual Question Answering (VQA).
We propose two ways to exploit playing guessing games: 1) a supervised learning scenario in which the agent learns to mimic successful guessing games and 2) a novel way for an agent to play by itself, called Self-play via Iterated Experience Learning (SPIEL).
 arXiv  Detail & Related papers  (2021-01-31T10:30:48Z)
- Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy [48.8675653453076]
 We introduce Lucid Dreaming for Experience Replay (LiDER), a framework that allows replay experiences to be refreshed by leveraging the agent's current policy.
LiDER consistently improves performance over the baseline in six Atari 2600 games.
 arXiv  Detail & Related papers  (2020-09-29T02:54:11Z)
- Revisiting Fundamentals of Experience Replay [91.24213515992595]
 We present a systematic and extensive analysis of experience replay in Q-learning methods.
We focus on two fundamental properties: the replay capacity and the ratio of learning updates to experience collected.
 arXiv  Detail & Related papers  (2020-07-13T21:22:17Z)
- Double Prioritized State Recycled Experience Replay [3.42658286826597]
 We develop a method called double-prioritized state-recycled (DPSR) experience replay.
We used this method in Deep Q-Networks (DQN), and achieved a state-of-the-art result.
 arXiv  Detail & Related papers  (2020-07-08T08:36:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.