Replay Buffer with Local Forgetting for Adapting to Local Environment
  Changes in Deep Model-Based Reinforcement Learning
- URL: http://arxiv.org/abs/2303.08690v2
- Date: Wed, 27 Sep 2023 16:45:15 GMT
- Title: Replay Buffer with Local Forgetting for Adapting to Local Environment
  Changes in Deep Model-Based Reinforcement Learning
- Authors: Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Harm van
  Seijen, Sarath Chandar
- Abstract summary: We show that a simple variation of the first-in-first-out replay buffer overcomes the stale-data limitation of the traditional replay buffer.
We demonstrate this by applying our replay-buffer variation to a deep version of the classical Dyna method.
- Score: 20.92599229976769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   One of the key behavioral characteristics used in neuroscience to determine
whether the subject of study -- be it a rodent or a human -- exhibits
model-based learning is effective adaptation to local changes in the
environment, a particular form of adaptivity that is the focus of this work. In
reinforcement learning, however, recent work has shown that modern deep
model-based reinforcement-learning (MBRL) methods adapt poorly to local
environment changes. An explanation for this mismatch is that MBRL methods are
typically designed with sample-efficiency on a single task in mind and the
requirements for effective adaptation are substantially higher, both in terms
of the learned world model and the planning routine. One particularly
challenging requirement is that the learned world model has to be sufficiently
accurate throughout relevant parts of the state-space. This is challenging for
deep-learning-based world models due to catastrophic forgetting. And while a
replay buffer can mitigate the effects of catastrophic forgetting, the
traditional first-in-first-out replay buffer precludes effective adaptation due
to maintaining stale data. In this work, we show that a conceptually simple
variation of this traditional replay buffer is able to overcome this
limitation. By removing from the buffer only those samples that lie in the local
neighbourhood of the newly observed samples, deep world models can be built
that maintain their accuracy across the state-space, while also being able to
effectively adapt to local changes in the reward function. We demonstrate this
by applying our replay-buffer variation to a deep version of the classical Dyna
method, as well as to recent methods such as PlaNet and DreamerV2,
showing that deep model-based methods can also adapt effectively to
local changes in the environment.
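
The local-forgetting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the local neighbourhood is defined, so the sketch assumes low-dimensional states, Euclidean distance, and a fixed `radius`. The class name `LocalForgettingBuffer` and its parameters are hypothetical.

```python
import math
import random


class LocalForgettingBuffer:
    """FIFO replay buffer that also evicts stored samples near each new sample.

    Assumption (not from the paper): "local neighbourhood" means Euclidean
    distance between states no greater than a fixed radius.
    """

    def __init__(self, capacity, radius):
        self.capacity = capacity
        self.radius = radius
        self.items = []  # (state, action, reward, next_state), oldest first

    def add(self, state, action, reward, next_state):
        state = tuple(state)
        # Local forgetting: drop stored transitions whose state lies within
        # `radius` of the new state, since they may now be stale.
        self.items = [
            t for t in self.items if math.dist(t[0], state) > self.radius
        ]
        self.items.append((state, action, reward, tuple(next_state)))
        # Ordinary FIFO eviction still applies when the buffer is full.
        if len(self.items) > self.capacity:
            self.items.pop(0)

    def sample(self, batch_size, rng=random):
        # Uniform sampling with replacement over the stored transitions.
        return [rng.choice(self.items) for _ in range(batch_size)]


buf = LocalForgettingBuffer(capacity=100, radius=0.5)
buf.add((0.0, 0.0), 0, 1.0, (0.1, 0.0))   # old reward near the origin
buf.add((5.0, 5.0), 1, 0.0, (5.1, 5.0))   # sample from a distant region
buf.add((0.1, 0.0), 0, -1.0, (0.2, 0.0))  # reward near the origin has changed
print([t[2] for t in buf.items])
```

In this sketch, the stale transition near the origin is removed when the new one arrives, while the distant transition is retained, so a world model trained on the buffer would see only up-to-date data for the changed region and still see data for the rest of the state-space.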
 
      
Related papers
- LifelongPR: Lifelong knowledge fusion for point cloud place recognition based on replay and prompt learning [15.464706470200337]
 Point cloud place recognition (PCPR) plays a crucial role in photogrammetry and robotics applications.
Existing PCPR models often suffer from catastrophic forgetting, leading to significant performance degradation.
We propose LifelongPR, a novel continual learning framework for PCPR, which effectively extracts and fuses knowledge from sequential point cloud data.
 arXiv  Detail & Related papers  (2025-07-14T08:13:33Z)
- Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation [67.80294336559574]
 Continual Test Time Adaptation (CTTA) is a task that requires a source pre-trained model to continually adapt to new scenarios.
We propose a novel pipeline, Orthogonal Projection Subspace to aggregate online Prior-knowledge, dubbed OoPk.
 arXiv  Detail & Related papers  (2025-06-23T18:17:39Z)
- 5G-DIL: Domain Incremental Learning with Similarity-Aware Sampling for Dynamic 5G Indoor Localization [4.63911391947225]
 This paper introduces a domain incremental learning (DIL) approach for dynamic 5G indoor localization, called 5G-DIL, enabling rapid adaptation to environmental changes.
We present a novel similarity-aware sampling technique based on the Chebyshev distance, designed to efficiently select specific exemplars from the previous environment.
Our approach is adaptable to real-world non-line-of-sight propagation scenarios and achieves an MAE positioning error of 0.261 meters, even under dynamic environmental conditions.
 arXiv  Detail & Related papers  (2025-05-23T09:54:58Z)
- Replay to Remember: Retaining Domain Knowledge in Streaming Language Models [0.0]
 Continual learning in large language models (LLMs) typically encounters the critical challenge of catastrophic forgetting.
We demonstrate a method combining LoRA and a minimal replay mechanism in a realistic streaming setting.
Our experiments reveal that while catastrophic forgetting naturally occurs, even minimal replay significantly stabilizes and partially restores domain-specific knowledge.
 arXiv  Detail & Related papers  (2025-04-24T17:56:22Z)
- SPARTAN: A Sparse Transformer Learning Local Causation [63.29645501232935]
 Causal structures play a central role in world models that flexibly adapt to changes in the environment.
We present the SPARse TrANsformer World model (SPARTAN), a Transformer-based world model that learns local causal structures between entities in a scene.
By applying sparsity regularisation on the attention pattern between object-factored tokens, SPARTAN identifies sparse local causal models that accurately predict future object states.
 arXiv  Detail & Related papers  (2024-11-11T11:42:48Z)
- Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation [70.43845294145714]
 Relieving the reliance of neural network training on a global back-propagation (BP) has emerged as a notable research topic.
We propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules.
Our method can be integrated into both local-BP and BP-free settings.
 arXiv  Detail & Related papers  (2024-06-07T19:10:31Z)
- Partial Models for Building Adaptive Model-Based Reinforcement Learning Agents [37.604622216020765]
 We show that the conceptually simple idea of partial models can allow deep model-based agents to overcome this challenge.
We demonstrate this by showing that the use of partial models in agents such as deep Dyna-Q, PlaNet and Dreamer can allow for them to effectively adapt to the local changes in their environments.
 arXiv  Detail & Related papers  (2024-05-27T07:46:36Z)
- Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition [72.35438297011176]
 We propose a novel method to realize seamless adaptation of pre-trained models for visual place recognition (VPR).
Specifically, to obtain both global and local features that focus on salient landmarks for discriminating places, we design a hybrid adaptation method.
Experimental results show that our method outperforms the state-of-the-art methods with less training data and training time.
 arXiv  Detail & Related papers  (2024-02-22T12:55:01Z)
- Towards Continual Learning Desiderata via HSIC-Bottleneck
  Orthogonalization and Equiangular Embedding [55.107555305760954]
 We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, even with a zero exemplar buffer and 1.02x the base model.
 arXiv  Detail & Related papers  (2024-01-17T09:01:29Z)
- Normalization Perturbation: A Simple Domain Generalization Method for
  Real-World Domain Shifts [133.99270341855728]
 Real-world domain styles can vary substantially due to environment changes and sensor noises.
Deep models only know the training domain style.
We propose Normalization Perturbation to overcome this domain style overfitting problem.
 arXiv  Detail & Related papers  (2022-11-08T17:36:49Z)
- PointFix: Learning to Fix Domain Bias for Robust Online Stereo
  Adaptation [67.41325356479229]
 We propose to incorporate an auxiliary point-selective network into a meta-learning framework, called PointFix.
In a nutshell, our auxiliary network learns to fix local variants intensively by effectively back-propagating local information through the meta-gradient.
This network is model-agnostic, so can be used in any kind of architectures in a plug-and-play manner.
 arXiv  Detail & Related papers  (2022-07-27T07:48:29Z)
- Towards Evaluating Adaptivity of Model-Based Reinforcement Learning
  Methods [25.05409184943328]
 We show that well-known model-based methods perform poorly in their ability to adapt to local environmental changes.
We identify elements that hurt adaptive behavior and link these to underlying techniques frequently used in deep model-based RL.
We provide insights into the challenges of building an adaptive nonlinear model-based method.
 arXiv  Detail & Related papers  (2022-04-25T06:45:16Z)
- Learning Neural Models for Natural Language Processing in the Face of
  Distributional Shift [10.990447273771592]
 The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications.
It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time.
This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information.
It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
 arXiv  Detail & Related papers  (2021-09-03T14:29:20Z)