Efficient Adaptation of Reinforcement Learning Agents to Sudden Environmental Change
- URL: http://arxiv.org/abs/2505.10330v1
- Date: Thu, 15 May 2025 14:19:01 GMT
- Title: Efficient Adaptation of Reinforcement Learning Agents to Sudden Environmental Change
- Authors: Jonathan Clifford Balloch
- Abstract summary: Real-world autonomous decision-making systems must operate in environments that change over time. Deep reinforcement learning has shown an impressive ability to learn optimal policies in stationary environments. This dissertation demonstrates that efficient online adaptation requires two key capabilities.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world autonomous decision-making systems, from robots to recommendation engines, must operate in environments that change over time. While deep reinforcement learning (RL) has shown an impressive ability to learn optimal policies in stationary environments, most methods are data intensive and assume a world that does not change between training and test time. As a result, conventional RL methods struggle to adapt when conditions change. This poses a fundamental challenge: how can RL agents efficiently adapt their behavior when encountering novel environmental changes during deployment without catastrophically forgetting useful prior knowledge? This dissertation demonstrates that efficient online adaptation requires two key capabilities: (1) prioritized exploration and sampling strategies that help identify and learn from relevant experiences, and (2) selective preservation of prior knowledge through structured representations that can be updated without disruption to reusable components.
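The first capability the abstract names, prioritized sampling of relevant experiences, can be illustrated with a minimal TD-error-weighted replay buffer. This is a generic sketch of the prioritized-replay idea, not the dissertation's actual implementation; the class name, the `alpha` parameter, and the toy transitions are all illustrative assumptions.

```python
import random

class PrioritizedBuffer:
    """Replay buffer that samples transitions in proportion to TD error."""

    def __init__(self, alpha=0.6):
        self.transitions = []  # stored (state, action, reward, next_state) tuples
        self.priorities = []   # one sampling priority per transition
        self.alpha = alpha     # 0 = uniform sampling, 1 = fully proportional

    def add(self, transition, td_error):
        # A small epsilon keeps zero-error transitions sampleable.
        self.transitions.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, k):
        # Transitions with larger TD error -- e.g. those collected just after
        # a sudden environment change -- are drawn more often.
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        return random.choices(self.transitions, weights=weights, k=k)
```

After a sudden change, newly collected transitions carry large TD errors and therefore dominate the sampled batches, which is one simple mechanism for focusing learning on the changed parts of the environment.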
Related papers
- Training a Generally Curious Agent [86.84089201249104]
We present PAPRIKA, a fine-tuning approach that enables language models to develop general decision-making capabilities. Experimental results show that models fine-tuned with PAPRIKA can effectively transfer their learned decision-making capabilities to entirely unseen tasks. These results suggest a promising path towards AI systems that can autonomously solve novel sequential decision-making problems.
arXiv Detail & Related papers (2025-02-24T18:56:58Z)
- Mind the Gap: Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning [15.619925926862235]
GAP is a generalizable autonomous pentesting framework. It aims to realize efficient policy training in realistic environments. It also trains agents capable of drawing inferences about other cases from one instance.
arXiv Detail & Related papers (2024-12-05T11:24:27Z)
- Learning fast changing slow in spiking neural networks [3.069335774032178]
Reinforcement learning (RL) faces substantial challenges when applied to real-life problems.
Life-long learning machines must resolve the plasticity-stability paradox.
Striking a balance between acquiring new knowledge and maintaining stability is crucial for artificial agents.
arXiv Detail & Related papers (2024-01-25T12:03:10Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- Improving adaptability to new environments and removing catastrophic forgetting in Reinforcement Learning by using an eco-system of agents [3.5786621294068373]
Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task due to typical over-fitting on the training environment.
There is a risk of catastrophic forgetting, where the performance on previously seen environments is seriously hampered.
This paper proposes a novel approach that exploits an ecosystem of agents to address both concerns.
arXiv Detail & Related papers (2022-04-13T17:52:54Z)
- Reinforcement Learning in Time-Varying Systems: an Empirical Study [10.822467081722152]
We develop a framework for addressing the challenges introduced by non-stationarity.
Such agents must explore and learn new environments, without hurting the system's performance.
We apply our framework to two systems problems: straggler mitigation and adaptive video streaming.
arXiv Detail & Related papers (2022-01-14T17:04:11Z)
- Transfer learning with causal counterfactual reasoning in Decision Transformers [5.672132510411465]
We study the problem of transfer learning under changes in the environment dynamics.
Specifically, we use the Decision Transformer architecture to distill a new policy on the new environment.
We show that this mechanism can bootstrap a successful policy on the target environment while retaining most of the reward.
arXiv Detail & Related papers (2021-10-27T11:23:27Z)
- One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL [142.36621929739707]
We show that learning diverse behaviors for accomplishing a task can lead to behavior that generalizes to varying environments.
By identifying multiple solutions for the task in a single environment during training, our approach can generalize to new situations.
arXiv Detail & Related papers (2020-10-27T17:41:57Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
- Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning [109.77163932886413]
We show how to adapt vision-based robotic manipulation policies to new variations by fine-tuning via off-policy reinforcement learning.
This adaptation uses less than 0.2% of the data necessary to learn the task from scratch.
We find that our approach of adapting pre-trained policies leads to substantial performance gains over the course of fine-tuning.
arXiv Detail & Related papers (2020-04-21T17:57:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.