Reinforcement Learning in Time-Varying Systems: an Empirical Study
- URL: http://arxiv.org/abs/2201.05560v1
- Date: Fri, 14 Jan 2022 17:04:11 GMT
- Title: Reinforcement Learning in Time-Varying Systems: an Empirical Study
- Authors: Pouya Hamadanian, Malte Schwarzkopf, Siddartha Sen, Mohammad Alizadeh
- Abstract summary: We develop a framework for addressing the challenges introduced by non-stationarity.
Such agents must explore and learn new environments without hurting the system's performance.
We apply our framework to two systems problems: straggler mitigation and adaptive video streaming.
- Score: 10.822467081722152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research has turned to Reinforcement Learning (RL) to solve
challenging decision problems, as an alternative to hand-tuned heuristics. RL
can learn good policies without the need for modeling the environment's
dynamics. Despite this promise, RL remains an impractical solution for many
real-world systems problems. A particularly challenging case occurs when the
environment changes over time, i.e. it exhibits non-stationarity. In this work,
we characterize the challenges introduced by non-stationarity and develop a
framework for addressing them to train RL agents in live systems. Such agents
must explore and learn new environments, without hurting the system's
performance, and remember them over time. To this end, our framework (1)
identifies different environments encountered by the live system, (2) explores
and trains a separate expert policy for each environment, and (3) employs
safeguards to protect the system's performance. We apply our framework to two
systems problems: straggler mitigation and adaptive video streaming, and
evaluate it against a variety of alternative approaches using real-world and
synthetic data. We show that each component of our framework is necessary to
cope with non-stationarity.
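As a rough illustration of how the three components named in the abstract could fit together, the following is a minimal sketch assuming hypothetical `make_expert` and `fallback_policy` callables and a simple reward-statistics rule for detecting environment changes; it is not the paper's implementation.

```python
# Minimal sketch (assumed names, not the paper's code) of the framework's three parts:
# (1) identify the current environment, (2) keep one expert policy per environment,
# (3) safeguard system performance with a trusted fallback policy.

import numpy as np


class NonStationaryController:
    def __init__(self, make_expert, fallback_policy, tolerance=3.0):
        self.make_expert = make_expert      # factory returning a trainable expert policy
        self.fallback = fallback_policy     # known-safe policy, e.g. a hand-tuned heuristic
        self.tolerance = tolerance          # allowed drift before declaring a new environment
        self.experts = []                   # one expert per identified environment
        self.reward_stats = []              # (mean, std) of rewards seen in each environment

    def identify(self, recent_rewards):
        """Return the index of the matching environment, registering a new one if needed."""
        mean = float(np.mean(recent_rewards))
        for i, (m, s) in enumerate(self.reward_stats):
            if abs(mean - m) <= self.tolerance * s:
                return i
        self.experts.append(self.make_expert())   # unseen environment: train a fresh expert
        self.reward_stats.append((mean, float(np.std(recent_rewards)) + 1e-6))
        return len(self.experts) - 1

    def act(self, obs, recent_rewards):
        """recent_rewards: a non-empty window of the system's latest rewards."""
        env_id = self.identify(recent_rewards)
        mean, std = self.reward_stats[env_id]
        # Safeguard: if performance has collapsed relative to this environment's history,
        # hand control back to the fallback policy instead of the (possibly exploring) expert.
        if float(np.mean(recent_rewards)) < mean - self.tolerance * std:
            return self.fallback(obs)
        return self.experts[env_id].act(obs)
```

In a live system the identification step would rely on richer signals than a short reward window and each expert would keep training online; the sketch only shows how identification, per-environment experts, and the safeguard interact.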
Related papers
- Efficient Imitation Learning with Conservative World Models [54.52140201148341]
We tackle the problem of policy learning from expert demonstrations without a reward function.
We re-frame imitation learning as a fine-tuning problem, rather than a pure reinforcement learning one.
arXiv Detail & Related papers (2024-05-21T20:53:18Z)
- HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
- System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games [34.3277278308442]
Continual/lifelong learning (LL) involves minimizing forgetting of old tasks while maximizing a model's capability to learn new tasks.
We introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components.
We describe a case study that demonstrates how multiple independently-developed LL components can be integrated into a single realized system.
arXiv Detail & Related papers (2022-12-08T23:32:57Z)
- Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems [0.6690874707758508]
Deep Neural Networks (DNNs) have been widely used to perform real-world tasks in cyber-physical systems such as Autonomous Driving Systems (ADS).
Ensuring the correct behavior of such DNN-Enabled Systems (DES) is a crucial topic.
Online testing is one of the promising modes for testing such systems with their application environments (simulated or real) in a closed loop.
We present MORLOT, a novel online testing approach to address these challenges by combining Reinforcement Learning (RL) and many-objective search.
arXiv Detail & Related papers (2022-10-27T13:53:37Z)
- Cross apprenticeship learning framework: Properties and solution approaches [0.880899367147235]
This work formulates an optimization problem in which an optimal policy for each environment is sought while ensuring that all policies remain close to one another.
Since the problem is non-convex, we provide a convex outer approximation.
arXiv Detail & Related papers (2022-09-06T11:45:27Z)
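The cross apprenticeship entry above couples per-environment policies through a closeness requirement; a generic way to write down that kind of problem (a hedged sketch, not necessarily the paper's exact formulation) is:

```latex
\begin{align*}
\max_{\pi_1,\dots,\pi_N} \quad & \sum_{i=1}^{N} J_i(\pi_i)
  && \text{maximize each policy's return in its own environment} \\
\text{subject to} \quad & d(\pi_i, \pi_j) \le \epsilon \quad \forall\, i \neq j
  && \text{all policies remain close to one another}
\end{align*}
```

Here J_i denotes the expected return of policy pi_i in environment i, d is a distance between policies, and epsilon bounds how far they may differ; because returns and this constraint set are generally non-convex in the policy parameters, a convex outer approximation of the feasible set can be optimized instead.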
- Improving adaptability to new environments and removing catastrophic forgetting in Reinforcement Learning by using an eco-system of agents [3.5786621294068373]
Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task due to typical over-fitting on the training environment.
There is a risk of catastrophic forgetting, where the performance on previously seen environments is seriously hampered.
This paper proposes a novel approach that exploits an ecosystem of agents to address both concerns.
arXiv Detail & Related papers (2022-04-13T17:52:54Z)
- L2Explorer: A Lifelong Reinforcement Learning Assessment Environment [49.40779372040652]
Reinforcement learning solutions tend to generalize poorly when exposed to new tasks outside of the data distribution they are trained on.
We introduce a framework for continual reinforcement-learning development and assessment using Lifelong Learning Explorer (L2Explorer).
L2Explorer is a new, Unity-based, first-person 3D exploration environment that can be continuously reconfigured to generate a range of tasks and task variants structured into complex evaluation curricula.
arXiv Detail & Related papers (2022-03-14T19:20:26Z)
- Robust Policy Learning over Multiple Uncertainty Sets [91.67120465453179]
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
We develop an algorithm that enjoys the benefits of both system identification and robust RL.
arXiv Detail & Related papers (2022-02-14T20:06:28Z)
- Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments [89.04823188871906]
Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments.
Most of the existing simulators rely on randomly generating the environments.
We introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers.
arXiv Detail & Related papers (2021-06-18T21:49:46Z)
- Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
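The last entry describes conditioning the policy on a latent representation of the environment inferred from current and past experience; below is a minimal PyTorch-style sketch of that idea, with an assumed GRU encoder and MLP policy rather than the paper's actual architecture.

```python
# Hypothetical sketch: encode a window of recent experience into a latent z and
# condition the policy on it, so the agent can adapt as the environment drifts.
# The encoder, policy, and dimensions below are illustrative placeholders.

import torch
import torch.nn as nn


class LatentConditionedPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, latent_dim=8, hidden=64):
        super().__init__()
        # Encoder: summarise recent (obs, action, reward) transitions into a latent z.
        self.encoder = nn.GRU(input_size=obs_dim + act_dim + 1,
                              hidden_size=latent_dim, batch_first=True)
        # Policy: act on the current observation together with the inferred latent.
        self.policy = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, experience_window):
        # experience_window: (batch, time, obs_dim + act_dim + 1) of recent transitions.
        _, z = self.encoder(experience_window)           # final hidden state: (1, batch, latent_dim)
        z = z.squeeze(0)
        return self.policy(torch.cat([obs, z], dim=-1))  # action logits / scores
```

During training, z would typically be regularised (for example as a variational latent) so that it tracks the drifting environment rather than memorising individual episodes.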