A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments
- URL: http://arxiv.org/abs/2005.10619v1
- Date: Tue, 19 May 2020 09:42:42 GMT
- Title: A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments
- Authors: Sindhu Padakandla
- Abstract summary: Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing and robotics.
Real-world complications of many tasks arising in these domains make them difficult to solve with the basic assumptions underlying classical RL algorithms.
This paper provides a survey of RL methods developed for handling dynamically varying environment models.
A representative collection of these algorithms is discussed in detail in this work, along with their categorization and relative merits and demerits.
- Score: 1.713291434132985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing and robotics. The real-world complications of many tasks arising in these domains make them difficult to solve with the basic assumptions underlying classical RL algorithms. RL agents in these applications often need to react and adapt to changing operating conditions. A significant part of research on single-agent RL techniques focuses on developing algorithms for the case where the underlying assumption of a stationary environment model is relaxed. This paper provides a survey of RL methods developed for handling dynamically varying environment models. The goal of methods not limited by the stationarity assumption is to help autonomous agents adapt to varying operating conditions. This is possible either by minimizing the rewards lost while the RL agent learns, or by finding a suitable policy for the RL agent which leads to efficient operation of the underlying system. A representative collection of these algorithms is discussed in detail in this work, along with their categorization and their relative merits and demerits. Additionally, we review works which are tailored to application domains. Finally, we discuss future enhancements for this field.
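To make the non-stationarity problem concrete, below is a minimal, hypothetical sketch (not taken from the paper) of value estimation on a two-armed bandit whose reward probabilities flip mid-run. A constant step size discounts old experience and lets the agent track the change, while the classical sample-average update, which implicitly assumes a stationary environment, keeps trusting stale data.

```python
import random

def run_bandit(alpha=None, steps=4000, eps=0.1, seed=0):
    """Epsilon-greedy value estimation on a 2-armed bandit whose reward
    probabilities flip at the midpoint (a non-stationary environment).
    alpha=None uses sample averages (stationarity assumption); a constant
    alpha (e.g. 0.1) forgets old data and can track the change."""
    rng = random.Random(seed)
    q = [0.0, 0.0]          # value estimate per arm
    counts = [0, 0]         # pulls per arm, for sample-average steps
    probs = [0.9, 0.1]      # arm 0 is best before the change point
    total = 0.0
    for t in range(steps):
        if t == steps // 2:
            probs = [0.1, 0.9]   # the environment changes: arm 1 becomes best
        a = rng.randrange(2) if rng.random() < eps else q.index(max(q))
        r = 1.0 if rng.random() < probs[a] else 0.0
        counts[a] += 1
        step = alpha if alpha is not None else 1.0 / counts[a]
        q[a] += step * (r - q[a])    # incremental value update
        total += r
    return total / steps

print("sample-average agent:", run_bandit(alpha=None))   # slow to recover after the switch
print("constant-step agent :", run_bandit(alpha=0.1))    # tracks the switch
```

Several families of methods surveyed in the paper go further than this forgetting heuristic, e.g. by explicitly detecting the point at which the environment model switches.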
Related papers
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
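As a rough, hypothetical illustration of this idea (a sketch assuming known tabular dynamics, not LINVIT's actual algorithm), LLM guidance can enter value-based RL as a KL regularizer: the soft Bellman backup below maximizes E_pi[Q] - lam * KL(pi || pi_llm) in closed form, so the learned policy stays close to an LLM-suggested prior pi_llm (an assumed input here).

```python
import numpy as np

def kl_regularized_value_iteration(P, R, pi_llm, gamma=0.95, lam=0.5, iters=200):
    """Sketch: value iteration with a KL penalty toward a reference policy.
    P: transition tensor [S, A, S]; R: reward matrix [S, A];
    pi_llm: reference policy [S, A] with rows summing to 1."""
    S, A, _ = P.shape
    V = np.zeros(S)
    Q = np.zeros((S, A))
    for _ in range(iters):
        Q = R + gamma * P @ V                       # action values, shape [S, A]
        m = Q.max(axis=1)                           # max-shift for numerical stability
        # soft backup: V(s) = lam * log sum_a pi_llm(a|s) * exp(Q(s,a) / lam)
        V = m + lam * np.log((pi_llm * np.exp((Q - m[:, None]) / lam)).sum(axis=1))
    pi = pi_llm * np.exp((Q - Q.max(axis=1, keepdims=True)) / lam)
    return V, pi / pi.sum(axis=1, keepdims=True)    # policy pulled toward the prior
```

As lam shrinks this recovers the usual greedy backup; as it grows the policy collapses onto the prior, which matches the abstract's intuition that a reliable LLM prior can cut the data needed for learning.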
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates and extreme robustness under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions [2.4541568670428915]
Machine scheduling aims to optimize job assignments to machines while adhering to manufacturing rules and job specifications.
Deep Reinforcement Learning (DRL), a key component of artificial general intelligence, has shown promise in various domains like gaming and robotics.
This paper offers a comprehensive review and comparison of DRL-based approaches, highlighting their methodology, applications, advantages, and limitations.
arXiv Detail & Related papers (2023-10-04T22:45:09Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
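The two-policy scheme can be pictured with a short sketch (hypothetical Gym-style interface, not the authors' code): a guide policy rolls in for the first h steps of each episode, the learning policy completes it, and h shrinks as the learner improves.

```python
def jumpstart_episode(env, guide_policy, learner_policy, h, max_steps=200):
    """Collect one episode: the guide acts for the first h steps, then the
    learner takes over. Assumes a classic Gym-style reset()/step() API."""
    obs = env.reset()
    transitions = []
    for t in range(max_steps):
        policy = guide_policy if t < h else learner_policy
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
    return transitions

def update_horizon(h, success_rate, threshold=0.8, step=5):
    """Illustrative curriculum: shorten the guide's roll-in once the
    learner succeeds reliably from the current starting region."""
    return max(0, h - step) if success_rate >= threshold else h
```

Shrinking h gradually moves the learner's effective starting states from positions the guide has already reached back toward the true initial-state distribution.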
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Math Programming based Reinforcement Learning for Multi-Echelon Inventory Management [1.9161790404101895]
Reinforcement learning has led to considerable breakthroughs in diverse areas such as robotics, games and many others.
But the application of RL to complex real-world decision-making problems remains limited.
The characteristics of such problems make them considerably harder to solve for existing RL methods that rely on enumeration techniques to solve per-step action problems.
We show that a properly selected discretization of the underlying uncertain distribution can yield a near-optimal actor policy even with very few samples from the underlying uncertainty.
We find that PARL outperforms the commonly used base-stock policy by 44.7% and the best-performing RL method by up to 12.1% on average.
arXiv Detail & Related papers (2021-12-04T01:40:34Z)
- Reinforcement Learning with Algorithms from Probabilistic Structure Estimation [9.37335587960084]
Reinforcement learning algorithms aim to learn optimal decisions in unknown environments.
It is unknown from the outset whether or not the agent's actions will impact the environment.
Consequently, it is often not possible to determine in advance which RL algorithm is most suitable.
arXiv Detail & Related papers (2021-03-15T09:51:34Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
The discovered rule specifies both 'what to predict' (e.g., value functions) and 'how to learn from it', and is found by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- Deep Reinforcement Learning for Autonomous Driving: A Survey [0.3694429692322631]
This review summarises deep reinforcement learning (DRL) algorithms and provides a taxonomy of automated driving tasks.
It also delineates adjacent domains such as behavior cloning, imitation learning, and inverse reinforcement learning, which are related to but distinct from classical RL algorithms.
The role of simulators in training agents, and methods to validate, test, and robustify existing RL solutions, are also discussed.
arXiv Detail & Related papers (2020-02-02T18:21:22Z)
- Reinforcement Learning-based Application Autoscaling in the Cloud: A Survey [2.9751538760825085]
Reinforcement Learning (RL) has demonstrated a great potential for automatically solving decision-making problems in complex uncertain environments.
With RL, it is possible to learn transparent (no human intervention), dynamic (no static plans), and adaptable (constantly updated) resource management policies for executing applications.
Autoscaling exploits cloud elasticity to optimize the execution of applications according to given optimization criteria.
arXiv Detail & Related papers (2020-01-27T18:23:43Z)
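As a toy illustration of such a policy (a hypothetical sketch, not drawn from the survey), a tabular Q-learning autoscaler can map a discretized (load, replicas) state to scale-down/hold/scale-up actions, with a reward that trades off SLA violations against resource cost.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)        # remove a replica, hold, add a replica
q = defaultdict(float)      # Q-table keyed by ((load_bucket, replicas), action)
rng = random.Random(0)

def choose_action(state, eps=0.1):
    """Epsilon-greedy choice among the scaling actions."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step Q-learning update. The reward would encode the optimization
    criteria, e.g. -(sla_violation_penalty + cost_per_replica * replicas)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
```

Such a policy is transparent in the sense above because it is learned and applied without human intervention, and adaptable because every observed transition keeps updating the table.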