On the impact of MDP design for Reinforcement Learning agents in
Resource Management
- URL: http://arxiv.org/abs/2109.03202v1
- Date: Tue, 7 Sep 2021 17:13:11 GMT
- Title: On the impact of MDP design for Reinforcement Learning agents in
Resource Management
- Authors: Renato Luiz de Freitas Cunha, Luiz Chaimowicz
- Abstract summary: We compare and contrast four different MDP variations, discussing their computational requirements and impacts on agent performance.
We conclude by showing that, when using Multi-Layer Perceptrons as the approximation function, a compact state representation allows the transfer of agents between environments.
- Score: 0.8223798883838329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in applying Reinforcement Learning to Resource
Management presents MDPs without a deeper analysis of how design decisions
impact agent performance. In this paper, we compare and contrast four
different MDP variations, discussing their computational requirements and
their impacts on agent performance by means of an empirical analysis. We
conclude by showing that, in our experiments, when using Multi-Layer
Perceptrons as the approximation function, a compact state representation
allows the transfer of agents between environments, and that transferred
agents perform well, outperforming specialized agents in 80% of the tested
scenarios even without retraining.
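To make the idea of a compact state representation concrete, here is a minimal sketch (in Python with NumPy) of a fixed-size state vector and a small MLP policy for a job-scheduling environment. The feature choices, network sizes, and action set are illustrative assumptions rather than the paper's exact design; the point is that a state whose length does not depend on cluster size lets the same network weights be reused in a different environment.

```python
# Illustrative sketch (not the paper's exact design): a compact, fixed-size
# state representation for a job-scheduling MDP and a small MLP policy.
# Because the feature vector has the same length regardless of cluster size,
# the same network can be queried in differently sized environments.
import numpy as np

def compact_state(cluster_utilization, queue_length, head_job_size, head_job_estimate):
    """Summarize the environment as a fixed-length vector.

    Hypothetical features: overall utilization, normalized queue length,
    and the resource request / runtime estimate of the job at the queue head.
    """
    return np.array([
        cluster_utilization,       # fraction of processors in use, in [0, 1]
        queue_length / 100.0,      # queue size normalized by an assumed cap
        head_job_size,             # fraction of the cluster the head job requests
        head_job_estimate,         # normalized runtime estimate of the head job
    ], dtype=np.float32)

class MLPPolicy:
    """Two-layer perceptron mapping the compact state to action logits."""

    def __init__(self, state_dim=4, hidden=32, n_actions=2, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def act(self, state):
        h = np.tanh(state @ self.w1 + self.b1)
        logits = h @ self.w2 + self.b2
        return int(np.argmax(logits))  # e.g. 0 = wait, 1 = schedule head job

# The same (possibly pre-trained) policy can be applied to two clusters of
# different sizes, because both are summarized into the same 4-dim state.
policy = MLPPolicy()
small_cluster = compact_state(0.30, 12, 0.25, 0.10)
large_cluster = compact_state(0.85, 57, 0.05, 0.60)
print(policy.act(small_cluster), policy.act(large_cluster))
```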
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z)
- Towards Cost Sensitive Decision Making [14.279123976398926]
In this work, we consider RL models that may actively acquire features from the environment to improve the decision quality and certainty.
We propose the Active-Acquisition POMDP and identify two types of acquisition processes for different application domains.
In order to assist the agent in the actively-acquired partially-observed environment and alleviate the exploration-exploitation dilemma, we develop a model-based approach.
arXiv Detail & Related papers (2024-10-04T19:48:23Z)
- Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement [50.481380478458945]
The Iterative Step-Level Process Refinement (IPR) framework provides detailed step-by-step guidance to enhance agent training.
Our experiments on three complex agent tasks demonstrate that our framework outperforms a variety of strong baselines.
arXiv Detail & Related papers (2024-06-17T03:29:13Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
- Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents are agents that employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs [47.73837217824527]
We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.
Our main contribution is to introduce the Deep Averagers with Costs MDP (DAC-MDP) and to investigate its solutions for offline RL.
arXiv Detail & Related papers (2020-10-18T00:11:45Z)
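As a rough illustration of the general idea behind the DeepAveragers entry above (offline RL by solving a finite MDP derived from a static dataset), the sketch below builds a small tabular MDP from logged transitions and solves it with value iteration. It is not the DAC-MDP construction itself; the toy dataset, discount factor, and empirical-count aggregation are assumptions made for the example.

```python
# Minimal sketch of offline RL via a derived finite MDP (illustrative only;
# not the DAC-MDP construction): treat each observed state as a tabular state,
# estimate transitions and rewards from logged (s, a, r, s') tuples, then run
# value iteration on the resulting finite MDP.
from collections import defaultdict

def solve_derived_mdp(dataset, gamma=0.95, iters=200):
    # Aggregate logged transitions into empirical counts and summed rewards.
    counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': count}
    rewards = defaultdict(float)                    # (s, a) -> summed reward
    states, actions = set(), set()
    for s, a, r, s_next in dataset:
        counts[(s, a)][s_next] += 1
        rewards[(s, a)] += r
        states.update([s, s_next])
        actions.add(a)

    # Value iteration over the derived finite MDP.
    value = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            q_values = []
            for a in actions:
                total = sum(counts[(s, a)].values())
                if total == 0:
                    continue  # action never observed in state s
                mean_r = rewards[(s, a)] / total
                expected_next = sum(c / total * value[s2]
                                    for s2, c in counts[(s, a)].items())
                q_values.append(mean_r + gamma * expected_next)
            if q_values:
                value[s] = max(q_values)
    return value

# Tiny hypothetical log of (state, action, reward, next_state) transitions.
log = [("s0", "a0", 0.0, "s1"), ("s1", "a1", 1.0, "s0"), ("s1", "a0", 0.0, "s1")]
print(solve_derived_mdp(log))
```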
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.