SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2012.07949v1
- Date: Mon, 14 Dec 2020 21:33:16 GMT
- Title: SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement
Learning
- Authors: Fabian Ritz, Thomy Phan, Robert M\"uller, Thomas Gabor, Andreas
Sedlmeier, Marc Zeller, Jan Wieghardt, Reiner Schmid, Horst Sauer, Cornel
Klein, Claudia Linnhoff-Popien
- Abstract summary: In industrial scenarios, a system's behavior needs to be predictable and lie within defined ranges.
This paper proposes to explicitly transfer functional and non-functional requirements into shaped rewards.
Experiments are carried out on the smart factory, a multi-agent environment modeling an industrial lot-size-one production facility.
- Score: 10.82169171060299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A characteristic of reinforcement learning is the ability to develop
unforeseen strategies when solving problems. While such strategies sometimes
yield superior performance, they may also result in undesired or even dangerous
behavior. In industrial scenarios, a system's behavior also needs to be
predictable and lie within defined ranges. To enable the agents to learn (how)
to align with a given specification, this paper proposes to explicitly transfer
functional and non-functional requirements into shaped rewards. Experiments are
carried out on the smart factory, a multi-agent environment modeling an
industrial lot-size-one production facility, with up to eight agents and
different multi-agent reinforcement learning algorithms. Results indicate that
compliance with functional and non-functional constraints can be achieved by
the proposed approach.
Related papers
- APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents [8.479128275067742]
We present an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex structures in Minecraft.
By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints.
Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process.
arXiv Detail & Related papers (2024-11-26T09:31:28Z) - From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Guiding Multi-agent Multi-task Reinforcement Learning by a Hierarchical Framework with Logical Reward Shaping [16.5526277899717]
This study aims to design a multi-agent cooperative algorithm with logic reward shaping.
Experiments have been conducted on various types of tasks in the Minecraft-like environment.
arXiv Detail & Related papers (2024-11-02T09:03:23Z) - Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities.
When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
arXiv Detail & Related papers (2024-03-29T03:48:12Z) - Building Minimal and Reusable Causal State Abstractions for
Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z) - Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic
Specifications [22.407388715224283]
We propose a novel STL-guided multi-agent reinforcement learning framework.
The STL requirements are designed to include both task specifications according to the objective of each agent and safety specifications, and the values of the STL specifications are leveraged to generate rewards.
arXiv Detail & Related papers (2023-06-11T23:53:29Z) - Latent Policies for Adversarial Imitation Learning [21.105328282702885]
This paper considers learning robot locomotion and manipulation tasks from expert demonstrations.
Generative adversarial imitation learning (GAIL) trains a discriminator that distinguishes expert from agent transitions, and in turn use a reward defined by the discriminator output to optimize a policy generator for the agent.
A key insight of this work is that performing imitation learning in a suitable latent task space makes the training process stable, even in challenging high-dimensional problems.
arXiv Detail & Related papers (2022-06-22T18:06:26Z) - What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents are agents that employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm"
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z) - UPDeT: Universal Multi-agent Reinforcement Learning via Policy
Decoupling with Transformers [108.92194081987967]
We make the first attempt to explore a universal multi-agent reinforcement learning pipeline, designing one single architecture to fit tasks.
Unlike previous RNN-based models, we utilize a transformer-based model to generate a flexible policy.
The proposed model, named as Universal Policy Decoupling Transformer (UPDeT), further relaxes the action restriction and makes the multi-agent task's decision process more explainable.
arXiv Detail & Related papers (2021-01-20T07:24:24Z) - Randomized Entity-wise Factorization for Multi-Agent Reinforcement
Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?''
arXiv Detail & Related papers (2020-06-07T18:28:41Z) - Individual specialization in multi-task environments with multiagent
reinforcement learners [0.0]
There is a growing interest in Multi-Agent Reinforcement Learning (MARL) as the first steps towards building general intelligent agents.
Previous results point us towards increased conditions for coordination, efficiency/fairness, and common-pool resource sharing.
We further study coordination in multi-task environments where several rewarding tasks can be performed and thus agents don't necessarily need to perform well in all tasks, but under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.