From STL Rulebooks to Rewards
- URL: http://arxiv.org/abs/2110.02792v1
- Date: Wed, 6 Oct 2021 14:16:59 GMT
- Title: From STL Rulebooks to Rewards
- Authors: Edgar A. Aguilar, Luigi Berducci, Axel Brunnbauer, Radu Grosu, Dejan
Ničković
- Abstract summary: We propose a principled approach to shaping rewards for reinforcement learning from multiple objectives.
We first equip STL with a novel quantitative semantics that allows individual requirements to be evaluated automatically.
We then develop a method for systematically combining evaluations of multiple requirements into a single reward.
- Score: 4.859570041295978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The automatic synthesis of neural-network controllers for autonomous agents
through reinforcement learning has to simultaneously optimize many, possibly
conflicting, objectives of various importance. This multi-objective
optimization task is reflected in the shape of the reward function, which is
most often the result of an ad-hoc, craft-like process.
In this paper we propose a principled approach to shaping rewards for
reinforcement learning from multiple objectives that are given as a
partially-ordered set of signal-temporal-logic (STL) rules. To this end, we
first equip STL with a novel quantitative semantics that allows individual
requirements to be evaluated automatically. We then develop a method for systematically
combining evaluations of multiple requirements into a single reward that takes
into account the priorities defined by the partial order. We finally evaluate
our approach on several case studies, demonstrating its practical
applicability.
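The abstract's two-step recipe, first scoring each STL requirement quantitatively, then folding the scores into a single reward that respects rule priorities, can be sketched in miniature. This is an illustrative toy only, not the paper's actual semantics or combination rule: the two predicates, the per-level averaging, and the geometric priority weights are all assumptions made for the sketch.

```python
# Toy quantitative (robustness-style) scores for two common STL-like
# requirements over a sampled signal: positive means satisfied with that
# margin, negative means violated.

def always_below(signal, threshold):
    """Score for 'always signal <= threshold': the worst-case margin."""
    return min(threshold - s for s in signal)

def eventually_near(signal, target, tol):
    """Score for 'eventually |signal - target| <= tol': the best-case margin."""
    return max(tol - abs(s - target) for s in signal)

def shaped_reward(scores_by_priority):
    """Fold per-rule scores into one scalar reward.

    scores_by_priority: list of score lists, index 0 = highest-priority rules.
    Scores are assumed normalized to [-1, 1]. Each priority level is
    averaged, and levels are combined with geometrically shrinking weights
    so lower-priority rules act mainly as tie-breakers -- an approximation
    of a lexicographic order, not the paper's exact construction.
    """
    reward, weight = 0.0, 1.0
    for level in scores_by_priority:
        reward += weight * sum(level) / len(level)
        weight /= 4.0  # each lower level contributes strictly less
    return reward

# A high-priority safety rule outweighs a low-priority comfort rule:
safe_but_uncomfortable = shaped_reward([[0.8], [-0.5]])  # 0.8 - 0.125
comfortable_but_unsafe = shaped_reward([[-0.2], [1.0]])  # -0.2 + 0.25
assert safe_but_uncomfortable > comfortable_but_unsafe
```

The shrinking-weight trick is one simple way to encode a partial order of rules as a scalar; the paper develops a principled combination, whereas this sketch only conveys the intuition that higher-priority requirements must dominate the reward.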
Related papers
- MORL-Prompt: An Empirical Analysis of Multi-Objective Reinforcement
Learning for Discrete Prompt Optimization [49.60729578316884]
RL-based techniques can be used to search for prompts that maximize a set of user-specified reward functions.
Current techniques focus on maximizing the average of reward functions, which does not necessarily lead to prompts that achieve balance across rewards.
In this paper, we adapt several techniques for multi-objective optimization to RL-based discrete prompt optimization.
arXiv Detail & Related papers (2024-02-18T21:25:09Z)
- Generalizing LTL Instructions via Future Dependent Options [7.8578244861940725]
This paper proposes a novel multi-task algorithm with improved learning efficiency and optimality.
In order to propagate the rewards of satisfying future subgoals back more efficiently, we propose to train a multi-step function conditioned on the subgoal sequence.
In experiments on three different domains, we evaluate the generalization capability of the agent trained by the proposed algorithm.
arXiv Detail & Related papers (2022-12-08T21:44:18Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelope value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- A Distributional View on Multi-Objective Policy Optimization [24.690800846837273]
We propose an algorithm for multi-objective reinforcement learning that enables setting desired preferences for objectives in a scale-invariant way.
We show that setting different preferences in our framework allows us to trace out the space of nondominated solutions.
arXiv Detail & Related papers (2020-05-15T13:02:17Z)
- A Unified Object Motion and Affinity Model for Online Multi-Object Tracking [127.5229859255719]
We propose a novel MOT framework, named UMA, that unifies the object motion and affinity models in a single network.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.