Informed Reinforcement Learning for Situation-Aware Traffic Rule Exceptions
- URL: http://arxiv.org/abs/2402.04168v2
- Date: Wed, 12 Jun 2024 11:34:30 GMT
- Title: Informed Reinforcement Learning for Situation-Aware Traffic Rule Exceptions
- Authors: Daniel Bogdoll, Jing Qin, Moritz Nekolla, Ahmed Abouelazm, Tim Joseph, J. Marius Zöllner
- Abstract summary: In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is integrated as a knowledge source.
We learn trajectories and assess them with a situation-aware reward design, leading to a dynamic reward that allows the agent to learn situations that require controlled traffic rule exceptions.
- Score: 22.305075467333673
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement Learning is a highly active research field with promising advancements. In the field of autonomous driving, however, often only very simple scenarios are examined. Common approaches use non-interpretable control commands as the action space and unstructured reward designs. In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is integrated as a knowledge source. We learn trajectories and assess them with a situation-aware reward design, leading to a dynamic reward that allows the agent to learn situations that require controlled traffic rule exceptions. Our method is applicable to arbitrary RL models. We successfully demonstrate high completion rates of complex scenarios with recent model-based agents.
Related papers
- Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL).
This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z)
- Efficient Open-world Reinforcement Learning via Knowledge Distillation and Autonomous Rule Discovery [5.680463564655267]
A rule-driven deep Q-learning agent (RDQ) is proposed as one possible implementation of the framework.
We show that RDQ successfully extracts task-specific rules as it interacts with the world.
In experiments, we show that the RDQ agent is significantly more resilient to novelties than the baseline agents.
arXiv Detail & Related papers (2023-11-24T04:12:50Z)
- Improving Long-Horizon Imitation Through Instruction Prediction [93.47416552953075]
In this work, we explore the use of an often unused source of auxiliary supervision: language.
Inspired by recent advances in transformer-based models, we train agents with an instruction prediction loss that encourages learning temporally extended representations that operate at a high level of abstraction.
In further analysis we find that instruction modeling is most important for tasks that require complex reasoning, while understandably offering smaller gains in environments that require simple plans.
arXiv Detail & Related papers (2023-06-21T20:47:23Z)
- Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios [9.761912672523977]
We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term to the reward model, to encourage exploration to areas of state-space different from areas privileged by the optimal policy.
We show that we are able to learn useful policies that would have otherwise been missed out on during training, and unavailable to use when executing the control algorithm.
arXiv Detail & Related papers (2022-04-11T15:34:49Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- On the Theory of Reinforcement Learning with Once-per-Episode Feedback [120.5537226120512]
We introduce a theory of reinforcement learning in which the learner receives feedback only once at the end of an episode.
This is arguably more representative of real-world applications than the traditional requirement that the learner receive feedback at every time step.
arXiv Detail & Related papers (2021-05-29T19:48:51Z)
- Affordance-based Reinforcement Learning for Urban Driving [3.507764811554557]
We propose a deep reinforcement learning framework to learn an optimal control policy using waypoints and low-dimensional visual representations.
We demonstrate that our agents, when trained from scratch, learn the tasks of lane-following, driving around intersections, and stopping in front of other actors or traffic lights, even in dense traffic settings.
arXiv Detail & Related papers (2021-01-15T05:21:25Z)
- Semi-supervised reward learning for offline reinforcement learning [71.6909757718301]
Training agents usually requires reward functions, but rewards are seldom available in practice and their engineering is challenging and laborious.
We propose semi-supervised learning algorithms that learn from limited annotations and incorporate unlabelled data.
In our experiments with a simulated robotic arm, we greatly improve upon behavioural cloning and closely approach the performance achieved with ground truth rewards.
arXiv Detail & Related papers (2020-12-12T20:06:15Z)
- NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control [15.720231070808696]
In this paper, we propose a novel hierarchical reinforcement learning framework without explicit action.
Our meta policy tries to manipulate the next optimal state, and the actual action is produced by the inverse dynamics model.
Under our framework, widely available state-only demonstrations can be exploited effectively for imitation learning.
arXiv Detail & Related papers (2020-11-02T15:28:19Z)
- Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning [32.97789225998642]
We propose an interpretable deep reinforcement learning method for end-to-end autonomous driving.
A sequential latent environment model is introduced and learned jointly with the reinforcement learning process.
Our method is able to provide a better explanation of how the car reasons about the driving environment.
arXiv Detail & Related papers (2020-01-23T18:36:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.