Imagination-Augmented Hierarchical Reinforcement Learning for Safe and
Interactive Autonomous Driving in Urban Environments
- URL: http://arxiv.org/abs/2311.10309v2
- Date: Tue, 23 Jan 2024 06:03:10 GMT
- Title: Imagination-Augmented Hierarchical Reinforcement Learning for Safe and
Interactive Autonomous Driving in Urban Environments
- Authors: Sang-Hyun Lee, Yoonjae Jung, Seung-Woo Seo
- Abstract summary: Hierarchical reinforcement learning (HRL) incorporates temporal abstraction into reinforcement learning (RL).
We propose imagination-augmented HRL (IAHRL) that efficiently integrates imagination into HRL.
IAHRL enables an agent to perform safe and interactive behaviors, achieving higher success rates and lower average episode steps than baselines.
- Score: 21.30432408940134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical reinforcement learning (HRL) incorporates temporal abstraction
into reinforcement learning (RL) by explicitly taking advantage of hierarchical
structure. A modern HRL agent is typically composed of a high-level policy and
several low-level policies. The high-level policy selects, at a lower frequency,
which low-level policy to activate, and the activated low-level policy selects
an action at each time step. Recent HRL algorithms have achieved
performance gains over standard RL algorithms in synthetic navigation tasks.
However, these HRL algorithms cannot be directly applied to real-world navigation tasks.
One of the main challenges is that real-world navigation tasks require an agent
to perform safe and interactive behaviors in dynamic environments. In this
paper, we propose imagination-augmented HRL (IAHRL) that efficiently integrates
imagination into HRL to enable an agent to learn safe and interactive behaviors
in real-world navigation tasks. Here, imagination means predicting the
consequences of actions without interacting with the actual environment. The key idea behind
IAHRL is that the low-level policies imagine safe and structured behaviors, and
then the high-level policy infers interactions with surrounding objects by
interpreting the imagined behaviors. We also introduce a new attention
mechanism that allows our high-level policy to be permutation-invariant to the
order of surrounding objects and to prioritize our agent over them. To evaluate
IAHRL, we introduce five complex urban driving tasks, which are among the most
challenging real-world navigation tasks. The experimental results indicate that
IAHRL enables an agent to perform safe and interactive behaviors, achieving
higher success rates and lower average episode steps than baselines.
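To make the structure described in the abstract concrete, below is a minimal sketch of such an imagination-augmented hierarchical loop: each low-level policy imagines a short trajectory, and the high-level policy scores the imagined behaviors with attention over surrounding objects before dispatching the chosen option for several primitive steps. All names, dimensions, and the toy dynamics are illustrative assumptions, not the authors' implementation.

```python
# A minimal, illustrative sketch of the loop described in the abstract above,
# NOT the authors' implementation: every class name, dimension, and the toy
# dynamics below are assumptions made purely to show the control structure.
import numpy as np


def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()


class LowLevelPolicy:
    """One driving behavior (e.g., keep-lane, yield, merge) that can imagine
    its own short rollout without touching the real environment."""

    def __init__(self, name, rng):
        self.name = name
        self.rng = rng

    def imagine(self, ego_state, horizon=10):
        # Stand-in for a learned model: predict a short ego trajectory.
        traj = [ego_state]
        for _ in range(horizon):
            traj.append(traj[-1] + 0.1 * self.rng.standard_normal(ego_state.shape))
        return np.stack(traj)

    def act(self, ego_state):
        # One primitive action per environment step while this option is active.
        return np.tanh(ego_state[:2])  # e.g., [steer, throttle]


class HighLevelPolicy:
    """Scores imagined behaviors with single-query attention over the
    surrounding objects. Using the ego embedding as the only query is one
    simple way to be permutation-invariant to the order of surrounding objects
    while keeping the ego agent prioritized; it is an assumption about how the
    paper's attention mechanism might look, not its exact form."""

    def __init__(self, dim, rng):
        scale = 1.0 / np.sqrt(dim)
        self.W_q = rng.standard_normal((dim, dim)) * scale
        self.W_k = rng.standard_normal((dim, dim)) * scale
        self.W_v = rng.standard_normal((dim, dim)) * scale
        self.w_score = rng.standard_normal(dim) * scale

    def select(self, imagined_trajs, surrounding_objects):
        scores = []
        for traj in imagined_trajs:
            q = traj.mean(axis=0) @ self.W_q          # ego query from the imagined behavior
            K = surrounding_objects @ self.W_k        # one key/value per surrounding object
            V = surrounding_objects @ self.W_v
            attn = softmax(K @ q / np.sqrt(q.size))   # order of objects does not matter
            context = attn @ V                        # summary of inferred interactions
            scores.append(self.w_score @ (q + context))
        return int(np.argmax(scores))


rng = np.random.default_rng(0)
dim = 8
options = [LowLevelPolicy(n, rng) for n in ("keep_lane", "yield", "merge")]
high_level = HighLevelPolicy(dim, rng)

ego = rng.standard_normal(dim)
objects = rng.standard_normal((5, dim))          # surrounding vehicles / pedestrians
for _ in range(3):                               # high-level decisions at a lower frequency
    trajs = [opt.imagine(ego) for opt in options]
    k = high_level.select(trajs, objects)
    for _ in range(10):                          # low-level actions at every time step
        action = options[k].act(ego)
        ego = ego + 0.01 * rng.standard_normal(dim)  # stand-in for one real environment step
```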
Related papers
- Meta-Learning Integration in Hierarchical Reinforcement Learning for Advanced Task Complexity [0.0]
Hierarchical Reinforcement Learning (HRL) effectively tackles complex tasks by decomposing them into structured policies.
We integrate meta-learning into HRL to enhance the agent's ability to learn and adapt hierarchical policies swiftly.
arXiv Detail & Related papers (2024-10-10T13:47:37Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic, which only estimates constraint-free returns (a toy illustration of this composition appears after this list).
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- SHIRO: Soft Hierarchical Reinforcement Learning [0.0]
We present an Off-Policy HRL algorithm that maximizes entropy for efficient exploration.
The algorithm learns a temporally abstracted low-level policy and is able to explore broadly through the addition of entropy to the high-level policy.
Our method, SHIRO, surpasses state-of-the-art performance on a range of simulated robotic control benchmark tasks.
arXiv Detail & Related papers (2022-12-24T17:21:58Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- General policy mapping: online continual reinforcement learning inspired on the insect brain [3.8937756915387505]
We have developed a model for online continual or lifelong reinforcement learning inspired by the insect brain.
Our model leverages offline training of a feature-extraction layer and a common general policy layer to enable RL algorithms to converge in online settings.
arXiv Detail & Related papers (2022-11-30T05:54:19Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method to solve this benchmark, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Room Clearance with Feudal Hierarchical Reinforcement Learning [2.867517731896504]
We introduce a new simulation environment, "it", designed as a tool to build scenarios that can drive RL research in a direction useful for military analysis.
We focus on an abstracted and simplified room clearance scenario, where a team of blue agents has to make its way through a building and ensure that all rooms are cleared of enemy red agents.
We implement a multi-agent version of feudal hierarchical RL that introduces a command hierarchy where a commander at the higher level sends orders to multiple agents at the lower level who simply have to learn to follow these orders.
We find that breaking the task down in this way allows us to ...
arXiv Detail & Related papers (2021-05-24T15:05:58Z)
- A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments [1.713291434132985]
Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing and robotics.
Real-world complications of many tasks arising in these domains make them difficult to solve with the basic assumptions underlying classical RL algorithms.
This paper provides a survey of RL methods developed for handling dynamically varying environment models.
A representative collection of these algorithms is discussed in detail in this work along with their categorization and their relative merits and demerits.
arXiv Detail & Related papers (2020-05-19T09:42:42Z)
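As referenced in the Lattice entry above, here is a minimal sketch of injecting temporally correlated noise into a sequence of policy latents. The AR(1) form, coefficients, and function name are assumptions for illustration, not the paper's method.

```python
# Hypothetical illustration only: Lattice's actual noise model may differ.
import numpy as np


def perturbed_latents(latents, rho=0.9, sigma=0.1, seed=0):
    """Add noise that is correlated across time steps to a sequence of latent vectors."""
    rng = np.random.default_rng(seed)
    noise = np.zeros_like(latents[0])
    out = []
    for z in latents:
        noise = rho * noise + sigma * rng.standard_normal(z.shape)  # AR(1): correlated in time
        out.append(z + noise)
    return out


# Example: 100 steps of 4-dimensional latents, perturbed consistently over time.
noisy = perturbed_latents([np.zeros(4) for _ in range(100)])
```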
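And, as referenced in the multiplicative value function entry above, a toy illustration of composing a safety critic with a reward critic. The callables and names below are stand-ins; in the paper both critics are learned, not fixed functions.

```python
# Toy illustration only, not the paper's architecture.
def multiplicative_value(state, safety_critic, reward_critic):
    p_violation = safety_critic(state)     # predicted probability of constraint violation
    v_reward = reward_critic(state)        # return estimated as if no constraints existed
    return (1.0 - p_violation) * v_reward  # the safety critic discounts the reward critic


# Dummy critics: a 20% violation probability discounts a constraint-free return of 10.
print(multiplicative_value(0.0, lambda s: 0.2, lambda s: 10.0))  # -> 8.0
```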