Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning
- URL: http://arxiv.org/abs/2203.10823v1
- Date: Mon, 21 Mar 2022 09:16:56 GMT
- Title: Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning
- Authors: Marc R. Schlichting, Stefan Notter, and Walter Fichter
- Abstract summary: Reinforcement learning is used to train a policy network that accommodates desirable path planning behaviors.
A Long Short-Term Memory module is proposed to encode an unspecified number of states for a varying, indefinite number of agents.
The proposed approach is validated by presenting flight test results of up to four drones, autonomously navigating collision-free in a real-world environment.
- Score: 0.34410212782758043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning-based path planning for multi-agent systems of varying
size constitutes a research topic with increasing significance as progress in
domains such as urban air mobility and autonomous aerial vehicles continues.
Reinforcement learning with continuous state and action spaces is used to train
a policy network that accommodates desirable path planning behaviors and can be
used for time-critical applications. A Long Short-Term Memory module is
proposed to encode an unspecified number of states for a varying, indefinite
number of agents. The described training strategies and policy architecture
lead to a guidance that scales to an infinite number of agents and unlimited
physical dimensions, although training takes place at a smaller scale. The
guidance is implemented on a low-cost, off-the-shelf onboard computer. The
feasibility of the proposed approach is validated by presenting flight test
results of up to four drones, autonomously navigating collision-free in a
real-world environment.
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill
Learning [68.16998247593209]
offline reinforcement learning (RL) paradigm provides recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - AI planning in the imagination: High-level planning on learned abstract
search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Goal-Conditioned Reinforcement Learning with Disentanglement-based
Reachability Planning [14.370384505230597]
We propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks.
Our REPlan significantly outperforms the prior state-of-the-art methods in solving temporally extended tasks.
arXiv Detail & Related papers (2023-07-20T13:08:14Z) - Planning Immediate Landmarks of Targets for Model-Free Skill Transfer
across Agents [34.56191646231944]
We propose PILoT, i.e., Planning Immediate Landmarks of Targets.
PILoT learns a goal-conditioned state planner and distills a goal-planner to plan immediate landmarks in a model-free style.
We show the power of PILoT on various transferring challenges, including few-shot transferring across action spaces and dynamics.
arXiv Detail & Related papers (2022-12-18T08:03:21Z) - Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object
Transport [83.06265788137443]
We address key challenges in long-horizon embodied exploration and navigation by proposing a new object transport task and a novel modular framework for temporally extended navigation.
Our first contribution is the design of a novel Long-HOT environment focused on deep exploration and long-horizon planning.
We propose a modular hierarchical transport policy (HTP) that builds a topological graph of the scene to perform exploration with the help of weighted frontiers.
arXiv Detail & Related papers (2022-10-28T05:30:49Z) - Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in
Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - Deep Interactive Motion Prediction and Planning: Playing Games with
Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z) - An End-to-end Deep Reinforcement Learning Approach for the Long-term
Short-term Planning on the Frenet Space [0.0]
This paper presents a novel end-to-end continuous deep reinforcement learning approach towards autonomous cars' decision-making and motion planning.
For the first time, we define both states and action spaces on the Frenet space to make the driving behavior less variant to the road curvatures.
The algorithm generates continuoustemporal trajectories on the Frenet frame for the feedback controller to track.
arXiv Detail & Related papers (2020-11-26T02:40:07Z) - UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement
Learning Approach [18.266087952180733]
We propose a new end-to-end reinforcement learning approach to UAV-enabled data collection from Internet of Things (IoT) devices.
An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance.
We show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters.
arXiv Detail & Related papers (2020-07-01T15:14:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.