Related papers: Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning

URL: http://arxiv.org/abs/2203.10823v1
Date: Mon, 21 Mar 2022 09:16:56 GMT
Title: Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning
Authors: Marc R. Schlichting, Stefan Notter, and Walter Fichter
Abstract summary: Reinforcement learning is used to train a policy network that accommodates desirable path planning behaviors. A Long Short-Term Memory module is proposed to encode an unspecified number of states for a varying, indefinite number of agents. The proposed approach is validated by presenting flight test results of up to four drones, autonomously navigating collision-free in a real-world environment.
Score: 0.34410212782758043
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement learning-based path planning for multi-agent systems of varying size constitutes a research topic with increasing significance as progress in domains such as urban air mobility and autonomous aerial vehicles continues. Reinforcement learning with continuous state and action spaces is used to train a policy network that accommodates desirable path planning behaviors and can be used for time-critical applications. A Long Short-Term Memory module is proposed to encode an unspecified number of states for a varying, indefinite number of agents. The described training strategies and policy architecture lead to a guidance that scales to an infinite number of agents and unlimited physical dimensions, although training takes place at a smaller scale. The guidance is implemented on a low-cost, off-the-shelf onboard computer. The feasibility of the proposed approach is validated by presenting flight test results of up to four drones, autonomously navigating collision-free in a real-world environment.
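The encoding idea in the abstract, folding the states of however many agents are present into one fixed-size vector, can be illustrated with a small sketch. This is not the authors' network: the class name `TinyLSTMEncoder`, the four-element state layout, the hidden size, and the random untrained weights are all assumptions made for illustration, and the abstract does not say in what order agents are fed to the module.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMEncoder:
    """Hypothetical minimal LSTM cell, written with plain lists.

    It consumes one agent state per step and returns the final hidden
    state, so the output size does not depend on the number of agents.
    Weights are random and untrained (illustration only).
    """

    def __init__(self, state_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        in_dim = state_dim + hidden_dim
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell value.
        self.W = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
            for gate in "ifog"
        }
        self.b = {gate: [0.0] * hidden_dim for gate in "ifog"}

    def _gate(self, gate, x, activation):
        return [
            activation(sum(w * v for w, v in zip(row, x)) + bias)
            for row, bias in zip(self.W[gate], self.b[gate])
        ]

    def encode(self, agent_states):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for state in agent_states:
            x = list(state) + h  # concatenate input and previous hidden state
            i = self._gate("i", x, sigmoid)
            f = self._gate("f", x, sigmoid)
            o = self._gate("o", x, sigmoid)
            g = self._gate("g", x, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


# Assumed state layout: (relative x, relative y, relative vx, relative vy).
encoder = TinyLSTMEncoder(state_dim=4, hidden_dim=8)
two_agents = encoder.encode([(1.0, 0.0, 0.1, 0.0), (-2.0, 3.0, 0.0, -0.1)])
five_agents = encoder.encode([(float(k), 1.0, 0.0, 0.0) for k in range(5)])
print(len(two_agents), len(five_agents))  # both encodings have length 8
```

In a setup like this, a downstream policy network would take the fixed-size encoding as part of its input, which is what lets the same policy run with two, five, or more neighboring agents without changing its input dimension.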

Related papers

Variable Time-Step MPC for Agile Multi-Rotor UAV Interception of Dynamic Targets [6.0967385124149756]
Agile planning using existing non-linear model predictive control methods is limited in the number of planning steps, as the computation becomes increasingly demanding. In this paper, we propose to address these limitations by introducing variable time steps and coupling them with the prediction horizon length. A simplified point-mass motion primitive is used to leverage the differential flatness of quadrotor dynamics and to generate feasible trajectories in the flat output space.
arXiv Detail & Related papers (2025-03-18T11:59:24Z)
Safe Multi-Agent Navigation guided by Goal-Conditioned Safe Reinforcement Learning [2.082168997977094]
We introduce a novel method that integrates the strengths of both planning and safe RL. Our method prunes unsafe edges and generates a waypoint-based plan that the agent follows until reaching its goal. In particular, we leverage Conflict-Based Search (CBS) to create waypoint-based plans for multiple agents allowing for their safe navigation over extended horizons.
arXiv Detail & Related papers (2025-02-25T03:38:52Z)
SCoTT: Wireless-Aware Path Planning with Vision Language Models and Strategic Chains-of-Thought [78.53885607559958]
A novel approach using vision language models (VLMs) is proposed for enabling path planning in complex wireless-aware environments. To this end, insights from a digital twin with real-world wireless ray tracing data are explored. Results show that SCoTT achieves very close average path gains compared to DP-WA* while at the same time yielding consistently shorter path lengths.
arXiv Detail & Related papers (2024-11-27T10:45:49Z)
Multi-agent Path Finding for Timed Tasks using Evolutionary Games [1.3023548510259344]
We show that our algorithm is faster than deep RL methods by at least an order of magnitude. Our results indicate that it scales better with an increase in the number of agents as compared to other methods.
arXiv Detail & Related papers (2024-11-15T20:10:25Z)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents. Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data. In this paper, we propose an adaptive scheme for action quantization. We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training. We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning [14.370384505230597]
We propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks. Our REPlan significantly outperforms the prior state-of-the-art methods in solving temporally extended tasks.
arXiv Detail & Related papers (2023-07-20T13:08:14Z)
Planning Immediate Landmarks of Targets for Model-Free Skill Transfer across Agents [34.56191646231944]
We propose PILoT, i.e., Planning Immediate Landmarks of Targets. PILoT learns a goal-conditioned state planner and distills a goal-planner to plan immediate landmarks in a model-free style. We show the power of PILoT on various transferring challenges, including few-shot transferring across action spaces and dynamics.
arXiv Detail & Related papers (2022-12-18T08:03:21Z)
Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object Transport [83.06265788137443]
We address key challenges in long-horizon embodied exploration and navigation by proposing a new object transport task and a novel modular framework for temporally extended navigation. Our first contribution is the design of a novel Long-HOT environment focused on deep exploration and long-horizon planning. We propose a modular hierarchical transport policy (HTP) that builds a topological graph of the scene to perform exploration with the help of weighted frontiers.
arXiv Detail & Related papers (2022-10-28T05:30:49Z)
Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments. To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command. We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model. Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z)
UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach [18.266087952180733]
We propose a new end-to-end reinforcement learning approach to UAV-enabled data collection from Internet of Things (IoT) devices. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. We show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters.
arXiv Detail & Related papers (2020-07-01T15:14:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.