Related papers: Multi-Agent Path Planning Using Deep Reinforcement Learning

Multi-Agent Path Planning Using Deep Reinforcement Learning

URL: http://arxiv.org/abs/2110.01460v1
Date: Mon, 4 Oct 2021 13:56:23 GMT
Title: Multi-Agent Path Planning Using Deep Reinforcement Learning
Authors: Mert \c{C}etinkaya
Abstract summary: In this paper a deep reinforcement based multi-agent path planning approach is introduced. The experiments are realized in a simulation environment and in this environment different multi-agent path planning problems are produced. The produced problems are actually similar to a vehicle routing problem and they are solved using multi-agent deep reinforcement learning.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In this paper a deep reinforcement based multi-agent path planning approach is introduced. The experiments are realized in a simulation environment and in this environment different multi-agent path planning problems are produced. The produced problems are actually similar to a vehicle routing problem and they are solved using multi-agent deep reinforcement learning. In the simulation environment, the model is trained on different consecutive problems in this way and, as the time passes, it is observed that the model's performance to solve a problem increases. Always the same simulation environment is used and only the location of target points for the agents to visit is changed. This contributes the model to learn its environment and the right attitude against a problem as the episodes pass. At the end, a model who has already learned a lot to solve a path planning or routing problem in this environment is obtained and this model can already find a nice and instant solution to a given unseen problem even without any training. In routing problems, standard mathematical modeling or heuristics seem to suffer from high computational time to find the solution and it is also difficult and critical to find an instant solution. In this paper a new solution method against these points is proposed and its efficiency is proven experimentally.

Related papers

Efficient Imitation Learning with Conservative World Models [54.52140201148341]
We tackle the problem of policy learning from expert demonstrations without a reward function. We re-frame imitation learning as a fine-tuning problem, rather than a pure reinforcement learning one.
arXiv Detail & Related papers (2024-05-21T20:53:18Z)
Solving the Team Orienteering Problem with Transformers [46.93254771681026]
Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, or transportation. This paper presents a multi-agent route planning system capable of solving the Team Orienteering Problem in a very fast and accurate manner.
arXiv Detail & Related papers (2023-11-30T16:10:35Z)
Parallel Automatic History Matching Algorithm Using Reinforcement Learning [0.0]
Reformulating the history matching problem into a Markov Decision Process introduces a method in which reinforcement learning can be utilized to solve the problem. This method provides a mechanism where an artificial deep neural network agent can interact with the reservoir simulator and find multiple different solutions to the problem.
arXiv Detail & Related papers (2022-11-14T15:09:39Z)
Gradient Optimization for Single-State RMDPs [0.0]
Modern problems such as autonomous driving, control of robotic components, and medical diagnostics have become increasingly difficult to solve analytically. Data-driven solutions are a strong option where there are problems with more dimensions of complexity than can be understood by people. Unfortunately, data-driven models often come with uncertainty in how they will perform in the worst of scenarios. In fields such as autonomous driving and medicine, the consequences of these failures could be catastrophic.
arXiv Detail & Related papers (2022-09-25T18:50:02Z)
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem [18.735056206844202]
We show that we can construct a model capable of learning without supervision and inferences significantly faster than current autoregressive approaches. We also empirically evaluate the benefits of including optimal transport algorithms within deep learning models to enforce assignment constraints during end-to-end training.
arXiv Detail & Related papers (2022-03-02T07:21:56Z)
Minimizing Entropy to Discover Good Solutions to Recurrent Mixed Integer Programs [0.0]
Current solvers for mixed-integer programming (MIP) problems are designed to perform well on a wide range of problems. Recent works have shown that machine learning (ML) can be integrated with an MIP solver to inject domain knowledge and efficiently close the optimality gap. This paper proposes an online solver that uses the notion of entropy to efficiently build a model with minimal training data and tuning.
arXiv Detail & Related papers (2022-02-07T18:52:56Z)
Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments [62.997667081978825]
We consider the problem of multi-agent navigation in partially observable grid environments. We suggest utilizing the reinforcement learning approach when the agents, first, learn the policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity. Our method leverages latent variable models to learn a representation of the environment from current and past experiences. We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using it cyclical annealing schedule and it maximum mean discrepancy (MMD) criterion. The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z)
GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction [69.94831587339539]
We present a modified Cross-Entropy Method (CEM) that uses a masked auto-regressive neural network for modeling uniform distributions over the solution space. Our algorithm is able to express complicated solution spaces, thus allowing it to track a variety of different solution regions.
arXiv Detail & Related papers (2020-02-17T20:21:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.