Related papers: Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning

URL: http://arxiv.org/abs/2303.01963v1
Date: Thu, 2 Mar 2023 15:15:56 GMT
Title: Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning
Authors: Dong Ho Lee and Jaemyung Ahn
Abstract summary: We study a mission re-planning problem where vehicles are initially located away from the depot and have different amounts of fuel. We develop a policy network with self-attention on each partial tour and encoder-decoder attention between the partial tour and the remaining nodes. We propose a modified REINFORCE algorithm where the greedy rollout baseline is replaced by a local mini-batch baseline based on multiple, possibly non-duplicate sample rollouts.
Score: 9.877261093287304
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we study the Multi-Start Team Orienteering Problem (MSTOP), a mission re-planning problem where vehicles are initially located away from the depot and have different amounts of fuel. We assume that the goal of the vehicles is to maximize the sum of collected profits under resource (e.g., time, fuel) consumption constraints. Such re-planning problems occur in a wide range of intelligent UAS applications where changes in the mission environment force the operation of multiple vehicles to change from the original plan. To solve this problem with deep reinforcement learning (RL), we develop a policy network with self-attention on each partial tour and encoder-decoder attention between the partial tour and the remaining nodes. We propose a modified REINFORCE algorithm where the greedy rollout baseline is replaced by a local mini-batch baseline based on multiple, possibly non-duplicate sample rollouts. By drawing multiple samples per training instance, we can learn faster and obtain a stable policy gradient estimator with significantly fewer instances. The proposed training algorithm outperforms the conventional greedy rollout baseline, even when combined with the maximum entropy objective.
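Illustration: the abstract describes replacing the greedy rollout baseline in REINFORCE with a baseline computed from several sampled rollouts of the same training instance. The sketch below shows one common way such a baseline can be formed, by subtracting the per-instance mean reward of the samples. This is an assumption made for illustration, not the authors' exact formulation: the function names, the use of a plain mean, and the toy numbers are all hypothetical, and the handling of duplicate samples mentioned in the abstract is not modeled.

```python
from statistics import mean


def local_baseline_advantages(rewards):
    """Subtract a per-instance baseline from each sampled rollout's reward.

    rewards[i][k] is the total collected profit of the k-th sampled rollout
    on training instance i. The baseline for instance i is assumed here to be
    the mean reward over that instance's own samples.
    """
    advantages = []
    for instance_rewards in rewards:
        baseline = mean(instance_rewards)
        advantages.append([r - baseline for r in instance_rewards])
    return advantages


def reinforce_surrogate_loss(rewards, log_probs):
    """Average of -advantage * log-probability over all sampled rollouts.

    log_probs[i][k] is the log-probability the policy assigned to rollout k
    of instance i. Minimizing this surrogate follows the REINFORCE gradient.
    """
    advantages = local_baseline_advantages(rewards)
    terms = [
        -a * lp
        for adv_row, lp_row in zip(advantages, log_probs)
        for a, lp in zip(adv_row, lp_row)
    ]
    return mean(terms)


# Toy data (hypothetical): 2 instances, 3 sampled rollouts each.
rewards = [[3.0, 5.0, 4.0], [10.0, 10.0, 7.0]]
log_probs = [[-1.0, -2.0, -1.5], [-0.5, -0.5, -2.0]]

adv = local_baseline_advantages(rewards)
loss = reinforce_surrogate_loss(rewards, log_probs)
```

Because each instance's advantages sum to zero, rollouts are only compared against other samples of the same instance, so no separate greedy rollout of a baseline policy is needed in this sketch.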

Related papers

Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms [55.78505925402658]
Vehicle Routing Problems (VRP) are an extension of the Traveling Salesperson Problem and are a fundamental NP-hard challenge in Evolutionary optimization. We introduce a novel optimization framework that uses a reinforcement learning agent - trained on prior instances - to quickly generate initial solutions, which are then further optimized by genetic algorithms. For example, EARLI handles vehicle routing with 500 locations within 1s, 10x faster than current solvers for the same solution quality, enabling applications like real-time and interactive routing.
arXiv Detail & Related papers (2025-04-08T15:21:01Z)
Non-orthogonal Age-Optimal Information Dissemination in Vehicular Networks: A Meta Multi-Objective Reinforcement Learning Approach [0.0]
A roadside unit (RSU) provides timely updates about a set of physical processes to vehicles. The formulated problem is a multi-objective mixed-integer nonlinear programming problem. We develop a hybrid deep Q-network (DQN)-deep deterministic policy gradient (DDPG) model to solve each optimization sub-problem.
arXiv Detail & Related papers (2024-02-15T16:51:47Z)
Solving Complex Multi-UAV Mission Planning Problems using Multi-objective Genetic Algorithms [4.198865250277024]
This paper presents a new Multi-Objective Genetic Algorithm for solving complex Mission Planning Problems (MPP). A hybrid fitness function has been designed using a Constraint Satisfaction Problem (CSP) to check if solutions are valid. Experimental results show that the new algorithm is able to obtain good solutions; however, as the problem becomes more complex, the optimal solutions also become harder to find.
arXiv Detail & Related papers (2024-02-09T16:13:21Z)
TOP-Former: A Multi-Agent Transformer Approach for the Team Orienteering Problem [47.40841984849682]
Route planning for a fleet of vehicles is an important task in applications such as package delivery, surveillance, or transportation. We introduce TOP-Former, a multi-agent route planning neural network designed to efficiently and accurately solve the Team Orienteering Problem.
arXiv Detail & Related papers (2023-11-30T16:10:35Z)
Genetic Algorithms with Neural Cost Predictor for Solving Hierarchical Vehicle Routing Problems [20.684353068460375]
When vehicle routing decisions are intertwined with higher-level decisions, the resulting optimization problems pose significant challenges for computation. We propose a novel deep-learning-based approach called Genetic Algorithm with Neural Cost Predictor (GANCP) to tackle the challenge. In particular, our proposed neural network learns the objective values of the HGS-CVRP open-source package that solves capacitated vehicle routing problems.
arXiv Detail & Related papers (2023-10-22T02:46:37Z)
Energy-Guided Continuous Entropic Barycenter Estimation for General Costs [95.33926437521046]
We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions. Our approach is built upon the dual reformulation of the EOT problem based on weak OT.
arXiv Detail & Related papers (2023-10-02T11:24:36Z)
DC-MRTA: Decentralized Multi-Robot Task Allocation and Navigation in Complex Environments [55.204450019073036]
We present a novel reinforcement-learning-based task allocation and decentralized navigation algorithm for mobile robots in warehouse environments. We consider the problem of joint decentralized task allocation and navigation and present a two-level approach to solve it. We observe an improvement of up to 14% in task completion time and up to 40% in computing collision-free trajectories for the robots.
arXiv Detail & Related papers (2022-09-07T00:35:27Z)
Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure of Costs [65.23158435596518]
We formulate the multi-vehicle routing problem as a team Markov game with partially observable costs. Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z)
Scalable Vehicle Re-Identification via Self-Supervision [66.2562538902156]
Vehicle Re-Identification is one of the key elements in city-scale vehicle analytics systems. Many state-of-the-art solutions for vehicle re-id mostly focus on improving the accuracy on existing re-id benchmarks and often ignore computational complexity. We propose a simple yet effective hybrid solution empowered by self-supervised training which only uses a single network during inference time.
arXiv Detail & Related papers (2022-05-16T12:14:42Z)
Supervised Permutation Invariant Networks for Solving the CVRP with Bounded Fleet Size [3.5235974685889397]
Learning to solve optimization problems, such as the vehicle routing problem, offers great computational advantages. We propose a powerful supervised deep learning framework that constructs a complete tour plan from scratch while respecting an a priori fixed number of vehicles. In combination with an efficient post-processing scheme, our supervised approach is not only much faster and easier to train but also achieves competitive results.
arXiv Detail & Related papers (2022-01-05T10:32:18Z)
Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning [0.0]
Ride-sharing services such as Uber and Lyft offer a service where passengers can order a car to pick them up. Traditional Reinforcement Learning (RL) based methods attempting to solve the ridesharing problem are unable to accurately model the complex environment in which taxis operate. We show that our model performs better than the IDQN baseline on a fixed grid size and is able to generalize well to smaller or larger grid sizes. Our algorithm also outperforms the IDQN baseline in the scenario where the number of passengers and cars varies in each episode.
arXiv Detail & Related papers (2020-06-18T23:37:53Z)
Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings. We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity. We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)
Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication [53.47785498477648]
This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks. We first formulate the user state (VU) problem as a discrete non-vehicle association optimization problem. The proposed solution achieves up to 15% gains in terms of the sum of user complexity and a 20% reduction in VUE compared to several baseline designs.
arXiv Detail & Related papers (2020-01-22T08:51:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.