Where the Action is: Let's make Reinforcement Learning for Stochastic
Dynamic Vehicle Routing Problems work!
- URL: http://arxiv.org/abs/2103.00507v1
- Date: Sun, 28 Feb 2021 13:26:35 GMT
- Title: Where the Action is: Let's make Reinforcement Learning for Stochastic
Dynamic Vehicle Routing Problems work!
- Authors: Florentin D Hildebrandt, Barrett Thomas, Marlin W Ulmer
- Abstract summary: Demand for real-time, instant mobility and delivery services grows.
Stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions.
For solving SDVRPs, joint work of both communities is needed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been a paradigm shift in urban logistics services in recent years;
demand for real-time, instant mobility and delivery services grows. This poses
new challenges to logistic service providers as the underlying stochastic
dynamic vehicle routing problems (SDVRPs) require anticipatory real-time
routing actions. Searching the combinatorial action space for efficient routing
actions is by itself a complex task of mixed-integer programming (MIP)
well-known by the operations research community. This complexity is now
multiplied by the challenge of evaluating such actions with respect to their
effectiveness given future dynamism and uncertainty, a potentially ideal case
for reinforcement learning (RL) well-known by the computer science community.
For solving SDVRPs, joint work of both communities is needed but, as we show,
is essentially nonexistent. Both communities focus on their individual strengths,
leaving potential for improvement. Our survey paper highlights this potential
in research originating from both communities. We point out current obstacles
in SDVRPs and guide towards joint approaches to overcome them.
Related papers
- Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems [13.083156894368532]
Bike-sharing systems play a crucial role in easing traffic congestion and promoting healthier lifestyles.
This study introduces a novel approach to address the real-time rebalancing problem with a fleet of vehicles.
It employs a dual policy reinforcement learning algorithm that decouples inventory and routing decisions.
arXiv Detail & Related papers (2024-06-02T21:05:23Z)
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
- A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System [11.237099288412558]
Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and healthier lifestyles.
Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of utmost importance for operators.
This paper introduces a spatio-temporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles.
arXiv Detail & Related papers (2024-02-05T23:46:42Z)
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z)
- Latent Exploration for Reinforcement Learning [87.42776741119653]
In Reinforcement Learning, agents learn policies by exploring and interacting with the environment.
We propose LATent TIme-Correlated Exploration (Lattice), a method to inject temporally-correlated noise into the latent state of the policy network.
arXiv Detail & Related papers (2023-05-31T17:40:43Z)
- An Online Approach to Solve the Dynamic Vehicle Routing Problem with Stochastic Trip Requests for Paratransit Services [5.649212162857776]
We propose a fully online approach to solve the dynamic vehicle routing problem (DVRP).
It is difficult to batch paratransit requests together as they are temporally sparse.
We use Monte Carlo tree search to evaluate actions for any given state.
arXiv Detail & Related papers (2022-03-28T22:15:52Z)
- Q-Mixing Network for Multi-Agent Pathfinding in Partially Observable Grid Environments [62.997667081978825]
We consider the problem of multi-agent navigation in partially observable grid environments.
We suggest a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals.
arXiv Detail & Related papers (2021-08-13T09:44:47Z)
- Independent Reinforcement Learning for Weakly Cooperative Multiagent Traffic Control Problem [22.733542222812158]
In this study, we use independent reinforcement learning (IRL) to solve a complex cooperative traffic control problem.
To this end, we model the traffic control problem as a partially observable weak cooperative traffic model (PO-WCTM) to optimize the overall traffic situation of a group of intersections.
Experimental results show that CIL-DDQN outperforms other methods in almost all performance indicators of the traffic control problem.
arXiv Detail & Related papers (2021-04-22T07:55:46Z)
- Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World [49.80905654161763]
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP).
The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur.
The ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible.
arXiv Detail & Related papers (2021-03-30T17:13:29Z)
- Learning Vehicle Routing Problems using Policy Optimisation [4.093722933440819]
State-of-the-art approaches learn a policy using reinforcement learning, and the learnt policy acts as a pseudo solver.
These approaches have demonstrated good performance in some cases, but given the large search space typical of routing problems, they can converge too quickly to a poor policy.
We propose entropy regularised reinforcement learning (ERRL) that supports exploration by providing more policies.
arXiv Detail & Related papers (2020-12-24T14:18:56Z)
- ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals.
Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments.
ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.