Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2006.10897v1
- Date: Thu, 18 Jun 2020 23:37:53 GMT
- Title: Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning
- Authors: Oscar de Lima, Hansal Shah, Ting-Sheng Chu, Brian Fogelson
- Abstract summary: Ride-sharing services such as Uber and Lyft offer a service where passengers can order a car to pick them up.
Traditional Reinforcement Learning (RL) based methods attempting to solve the ridesharing problem are unable to accurately model the complex environment in which taxis operate.
We show that our model performs better than the IDQN baseline on a fixed grid size and is able to generalize well to smaller or larger grid sizes.
Our algorithm also outperforms the IDQN baseline when the number of passengers and cars varies from episode to episode.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advent of ride-sharing services, there is a huge increase in the
number of people who rely on them for various needs. Most of the earlier
approaches tackling this issue required handcrafted functions for estimating
travel times and passenger waiting times. Traditional Reinforcement Learning
(RL) based methods attempting to solve the ridesharing problem are unable to
accurately model the complex environment in which taxis operate. Prior
multi-agent deep RL methods based on Independent DQN (IDQN) learn
decentralized value functions that are prone to instability because multiple
agents learn and explore concurrently. Our proposed method based on QMIX is
able to achieve centralized training with decentralized execution. We show that
our model performs better than the IDQN baseline on a fixed grid size and is
able to generalize well to smaller or larger grid sizes. Also, our algorithm is
able to outperform the IDQN baseline in the scenario where we have a variable
number of passengers and cars in each episode. Code for our paper is publicly
available at: https://github.com/UMich-ML-Group/RL-Ridesharing.
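The abstract credits QMIX with centralized training and decentralized execution. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the class `MonotonicMixer`, the helper `greedy_joint_action`, the layer sizes, and the single-layer linear hypernetworks are all illustrative assumptions. It shows per-agent Q-values being combined into a joint value through state-conditioned, non-negative weights, so the joint value never decreases when any single agent's Q-value increases.

```python
import numpy as np


def elu(x):
    """ELU activation; monotonically increasing, which the mixer relies on."""
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


class MonotonicMixer:
    """Hypothetical QMIX-style mixer: per-agent Q-values -> joint Q_tot.

    Assumption: each hypernetwork is a single linear map from the global
    state; this is a simplification chosen to keep the sketch short.
    """

    def __init__(self, n_agents, state_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetwork parameters (untrained, randomly initialised).
        self.hyper_w1 = rng.normal(scale=0.1, size=(state_dim, n_agents * embed_dim))
        self.hyper_b1 = rng.normal(scale=0.1, size=(state_dim, embed_dim))
        self.hyper_w2 = rng.normal(scale=0.1, size=(state_dim, embed_dim))
        self.hyper_b2 = rng.normal(scale=0.1, size=(state_dim,))

    def q_tot(self, agent_qs, state):
        """Mix per-agent Q-values using weights generated from the global state."""
        # Absolute values keep the mixing weights non-negative (monotonicity).
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.embed_dim)
        b1 = state @ self.hyper_b1
        hidden = elu(agent_qs @ w1 + b1)
        w2 = np.abs(state @ self.hyper_w2)
        b2 = state @ self.hyper_b2
        return float(hidden @ w2 + b2)


def greedy_joint_action(per_agent_q_values):
    """Decentralized execution: every agent maximises only its own Q-values."""
    return [int(np.argmax(q)) for q in per_agent_q_values]


if __name__ == "__main__":
    mixer = MonotonicMixer(n_agents=3, state_dim=4)
    state = np.array([0.5, -1.0, 2.0, 0.1])   # made-up global state
    agent_qs = np.array([1.0, 0.0, -0.5])     # made-up chosen-action Q-values
    print("Q_tot:", mixer.q_tot(agent_qs, state))
    print("joint action:", greedy_joint_action([np.array([0.1, 0.9]),
                                                np.array([2.0, -1.0])]))
```

Because the weights are non-negative and ELU is increasing, the per-agent argmax used at execution time is consistent with maximising the joint value, which is what lets training use the global state while each car acts on its own Q-values.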
Related papers
- WHALES: A Multi-agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving [54.365702251769456]
We present a dataset with an unprecedented average of 8.4 agents per driving sequence.
In addition to providing the largest number of agents and viewpoints among autonomous driving datasets, WHALES records agent behaviors.
We conduct experiments on the agent scheduling task, where the ego agent selects one of multiple candidate agents to cooperate with.
(2024-11-20T14:12:34Z)
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
(2024-11-07T21:36:52Z)
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
(2024-02-29T18:45:56Z)
- Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach [1.0080317855851213]
We consider the problem of network parameter cancellation optimization for networks.
We show that deploying an algorithm in the real world for exploration and learning can be achieved with the data without exploring.
(2023-10-12T18:36:36Z)
- Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning [9.877261093287304]
We study a mission re-planning problem where vehicles are initially located away from the depot and have different amounts of fuel.
We develop a policy network with self-attention on each partial tour and encoder-decoder attention between the partial tour and the remaining nodes.
We propose a modified REINFORCE algorithm where the greedy rollout baseline is replaced by a local mini-batch baseline based on multiple, possibly non-duplicate sample rollouts.
(2023-03-02T15:15:56Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
(2022-02-17T02:44:05Z)
- DriverGym: Democratising Reinforcement Learning for Autonomous Driving [75.91049219123899]
We propose DriverGym, an open-source environment for developing reinforcement learning algorithms for autonomous driving.
DriverGym provides access to more than 1000 hours of expert logged data and also supports reactive and data-driven agent behavior.
The performance of an RL policy can be easily validated on real-world data using our extensive and flexible closed-loop evaluation protocol.
(2021-11-12T11:47:08Z)
- Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized policies in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning based method for MAPF.
(2021-06-21T18:50:58Z)
- Scalable Deep Reinforcement Learning for Ride-Hailing [0.0]
Ride-hailing services such as Didi Chuxing, Lyft, and Uber arrange thousands of cars to meet ride requests throughout the day.
We consider a Markov decision process (MDP) model of a ride-hailing service system, framing it as a reinforcement learning (RL) problem.
We propose a special decomposition for the MDP actions by sequentially assigning tasks to the drivers.
(2020-09-27T20:07:12Z)
- Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents [1.8782750537161614]
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes in which the agents have to learn and communicate.
We propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions.
The benefits of the approach are overall simplicity, faster convergence and better performance as compared to conventional DQN based approaches.
(2020-08-06T15:16:05Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
(2020-07-09T17:08:44Z)