Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2006.10897v1
- Date: Thu, 18 Jun 2020 23:37:53 GMT
- Title: Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning
- Authors: Oscar de Lima, Hansal Shah, Ting-Sheng Chu, Brian Fogelson
- Abstract summary: Ride-sharing services such as Uber and Lyft let passengers order a car to pick them up.
Traditional Reinforcement Learning (RL) based methods attempting to solve the ridesharing problem are unable to accurately model the complex environment in which taxis operate.
We show that our model performs better than the IDQN baseline on a fixed grid size and is able to generalize well to smaller or larger grid sizes.
Our algorithm also outperforms the IDQN baseline in the scenario where the number of passengers and cars varies in each episode.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advent of ride-sharing services, there is a huge increase in the
number of people who rely on them for various needs. Most of the earlier
approaches tackling this issue required handcrafted functions for estimating
travel times and passenger waiting times. Traditional Reinforcement Learning
(RL) based methods attempting to solve the ridesharing problem are unable to
accurately model the complex environment in which taxis operate. Prior
Multi-Agent Deep RL based methods based on Independent DQN (IDQN) learn
decentralized value functions prone to instability due to the concurrent
learning and exploring of multiple agents. Our proposed method based on QMIX is
able to achieve centralized training with decentralized execution. We show that
our model performs better than the IDQN baseline on a fixed grid size and is
able to generalize well to smaller or larger grid sizes. Our algorithm also
outperforms the IDQN baseline in the scenario where the number of passengers
and cars varies in each episode. Code for our paper is publicly available at:
https://github.com/UMich-ML-Group/RL-Ridesharing.
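The core QMIX idea referenced above, combining per-agent Q-values into a joint value while preserving monotonicity, can be sketched in plain Python. This is a deliberately simplified toy with illustrative names, not the paper's implementation: the actual method produces the mixing weights with hypernetworks conditioned on the global state and trains them end-to-end on a TD loss.

```python
# Toy sketch of QMIX-style monotonic value mixing (illustrative only).
def mix_q_values(agent_qs, state_weights, state_bias):
    """Combine per-agent Q-values into a joint Q_tot.

    Taking abs() of the state-conditioned weights enforces
    dQ_tot/dQ_i >= 0, so each agent greedily maximising its own
    Q-value also maximises the joint Q_tot. This is what permits
    centralized training with decentralized execution.
    """
    return sum(abs(w) * q for w, q in zip(state_weights, agent_qs)) + state_bias

# Two agents; the negative raw weight -0.5 is rectified to +0.5.
q_tot = mix_q_values([1.0, 2.0], [-0.5, 0.3], 0.1)
```

Because the rectified weights are non-negative, raising any single agent's Q-value can never lower `q_tot`, which is the monotonicity constraint that makes per-agent greedy action selection consistent with the centralized objective.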
Related papers
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
arXiv Detail & Related papers (2024-02-29T18:45:56Z)
- Learning RL-Policies for Joint Beamforming Without Exploration: A Batch Constrained Off-Policy Approach [1.0080317855851213]
We consider a network parameter cancellation optimization problem.
We show that an algorithm can be learned from previously collected data and deployed in the real world without online exploration.
arXiv Detail & Related papers (2023-10-12T18:36:36Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
- Multi-Start Team Orienteering Problem for UAS Mission Re-Planning with Data-Efficient Deep Reinforcement Learning [9.877261093287304]
We study a mission re-planning problem where vehicles are initially located away from the depot and have different amounts of fuel.
We develop a policy network with self-attention on each partial tour and encoder-decoder attention between the partial tour and the remaining nodes.
We propose a modified REINFORCE algorithm where the greedy rollout baseline is replaced by a local mini-batch baseline based on multiple, possibly non-duplicate sample rollouts.
arXiv Detail & Related papers (2023-03-02T15:15:56Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- DriverGym: Democratising Reinforcement Learning for Autonomous Driving [75.91049219123899]
We propose DriverGym, an open-source environment for developing reinforcement learning algorithms for autonomous driving.
DriverGym provides access to more than 1000 hours of expert logged data and also supports reactive and data-driven agent behavior.
The performance of an RL policy can be easily validated on real-world data using our extensive and flexible closed-loop evaluation protocol.
arXiv Detail & Related papers (2021-11-12T11:47:08Z)
- Distributed Heuristic Multi-Agent Path Finding with Communication [7.854890646114447]
Multi-Agent Path Finding (MAPF) is essential to large-scale robotic systems.
Recent methods have applied reinforcement learning (RL) to learn decentralized polices in partially observable environments.
This paper combines communication with deep Q-learning to provide a novel learning based method for MAPF.
arXiv Detail & Related papers (2021-06-21T18:50:58Z)
- Scalable Deep Reinforcement Learning for Ride-Hailing [0.0]
Ride-hailing services such as Didi Chuxing, Lyft, and Uber arrange thousands of cars to meet ride requests throughout the day.
We consider a Markov decision process (MDP) model of a ride-hailing service system, framing it as a reinforcement learning (RL) problem.
We propose a special decomposition for the MDP actions by sequentially assigning tasks to the drivers.
arXiv Detail & Related papers (2020-09-27T20:07:12Z)
- Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents [1.8782750537161614]
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes wherein the agents have to learn and communicate.
We propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions.
The benefits of the approach are overall simplicity, faster convergence and better performance as compared to conventional DQN based approaches.
arXiv Detail & Related papers (2020-08-06T15:16:05Z)
- Multi-Agent Routing Value Iteration Network [88.38796921838203]
We propose a graph neural network based model that is able to perform multi-agent routing based on learned value in a sparsely connected graph.
We show that our model trained with only two agents on graphs with a maximum of 25 nodes can easily generalize to situations with more agents and/or nodes.
arXiv Detail & Related papers (2020-07-09T22:16:45Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
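SUNRISE's ingredient (b), upper-confidence-bound action selection over a Q-ensemble, can be illustrated with a small plain-Python sketch. The ensemble here is just a list of per-action Q-value lists; the names and the scalar `lam` are illustrative assumptions, not the authors' implementation.

```python
import statistics

# Toy sketch of SUNRISE-style UCB action selection over a Q-ensemble
# (illustrative; the real method uses an ensemble of learned Q-networks).
def ucb_action(q_ensemble, lam=1.0):
    """Return the action maximising mean(Q) + lam * std(Q) across members."""
    n_actions = len(q_ensemble[0])

    def score(action):
        qs = [member[action] for member in q_ensemble]
        # Ensemble disagreement (std) acts as an uncertainty bonus.
        return statistics.mean(qs) + lam * statistics.pstdev(qs)

    return max(range(n_actions), key=score)

# Both actions have mean Q = 1.0, but the ensemble disagrees more on
# action 1, so UCB prefers it for exploration.
chosen = ucb_action([[1.0, 0.0], [1.0, 2.0]])
```

Setting `lam=0.0` recovers plain greedy selection over the ensemble mean, so the single parameter trades off exploitation against exploration driven by ensemble uncertainty.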
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.