Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle
Relocation and Pricing Decisions
- URL: http://arxiv.org/abs/2111.03204v1
- Date: Fri, 5 Nov 2021 00:52:15 GMT
- Title: Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle
Relocation and Pricing Decisions
- Authors: Enpeng Yuan, Pascal Van Hentenryck
- Abstract summary: Large-scale ride-hailing systems often combine real-time routing at the individual request level with a macroscopic Model Predictive Control (MPC) optimization for dynamic pricing and vehicle relocation.
This paper addresses these computational challenges by learning the MPC optimization.
The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions.
- Score: 15.80796896560034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale ride-hailing systems often combine real-time routing at the
individual request level with a macroscopic Model Predictive Control (MPC)
optimization for dynamic pricing and vehicle relocation. The MPC relies on a
demand forecast and optimizes over a longer time horizon to compensate for the
myopic nature of the routing optimization. However, the longer horizon
increases computational complexity and forces the MPC to operate at coarser
spatial-temporal granularity, degrading the quality of its decisions. This
paper addresses these computational challenges by learning the MPC
optimization. The resulting machine-learning model then serves as the
optimization proxy and predicts its optimal solutions. This makes it possible
to use the MPC at higher spatial-temporal fidelity, since the optimizations can
be solved and learned offline. Experimental results show that the proposed
approach improves quality of service on challenging instances from the New York
City dataset.
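The core idea of the paper, replacing the online MPC solve with a learned proxy, can be illustrated with a short sketch. The code below is not the authors' implementation: the zone count, horizon, feature encoding, network architecture, and the placeholder solve_mpc_offline() are all illustrative assumptions. It only shows the pattern of solving the optimization offline, training a model on (state, optimal decision) pairs, and using a single forward pass online.

```python
# Minimal sketch (not the authors' code) of the "optimization proxy" idea:
# solve the MPC offline on many historical states, then train a neural
# network to map system state -> MPC decisions so that, online, a single
# forward pass replaces the expensive optimization.
#
# Assumptions (not from the paper): zone count, feature encoding, the MLP
# architecture, and solve_mpc_offline() are all hypothetical.

import numpy as np
import torch
import torch.nn as nn

N_ZONES = 10          # assumed spatial granularity


def solve_mpc_offline(state: np.ndarray) -> np.ndarray:
    """Placeholder for the expensive MPC solve.

    In the paper's setting this would be an optimization over the full
    horizon; here we return a dummy relocation vector so the sketch runs
    end to end.
    """
    idle = state[:N_ZONES]
    forecast = state[N_ZONES:2 * N_ZONES]
    gap = forecast - idle
    return np.clip(gap, 0.0, None)   # "send vehicles where demand exceeds supply"


# --- Offline phase: generate (state, optimal decision) training pairs ---
rng = np.random.default_rng(0)
states = rng.uniform(0, 50, size=(2000, 2 * N_ZONES)).astype(np.float32)
targets = np.stack([solve_mpc_offline(s) for s in states]).astype(np.float32)

proxy = nn.Sequential(
    nn.Linear(2 * N_ZONES, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, N_ZONES),
)
opt = torch.optim.Adam(proxy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X, Y = torch.from_numpy(states), torch.from_numpy(targets)
for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(proxy(X), Y)
    loss.backward()
    opt.step()

# --- Online phase: one forward pass instead of solving the MPC ---
current_state = torch.from_numpy(states[0]).unsqueeze(0)
with torch.no_grad():
    relocation_plan = proxy(current_state).clamp(min=0).squeeze(0)
print("predicted relocations per zone:", relocation_plan.numpy().round(1))
```

Because all the expensive solves happen offline, the proxy can be trained on MPC instances at finer spatial-temporal granularity than could be solved within a real-time decision window.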
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z) - Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback [64.67540769692074]
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date.
We introduce an approach called Margin Matching Preference Optimization (MMPO), which incorporates relative quality margins into optimization, leading to improved LLM policies and reward models.
Experiments with both human and AI feedback data demonstrate that MMPO consistently outperforms baseline methods, often by a substantial margin, on popular benchmarks including MT-bench and RewardBench.
arXiv Detail & Related papers (2024-10-04T04:56:11Z) - Deep Model Predictive Optimization [21.22047409735362]
A major challenge in robotics is to design robust policies which enable complex and agile behaviors in the real world.
We propose Deep Model Predictive Optimization (DMPO), which learns the inner-loop of an MPC optimization algorithm directly via experience.
DMPO can outperform the best MPC algorithm by up to 27% with fewer samples, and an end-to-end policy trained with model-free RL (MFRL) by 19%.
arXiv Detail & Related papers (2023-10-06T21:11:52Z) - An Automatic Tuning MPC with Application to Ecological Cruise Control [0.0]
We show an approach for online automatic tuning of an MPC controller with an example application to an ecological cruise control system.
We solve the global fuel consumption minimization problem offline using dynamic programming and find the corresponding MPC cost function.
A neural network fitted to these offline results is used to generate the desired MPC cost function weight during online operation (a sketch of this offline-to-online tuning pattern appears after this list).
arXiv Detail & Related papers (2023-09-17T19:49:47Z) - Collaborative Intelligent Reflecting Surface Networks with Multi-Agent
Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z) - Bayesian Optimization and Deep Learning forsteering wheel angle
prediction [58.720142291102135]
This work aims to obtain an accurate model for the prediction of the steering angle in an automated driving system.
BO was able to identify, within a limited number of trials, a model -- namely BOST-LSTM -- which proved to be the most accurate when compared to classical end-to-end driving models.
arXiv Detail & Related papers (2021-10-22T15:25:14Z) - Neural Predictive Control for the Optimization of Smart Grid Flexibility
Schedules [0.0]
Model predictive control (MPC) is a method to formulate the optimal scheduling problem for grid flexibilities in a mathematical manner.
MPC methods promise accurate results for time-constrained grid optimization but they are inherently limited by the calculation time needed for large and complex power system models.
A Neural Predictive Control scheme is proposed to learn optimal control policies for linear and nonlinear power systems through imitation.
arXiv Detail & Related papers (2021-08-19T15:12:35Z) - Learning Model-Based Vehicle-Relocation Decisions for Real-Time
Ride-Sharing: Hybridizing Learning and Optimization [15.80796896560034]
Large-scale ride-sharing systems combine real-time dispatching and routing optimization over a rolling time horizon.
The MPC component, which relocates idle vehicles to anticipate demand, operates over a longer time horizon.
This paper proposes a hybrid approach that combines machine learning and optimization.
arXiv Detail & Related papers (2021-05-27T21:48:05Z) - Optimal Cost Design for Model Predictive Control [30.86835688868485]
Many robotics domains use model predictive control (MPC) for planning, which sets a reduced time horizon, performs optimization, and replans at every step.
In this work, we challenge the common assumption that the cost we optimize using MPC should be the same as the ground truth cost for the task (plus a terminal cost).
We propose a zeroth-order trajectory-based approach that enables us to design optimal costs for an MPC planning robot in continuous MDPs.
arXiv Detail & Related papers (2021-04-23T00:00:58Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization
Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions of the unobserved parameters that lead to higher decision quality.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z) - Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
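The automatic-tuning MPC entry above describes a concrete offline-to-online pipeline: solve the global fuel-minimization problem offline, then fit a neural network that supplies the MPC cost-function weight online. The sketch below is not that paper's code; the feature choice (road grade, speed set-point), the weight range, and fit_weight_for_scenario() are hypothetical stand-ins, and scikit-learn is used purely for brevity.

```python
# Hedged sketch (not the paper's code) of the automatic-tuning idea from
# "An Automatic Tuning MPC with Application to Ecological Cruise Control":
# an offline global solution provides reference behavior, a regressor is
# fitted to recover the MPC cost weight that reproduces it, and that
# regressor supplies the weight online.

import numpy as np
from sklearn.neural_network import MLPRegressor


def fit_weight_for_scenario(grade: float, setpoint: float) -> float:
    """Placeholder for the offline step: solve the global fuel-minimization
    problem for this scenario (e.g. by dynamic programming) and search for
    the MPC cost weight whose closed-loop behavior matches it best."""
    # Dummy relation standing in for the DP + matching procedure.
    return 1.0 + 5.0 * abs(grade) + 0.02 * setpoint


# Offline: tabulate scenarios -> best-matching MPC weights.
rng = np.random.default_rng(1)
scenarios = rng.uniform([-0.06, 15.0], [0.06, 35.0], size=(500, 2))
weights = np.array([fit_weight_for_scenario(g, v) for g, v in scenarios])

# Fit a small neural network that will generate the weight online.
weight_net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
weight_net.fit(scenarios, weights)

# Online: cheap inference of the cost-function weight at each control step.
current_scenario = np.array([[0.03, 25.0]])   # 3% grade, 25 m/s set-point
mpc_weight = float(weight_net.predict(current_scenario)[0])
print(f"MPC cost weight for current conditions: {mpc_weight:.2f}")
```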
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.