Conditional Expectation based Value Decomposition for Scalable On-Demand
Ride Pooling
- URL: http://arxiv.org/abs/2112.00579v1
- Date: Wed, 1 Dec 2021 15:53:16 GMT
- Title: Conditional Expectation based Value Decomposition for Scalable On-Demand
Ride Pooling
- Authors: Avinandan Bose, Pradeep Varakantham
- Abstract summary: Traditional ride pooling approaches do not consider the impact of current matches on future value for vehicles/drivers.
We show that our new approach, Conditional Expectation based Value Decomposition (CEVD), outperforms NeurADP by up to 9.76% in terms of overall requests served.
- Score: 11.988825533369683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Owing to the benefits for customers (lower prices), drivers (higher
revenues), aggregation companies (higher revenues) and the environment (fewer
vehicles), on-demand ride pooling (e.g., Uber pool, Grab Share) has become
quite popular. The significant computational complexity of matching vehicles to
combinations of requests has meant that traditional ride pooling approaches are
myopic in that they do not consider the impact of current matches on future
value for vehicles/drivers. Recently, Neural Approximate Dynamic Programming
(NeurADP) has employed value decomposition with Approximate Dynamic Programming
(ADP) to outperform leading approaches by considering the impact of an
individual agent's (vehicle) chosen actions on the future value of that agent.
However, in order to ensure scalability and facilitate city-scale ride pooling,
NeurADP completely ignores the impact of other agents' actions on individual
agent/vehicle value. As demonstrated in our experimental results, ignoring the
impact of other agents' actions on individual value can have a significant
impact on the overall performance when there is increased competition among
vehicles for demand. Our key contribution is a novel mechanism based on
computing conditional expectations through joint conditional probabilities for
capturing dependencies on other agents' actions without increasing the
complexity of training or decision making. We show that our new approach,
Conditional Expectation based Value Decomposition (CEVD), outperforms NeurADP by
up to 9.76% in terms of overall requests served, which is a significant
improvement on a city-wide benchmark taxi dataset.
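The notion of a conditional expectation computed from joint probabilities, which the abstract refers to, can be illustrated with a toy calculation. This is only a minimal sketch of the generic definition E[V | a_i] = sum over a_j of P(a_j | a_i) * V(a_i, a_j) for two agents; it is not the CEVD algorithm from the paper. The function name `conditional_expected_value`, the joint probability table, and the value table are all hypothetical and chosen purely for illustration.

```python
def conditional_expected_value(joint_prob, value, a_i):
    """Return E[V | agent i takes action a_i] for a two-agent toy problem.

    joint_prob[a_i][a_j] is an assumed joint probability P(a_i, a_j);
    value[a_i][a_j] is an assumed value for agent i under that action pair.
    """
    row = joint_prob[a_i]
    marginal = sum(row)  # P(a_i), obtained by summing out the other agent's action
    if marginal == 0:
        raise ValueError("action a_i has zero probability; expectation undefined")
    # P(a_j | a_i) = P(a_i, a_j) / P(a_i); weight each value by that probability
    return sum((p / marginal) * v for p, v in zip(row, value[a_i]))


# Hypothetical toy tables: two actions per agent
joint_prob = [[0.1, 0.3],
              [0.4, 0.2]]
value = [[5.0, 1.0],
         [2.0, 4.0]]

expected_if_action_0 = conditional_expected_value(joint_prob, value, 0)
expected_if_action_1 = conditional_expected_value(joint_prob, value, 1)
```

In this sketch the value an agent assigns to its own action depends on how likely the other agent's actions are given that choice, which is the kind of dependency the abstract says NeurADP ignores.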
Related papers
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization [75.1240295759264]
We propose an effective framework for Bridging and Modeling Correlations in pairwise data, named BMC.
We increase the consistency and informativeness of the pairwise preference signals through targeted modifications.
We identify that DPO alone is insufficient to model these correlations and capture nuanced variations.
arXiv Detail & Related papers (2024-08-14T11:29:47Z)
- Forecasting Auxiliary Energy Consumption for Electric Heavy-Duty Vehicles [6.375656754994484]
Energy consumption prediction is crucial for optimizing the operation of electric commercial heavy-duty vehicles.
In this paper, we demonstrate a potential solution by training multiple regression models on subsets of data.
Experiments on both synthetic and real-world datasets show that such splitting of a complex problem into simpler ones yields better regression performance and interpretability.
arXiv Detail & Related papers (2023-11-27T16:52:25Z)
- Coalitional Bargaining via Reinforcement Learning: An Application to Collaborative Vehicle Routing [49.00137468773683]
Collaborative Vehicle Routing is where delivery companies cooperate by sharing their delivery information and performing delivery requests on behalf of each other.
This achieves economies of scale and thus reduces cost, greenhouse gas emissions, and road congestion.
But which company should partner with whom, and how much should each company be compensated?
Traditional game theoretic solution concepts, such as the Shapley value or nucleolus, are difficult to calculate for the real-world problem of Collaborative Vehicle Routing.
arXiv Detail & Related papers (2023-10-26T15:04:23Z)
- Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning [48.667697255912614]
Mean-field reinforcement learning addresses the policy of a representative agent interacting with the infinite population of identical agents.
We propose Safe-M$^3$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions.
Our algorithm effectively meets the demand in critical areas while ensuring service accessibility in regions with low demand.
arXiv Detail & Related papers (2023-06-29T15:57:07Z)
- Studying the Impact of Semi-Cooperative Drivers on Overall Highway Flow [76.38515853201116]
Semi-cooperative behaviors are intrinsic properties of human drivers and should be considered for autonomous driving.
New autonomous planners can consider the social value orientation (SVO) of human drivers to generate socially-compliant trajectories.
We present a study of implicit semi-cooperative driving where agents deploy a game-theoretic version of iterative best response.
arXiv Detail & Related papers (2023-04-23T16:01:36Z)
- Towards More Efficient Shared Autonomous Mobility: A Learning-Based Fleet Repositioning Approach [0.0]
This paper formulates SAMS fleet as a Markov Decision Process and presents a reinforcement learning-based repositioning (RLR) approach called integrated system-agent repositioning (ISR).
The ISR learns to respond to evolving demand patterns without explicit demand forecasting and to cooperate with optimization-based passenger-to-vehicle assignment.
Results demonstrate the RLR approaches' substantial reductions in passenger wait times, over 50%, relative to the JO approach.
arXiv Detail & Related papers (2022-10-16T23:30:46Z)
- Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning [19.788336796981685]
We propose a novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL).
Our main idea is to design the multi-action-branch reward estimation and policy-weighted reward aggregation for stabilized training.
The superiority of the DRE-MARL is demonstrated using benchmark multi-agent scenarios, compared with the SOTA baselines in terms of both effectiveness and robustness.
arXiv Detail & Related papers (2022-10-14T08:31:45Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Equilibrium Inverse Reinforcement Learning for Ride-hailing Vehicle Network [1.599072005190786]
We formulate the problem of passenger-vehicle matching in a sparsely connected graph.
We propose an algorithm to derive an equilibrium policy in a multi-agent environment.
arXiv Detail & Related papers (2021-02-13T03:18:44Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Prediction by Anticipation: An Action-Conditional Prediction Method based on Interaction Learning [23.321627835039934]
We propose prediction by anticipation, which views interaction in terms of a latent probabilistic generative process.
Under this view, consecutive data frames can be factorized into sequential samples from an action-conditional distribution.
Our proposed prediction model, variational Bayesian in nature, is trained to maximize the evidence lower bound (ELBO) of this conditional distribution.
arXiv Detail & Related papers (2020-12-25T01:39:26Z)