Combining Reinforcement Learning with Model Predictive Control for
On-Ramp Merging
- URL: http://arxiv.org/abs/2011.08484v3
- Date: Tue, 28 Sep 2021 17:12:22 GMT
- Title: Combining Reinforcement Learning with Model Predictive Control for
On-Ramp Merging
- Authors: Joseph Lubars, Harsh Gupta, Sandeep Chinchali, Liyun Li, Adnan Raja,
R. Srikant, and Xinzhou Wu
- Abstract summary: Two broad classes of techniques have been proposed to solve motion planning problems in autonomous driving: Model Predictive Control (MPC) and Reinforcement Learning (RL).
We first establish the strengths and weaknesses of state-of-the-art MPC and RL-based techniques through simulations.
We subsequently present an algorithm which blends the model-free RL agent with the MPC solution and show that it provides better trade-offs between all metrics -- passenger comfort, efficiency, crash rate and robustness.
- Score: 10.480121529429631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of designing an algorithm to allow a car to
autonomously merge on to a highway from an on-ramp. Two broad classes of
techniques have been proposed to solve motion planning problems in autonomous
driving: Model Predictive Control (MPC) and Reinforcement Learning (RL). In
this paper, we first establish the strengths and weaknesses of state-of-the-art
MPC and RL-based techniques through simulations. We show that the performance
of the RL agent is worse than that of the MPC solution from the perspective of
safety and robustness to out-of-distribution traffic patterns, i.e., traffic
patterns which were not seen by the RL agent during training. On the other
hand, the performance of the RL agent is better than that of the MPC solution
when it comes to efficiency and passenger comfort. We subsequently present an
algorithm which blends the model-free RL agent with the MPC solution and show
that it provides better trade-offs between all metrics -- passenger comfort,
efficiency, crash rate and robustness.
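One way to picture the blended controller is as a safety filter around the RL policy. The sketch below is a minimal illustration under assumed 1-D longitudinal dynamics, with made-up gains, limits, and margins; it is not the paper's exact algorithm. The RL proposal is kept whenever a short forward simulation predicts the gap stays safe, and the MPC action is used otherwise.

```python
import numpy as np

DT = 0.1          # control interval [s] (illustrative)
HORIZON = 20      # safety-check lookahead steps (illustrative)
MIN_GAP = 5.0     # required gap to the lead vehicle [m] (illustrative)

def rl_action(state):
    """Stand-in for a trained model-free RL policy."""
    gap, v_ego, v_lead = state
    return np.clip(0.5 * (gap - 2.0 * v_ego), -3.0, 2.0)  # heuristic accel

def mpc_action(state):
    """Stand-in for the MPC solution kept as a safe fallback."""
    gap, v_ego, v_lead = state
    return np.clip(v_lead - v_ego, -3.0, 2.0)  # track the lead vehicle's speed

def predicted_safe(state, accel):
    """Roll the gap forward assuming the lead holds speed and we hold accel."""
    gap, v_ego, v_lead = state
    for _ in range(HORIZON):
        v_ego = max(0.0, v_ego + accel * DT)
        gap += (v_lead - v_ego) * DT
        if gap < MIN_GAP:
            return False
    return True

def blended_action(state):
    """Use the RL proposal if it passes the safety check, else fall back to MPC."""
    a_rl = rl_action(state)
    return a_rl if predicted_safe(state, a_rl) else mpc_action(state)

print(blended_action((12.0, 20.0, 18.0)))
```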
Related papers
- Model-free Learning of Corridor Clearance: A Near-term Deployment Perspective [5.39179984304986]
An emerging public health application of connected and automated vehicle (CAV) technologies is to reduce response times of emergency medical service (EMS) by indirectly coordinating traffic.
Existing research on this topic often overlooks the impact of EMS vehicle disruptions on regular traffic, assumes 100% CAV penetration, relies on real-time traffic signal timing data and queue lengths at intersections, and makes various assumptions about traffic settings when deriving optimal model-based CAV control strategies.
To overcome these challenges and enhance near-term real-world applicability, we propose a model-free approach employing deep reinforcement learning (DRL) to design CAV control strategies.
arXiv Detail & Related papers (2023-12-16T06:08:53Z)
- Data-efficient Deep Reinforcement Learning for Vehicle Trajectory Control [6.144517901919656]
Reinforcement learning (RL) promises to achieve control performance superior to classical approaches.
Standard RL approaches like soft actor-critic (SAC) require large amounts of training data to be collected.
We apply recently developed data-efficient deep RL methods to vehicle trajectory control.
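The summary does not name the specific data-efficient methods used. One ingredient common in recent data-efficient deep RL (e.g., REDQ-style training) is a high update-to-data ratio, i.e., many gradient updates per collected transition; a schematic sketch with stub functions:

```python
from collections import deque
import random

UTD_RATIO = 20            # gradient updates per environment step (assumed value)
replay = deque(maxlen=100_000)

def env_step(action):     # stub transition from the environment
    return {"action": action, "reward": 1.0}

def agent_update(batch):  # stub for one SAC-style gradient update
    pass

for step in range(1000):
    replay.append(env_step(action=0.0))
    # Data efficiency: reuse each collected transition many times.
    for _ in range(UTD_RATIO):
        if len(replay) >= 32:
            agent_update(random.sample(replay, 32))
```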
arXiv Detail & Related papers (2023-11-30T09:38:59Z)
- Reinforcement Learning with Model Predictive Control for Highway Ramp Metering [14.389086937116582]
This work explores the synergy between model-based and learning-based strategies to enhance traffic flow management.
The control problem is formulated as an RL task by crafting a suitable stage cost function.
An MPC-based RL approach, which uses the MPC optimization problem as a function approximator for the RL algorithm, is proposed to learn to efficiently control an on-ramp.
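Concretely, in this MPC-as-function-approximator line of work, the value function is taken to be the optimal cost of a parameterized MPC problem, and RL adjusts the MPC parameters from observed costs. A toy sketch (the queue dynamics, parameters, and measurements below are invented for illustration; a brute-force search stands in for the MPC solver):

```python
import numpy as np
from itertools import product

N = 3                                  # short MPC horizon for the toy
ACTIONS = np.array([0.0, 0.5, 1.0])    # discretized metering rates

def dynamics(x, u):
    """Toy ramp queue: 0.5 veh/step arrive, metering rate u releases vehicles."""
    return max(0.0, x + 0.5 - u)

def stage_cost(x, u, theta):
    """Parameterized stage cost; RL tunes theta instead of hand-designing it."""
    return theta[0] * x ** 2 + theta[1] * u ** 2

def V_mpc(x0, theta):
    """Optimal N-step cost of the parameterized MPC problem (brute force)."""
    best = np.inf
    for us in product(ACTIONS, repeat=N):
        x, c = x0, 0.0
        for u in us:
            c += stage_cost(x, u, theta)
            x = dynamics(x, u)
        best = min(best, c)
    return best

def Q_mpc(x, u, theta):
    """Q-function induced by the MPC: first action fixed, then act optimally."""
    return stage_cost(x, u, theta) + V_mpc(dynamics(x, u), theta)

# Illustrative Q-learning step: the observed cost comes from the real system,
# which the MPC model only approximates, so the TD error is generally nonzero.
alpha, theta = 0.01, np.array([1.0, 0.1])
x, u = 2.0, 0.5
observed_cost, x_next = 4.3, dynamics(x, u)        # made-up measurement
td_err = observed_cost + V_mpc(x_next, theta) - Q_mpc(x, u, theta)
grad = np.zeros_like(theta)                         # finite-difference dQ/dtheta
for i in range(len(theta)):
    e = np.zeros_like(theta); e[i] = 1e-4
    grad[i] = (Q_mpc(x, u, theta + e) - Q_mpc(x, u, theta - e)) / 2e-4
theta = theta + alpha * td_err * grad               # descend the squared TD error
```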
arXiv Detail & Related papers (2023-11-15T09:50:54Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolutionary and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques [4.042717292629285]
We present an integrated car-following and lane-changing decision-control system based on Deep Reinforcement Learning (DRL).
We employ the well-known DQN algorithm to train the RL agent to make appropriate decisions.
We evaluate the performance of the proposed model under two exploration policies: epsilon-greedy and Boltzmann.
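For reference, the two exploration policies differ only in how an action is drawn from the Q-values; a minimal sketch with illustrative Q-values:

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_values, eps=0.1):
    """With probability eps pick a random action, otherwise the greedy one."""
    if rng.random() < eps:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def boltzmann(q_values, temperature=1.0):
    """Sample actions with probability proportional to exp(Q / T)."""
    z = np.array(q_values) / temperature
    z -= z.max()                      # for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return int(rng.choice(len(q_values), p=p))

q = [1.0, 2.5, 0.3]                   # illustrative Q-values for 3 actions
print(epsilon_greedy(q), boltzmann(q))
```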
arXiv Detail & Related papers (2023-09-25T15:33:08Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large numbers of interactions between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
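The mechanics of reward-free pre-training are not spelled out in the summary; a common recipe in unsupervised model-based RL is to explore with an intrinsic reward such as world-model prediction error, then fine-tune on the task reward. A schematic sketch under that assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-phase recipe (assumed for illustration, not the paper's exact method):
# 1) reward-free pre-training driven by an intrinsic curiosity signal;
# 2) fine-tuning the same agent on the extrinsic task reward.

def model_predict(obs, action, W):
    """Tiny linear world model standing in for a learned dynamics model."""
    return W @ np.concatenate([obs, action])

def intrinsic_reward(obs, action, next_obs, W):
    """Curiosity-style bonus: how surprised the world model is."""
    return float(np.sum((model_predict(obs, action, W) - next_obs) ** 2))

W = rng.normal(size=(4, 6))            # obs dim 4, action dim 2 (made up)
obs, action, next_obs = rng.normal(size=4), rng.normal(size=2), rng.normal(size=4)
r_int = intrinsic_reward(obs, action, next_obs, W)  # used during pre-training
print(r_int)
```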
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Unified Automatic Control of Vehicular Systems with Reinforcement Learning [64.63619662693068]
This article contributes a streamlined methodology for vehicular microsimulation.
It discovers high performance control strategies with minimal manual design.
The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering.
arXiv Detail & Related papers (2022-07-30T16:23:45Z)
- Policy Search for Model Predictive Control with Application to Agile Drone Flight [56.24908013905407]
We propose a policy-search-for-model-predictive-control framework.
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
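In this formulation the high-level policy outputs the hard-to-optimize MPC decision variables, and policy search optimizes the policy against closed-loop episode return. A schematic sketch, with random search standing in for the policy-search step and a stub episode in place of a real MPC rollout (all names and numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def run_mpc_episode(decision_vars):
    """Stub: run the MPC with the policy-chosen decision variables
    (e.g., waypoint traversal times) and return the episode reward."""
    target = np.array([1.0, -0.5])                 # unknown best parameters
    return -float(np.sum((decision_vars - target) ** 2))

def policy(observation, W):
    """High-level policy: maps an observation to MPC decision variables."""
    return W @ observation

# Simple random-search policy optimization (stand-in for fancier methods).
obs = np.array([0.5, 1.0, -0.2])
W_best, best_ret = rng.normal(size=(2, 3)), -np.inf
for _ in range(200):
    W_try = W_best + 0.1 * rng.normal(size=W_best.shape)
    ret = run_mpc_episode(policy(obs, W_try))
    if ret > best_ret:
        W_best, best_ret = W_try, ret
print(best_ret)
```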
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
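RH-UCRL itself hallucinates controls through a learned dynamics model; the sketch below strips that down to the core idea of combining an optimistic bound (for exploration) with a pessimistic bound (for robustness) over an ensemble of value estimates. It is a simplification, not the algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# An ensemble of value estimates gives both an optimistic bound (used to
# explore) and a pessimistic bound (used to guard against the worst case).
q_ensemble = rng.normal(loc=[[1.0, 2.0, 0.5]], scale=0.3, size=(5, 3))
q_mean, q_std = q_ensemble.mean(axis=0), q_ensemble.std(axis=0)

beta = 1.0
optimistic = q_mean + beta * q_std     # upper confidence: drives exploration
pessimistic = q_mean - beta * q_std    # lower confidence: robust evaluation

explore_action = int(np.argmax(optimistic))
robust_action = int(np.argmax(pessimistic))
print(explore_action, robust_action)
```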
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Learning Vehicle Routing Problems using Policy Optimisation [4.093722933440819]
State-of-the-art approaches learn a policy using reinforcement learning, and the learnt policy acts as a pseudo solver.
These approaches have demonstrated good performance in some cases, but given the large search space typical of routing problems, they can converge too quickly to a poor policy.
We propose entropy regularised reinforcement learning (ERRL), which supports exploration by producing more stochastic policies.
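The exploration effect comes from an entropy bonus in the policy objective: a larger entropy weight keeps the policy stochastic for longer. A minimal REINFORCE-style sketch with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max()          # for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = rng.normal(size=4)            # one state, four candidate routes
probs = softmax(logits)
action = rng.choice(4, p=probs)
advantage = 1.7                        # made-up return-minus-baseline
tau = 0.01                             # entropy weight (the regulariser)

# Entropy-regularized policy-gradient loss: the -tau * entropy term rewards
# keeping the action distribution spread out, i.e., it encourages exploration.
entropy = -np.sum(probs * np.log(probs + 1e-12))
loss = -np.log(probs[action]) * advantage - tau * entropy
print(loss)
```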
arXiv Detail & Related papers (2020-12-24T14:18:56Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
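Information theoretic MPC is typically instantiated as MPPI, which weights sampled control perturbations by exp(-cost / lambda). A single-update sketch on a toy 1-D system (the paper's contribution, a Q-learning algorithm built on this connection, sits on top of machinery like this):

```python
import numpy as np

rng = np.random.default_rng(0)

H, K, LAMBDA = 15, 64, 1.0     # horizon, samples, temperature (illustrative)

def rollout_cost(controls, x0=0.0):
    """Toy 1-D system: drive the state to zero; stands in for a real model."""
    x, cost = x0, 0.0
    for u in controls:
        x = 0.9 * x + u
        cost += x ** 2 + 0.1 * u ** 2
    return cost

# One MPPI update: perturb the nominal plan, weight rollouts by
# exp(-cost / lambda), and average the perturbations with those weights.
nominal = np.zeros(H)
noise = rng.normal(scale=0.3, size=(K, H))
costs = np.array([rollout_cost(nominal + n, x0=2.0) for n in noise])
weights = np.exp(-(costs - costs.min()) / LAMBDA)
weights /= weights.sum()
nominal += weights @ noise             # updated control sequence
print(nominal[:3])
```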
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.