Cooperative Advisory Residual Policies for Congestion Mitigation
- URL: http://arxiv.org/abs/2407.00553v1
- Date: Sun, 30 Jun 2024 01:10:13 GMT
- Title: Cooperative Advisory Residual Policies for Congestion Mitigation
- Authors: Aamir Hasan, Neeloy Chakraborty, Haonan Chen, Jung-Hoon Cho, Cathy Wu, Katherine Driggs-Campbell
- Abstract summary: We develop a class of learned residual policies that can be used in cooperative advisory systems.
Our policies advise drivers to behave in ways that mitigate traffic congestion while accounting for diverse driver behaviors.
Our approaches successfully mitigate congestion while adapting to different driver behaviors.
- Score: 11.33450610735004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fleets of autonomous vehicles can mitigate traffic congestion through simple actions, thus improving many socioeconomic factors such as commute time and gas costs. However, these approaches are limited in practice as they assume precise control over autonomous vehicle fleets, incur extensive installation costs for a centralized sensor ecosystem, and also fail to account for uncertainty in driver behavior. To this end, we develop a class of learned residual policies that can be used in cooperative advisory systems and only require the use of a single vehicle with a human driver. Our policies advise drivers to behave in ways that mitigate traffic congestion while accounting for diverse driver behaviors, particularly drivers' reactions to instructions, to provide an improved user experience. To realize such policies, we introduce an improved reward function that explicitly addresses congestion mitigation and driver attitudes to advice. We show that our residual policies can be personalized by conditioning them on an inferred driver trait that is learned in an unsupervised manner with a variational autoencoder. Our policies are trained in simulation with our novel instruction adherence driver model, and evaluated in simulation and through a user study (N=16) to capture the sentiments of human drivers. Our results show that our approaches successfully mitigate congestion while adapting to different driver behaviors, with up to 20% and 40% improvement as measured by a combination metric of speed and deviations in speed across time over baselines in our simulation tests and user study, respectively. Our user study further shows that our policies are human-compatible and personalize to drivers.
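As a rough sketch of the mechanism the abstract describes, the following pairs a VAE-style trait encoder with a residual head that corrects a nominal speed advisory. This is an illustrative reconstruction, not the authors' code: all module names, dimensions, and the observation layout are assumptions.

```python
# Minimal sketch (illustrative assumptions throughout): a VAE encoder infers a
# latent driver trait from observed driving history, and a residual policy adds
# a learned correction to a nominal speed advisory, conditioned on that trait.
import torch
import torch.nn as nn

class TraitEncoder(nn.Module):
    """Encodes a window of driver observations into a latent trait (VAE-style)."""
    def __init__(self, obs_dim=10, latent_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)

    def forward(self, obs_window):
        h = self.net(obs_window)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z, mu, logvar

class ResidualAdvisoryPolicy(nn.Module):
    """Outputs a residual correction to a nominal advisory, conditioned on the trait."""
    def __init__(self, state_dim=4, latent_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 32), nn.ReLU(), nn.Linear(32, 1)
        )

    def forward(self, state, trait, nominal_advice):
        residual = self.net(torch.cat([state, trait], dim=-1))
        return nominal_advice + residual  # advised speed = nominal + learned residual

# Usage: infer a trait from recent driving, then personalize the advisory.
encoder, policy = TraitEncoder(), ResidualAdvisoryPolicy()
obs_window = torch.randn(1, 10)   # e.g., recent speed/headway history (assumed layout)
state = torch.randn(1, 4)         # current traffic state (assumed layout)
z, _, _ = encoder(obs_window)
advice = policy(state, z, nominal_advice=torch.tensor([[15.0]]))  # m/s
```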
Related papers
- Lessons in Cooperation: A Qualitative Analysis of Driver Sentiments towards Real-Time Advisory Systems from a Driving Simulator User Study [12.010221998198423]
We conduct a driving simulator study (N=16) to capture driver reactions to a Cooperative RTA system.
We qualitatively analyze the sentiments of drivers towards advisory systems and discuss driver preferences for various aspects of the interaction.
We comment on how the advice should be communicated, the effects of the advice on driver trust, and how drivers adapt to the system.
arXiv Detail & Related papers (2024-06-29T23:21:42Z)
- Conformal Policy Learning for Sensorimotor Control Under Distribution Shifts [61.929388479847525]
This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that can take conformal quantiles as input.
We show how to design such policies by using conformal quantiles to switch between base policies with different characteristics.
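A minimal sketch of such a switching policy, assuming split conformal prediction with a scalar nonconformity score (e.g., a prediction-error magnitude); the base policies here are hypothetical stand-ins.

```python
import numpy as np

def conformal_quantile(cal_scores, alpha=0.1):
    """Finite-sample-adjusted (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

def switching_policy(obs, score, threshold, nominal_policy, fallback_policy):
    """Use the nominal policy while the nonconformity score stays in-distribution."""
    return nominal_policy(obs) if score <= threshold else fallback_policy(obs)

# Usage with toy calibration scores (e.g., prediction-error magnitudes).
rng = np.random.default_rng(0)
cal_scores = rng.exponential(scale=1.0, size=500)
tau = conformal_quantile(cal_scores, alpha=0.1)
action = switching_policy(obs=np.zeros(4), score=2.3, threshold=tau,
                          nominal_policy=lambda o: "aggressive",
                          fallback_policy=lambda o: "cautious")
```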
arXiv Detail & Related papers (2023-11-02T17:59:30Z)
- PeRP: Personalized Residual Policies For Congestion Mitigation Through Co-operative Advisory Systems [12.010221998198423]
Piecewise Constant (PC) Policies structurally model the likeness of human driving to reduce traffic congestion.
We develop a co-operative advisory system based on PC policies with a novel driver trait conditioned Personalized Residual Policy, PeRP.
We show that our approach successfully mitigates congestion while adapting to different driver behaviors, with 4 to 22% improvement in average speed over baselines.
arXiv Detail & Related papers (2023-08-01T22:25:40Z)
- Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616]
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy.
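The reward-randomization step might look like the toy sketch below; the reward terms and weight ranges are invented for illustration and are not the paper's actual interaction-based rewards.

```python
import random

def sample_social_reward_weights(rng=random):
    """Randomize interaction-based reward terms so each social-vehicle episode
    has a distinct objective (illustrative terms, not the paper's)."""
    return {
        "speed": rng.uniform(0.0, 1.0),       # reward for making progress
        "courtesy": rng.uniform(-1.0, 1.0),   # yield to vs. compete with the ego
        "comfort": rng.uniform(0.0, 0.5),     # penalize harsh braking
    }

def social_reward(weights, progress, yielded, jerk):
    return (weights["speed"] * progress
            + weights["courtesy"] * yielded
            - weights["comfort"] * abs(jerk))

# Each episode draws fresh weights, so one meta-policy must cover the whole
# spectrum of social driving styles.
w = sample_social_reward_weights()
r = social_reward(w, progress=1.0, yielded=0.0, jerk=0.2)
```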
arXiv Detail & Related papers (2023-07-19T17:42:36Z)
- Studying the Impact of Semi-Cooperative Drivers on Overall Highway Flow [76.38515853201116]
Semi-cooperative behaviors are intrinsic properties of human drivers and should be considered for autonomous driving.
New autonomous planners can consider the social value orientation (SVO) of human drivers to generate socially-compliant trajectories.
We present a study of implicit semi-cooperative driving in which agents deploy a game-theoretic version of iterative best response.
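For concreteness, here is a small sketch of iterative best response on a two-agent matrix game; the payoff values and binary yield/go action set are toy assumptions, and an SVO-aware planner would further weight each agent's utility by its social value orientation.

```python
import numpy as np

def iterative_best_response(utilities, n_iters=20):
    """Two-agent iterative best response over discrete action sets.
    utilities[i] is agent i's payoff matrix indexed [a_self, a_other]."""
    a = [0, 0]  # initial actions
    for _ in range(n_iters):
        prev = list(a)
        a[0] = int(np.argmax(utilities[0][:, a[1]]))
        a[1] = int(np.argmax(utilities[1][:, a[0]]))
        if a == prev:  # fixed point: mutual best responses
            break
    return a

# Toy merge game: rows = own action (0 = yield, 1 = go), cols = other's action.
u0 = np.array([[2.0, 1.0], [3.0, -5.0]])  # ego payoffs
u1 = np.array([[2.0, 1.0], [3.0, -5.0]])  # other driver payoffs (symmetric)
print(iterative_best_response([u0, u1]))  # settles into a pure equilibrium: [1, 0]
```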
arXiv Detail & Related papers (2023-04-23T16:01:36Z)
- FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing [71.76084256567599]
We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).
Our system, FastRLAP (faster lap), trains autonomously in the real world, without human interventions, and without requiring any simulation or expert demonstrations.
The resulting policies exhibit emergent aggressive driving skills, such as timing braking and acceleration around turns and avoiding areas that impede the robot's motion. Over the course of training, they approach the performance of a human driver using a similar first-person interface.
arXiv Detail & Related papers (2023-04-19T17:33:47Z)
- Decision Making for Autonomous Driving in Interactive Merge Scenarios via Learning-based Prediction [39.48631437946568]
This paper focuses on the complex task of merging into moving traffic where uncertainty emanates from the behavior of other drivers.
We frame the problem as a partially observable Markov decision process (POMDP) and solve it online with Monte Carlo tree search.
The solution to the POMDP is a policy that performs high-level driving maneuvers, such as giving way to an approaching car, keeping a safe distance from the vehicle in front, or merging into traffic.
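A simplified sketch of this kind of online planner, using flat Monte Carlo over the named maneuvers instead of full tree search; the intention belief and rollout returns are toy assumptions.

```python
import random

MANEUVERS = ["give_way", "keep_distance", "merge"]

def sample_intention():
    """Sample an unobserved driver intention from a belief (the POMDP's hidden state)."""
    return random.choice(["cooperative", "aggressive"])

def simulate(maneuver, intention):
    """Toy rollout return; a real model would step vehicle dynamics forward."""
    base = {"give_way": 0.3, "keep_distance": 0.5, "merge": 0.9}[maneuver]
    crash_risk = 0.4 if (maneuver == "merge" and intention == "aggressive") else 0.0
    return -10.0 if random.random() < crash_risk else base

def plan(n_samples=200):
    """Flat Monte Carlo over maneuvers: average returns across sampled intentions."""
    values = {}
    for m in MANEUVERS:
        values[m] = sum(simulate(m, sample_intention()) for _ in range(n_samples)) / n_samples
    return max(values, key=values.get)

print(plan())  # picks the maneuver with the best expected return under uncertainty
```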
arXiv Detail & Related papers (2023-03-29T16:12:45Z)
- Exploring the trade off between human driving imitation and safety for traffic simulation [0.34410212782758043]
We show that a trade-off exists between imitating human driving and maintaining safety when learning driving policies.
We propose a multi-objective learning algorithm (MOPPO) that improves both objectives together.
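The trade-off itself can be made concrete as a weighted scalarization of the two reward terms; note this is a simplification, since MOPPO optimizes both objectives jointly rather than fixing a single weight.

```python
def scalarized_reward(imitation_term, safety_term, w=0.5):
    """Weighted trade-off between imitating human driving and staying safe.
    Sweeping w traces out the trade-off curve the paper describes."""
    return w * imitation_term + (1.0 - w) * safety_term

# e.g., a high imitation score offset by a near-collision penalty:
r = scalarized_reward(imitation_term=0.8, safety_term=-1.0, w=0.7)
```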
arXiv Detail & Related papers (2022-08-09T14:30:19Z)
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)
- Learning Interactive Driving Policies via Data-driven Simulation [125.97811179463542]
Data-driven simulators promise high data-efficiency for driving policy learning.
Small underlying datasets often lack interesting and challenging edge cases for learning interactive driving.
We propose a simulation method that uses in-painted ado vehicles for learning robust driving policies.
arXiv Detail & Related papers (2021-11-23T20:14:02Z)
- Building Safer Autonomous Agents by Leveraging Risky Driving Behavior Knowledge [1.52292571922932]
This study focuses on creating risk-prone scenarios with heavy traffic and unexpected random behavior to train better model-free learning agents.
We generate multiple autonomous driving scenarios by creating new custom Markov Decision Process (MDP) environment iterations in the highway-env simulation package.
We train model-free learning agents with supplemental information about risk-prone driving scenarios and compare their performance with baseline agents.
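A minimal sketch of configuring a denser, riskier highway-env scenario, assuming a recent gymnasium-based release of the package; the specific config values are illustrative rather than the paper's.

```python
import gymnasium as gym
import highway_env  # noqa: F401 -- importing registers the highway-v0 environments

env = gym.make("highway-v0")
env.unwrapped.configure({
    "vehicles_count": 50,   # denser traffic than the default (assumed value)
    "duration": 40,         # episode length in policy steps (assumed value)
})
obs, info = env.reset()     # reset applies the updated configuration
done = truncated = False
while not (done or truncated):
    action = env.action_space.sample()  # stand-in for a trained agent
    obs, reward, done, truncated, info = env.step(action)
```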
arXiv Detail & Related papers (2021-03-16T23:39:33Z)