Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning
with Application to Autonomous Driving
- URL: http://arxiv.org/abs/2006.13704v1
- Date: Mon, 22 Jun 2020 01:41:13 GMT
- Title: Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning
with Application to Autonomous Driving
- Authors: Zheng Wu, Liting Sun, Wei Zhan, Chenyu Yang, Masayoshi Tomizuka
- Abstract summary: We present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm in this paper.
We evaluate the proposed algorithm on real driving data, including both non-interactive and interactive scenarios.
- Score: 35.44498286245894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past decades, we have witnessed significant progress in the domain of
autonomous driving. Advanced techniques based on optimization and reinforcement
learning (RL) have become increasingly powerful at solving the forward problem:
given designed reward/cost functions, how should we optimize them and obtain
driving policies that interact with the environment safely and efficiently.
Such progress has raised another equally important question: what should
we optimize? Instead of manually specifying the reward functions, it is
desirable to extract what human drivers try to optimize from real
traffic data and assign that to autonomous vehicles to enable more naturalistic
and transparent interaction between humans and intelligent agents. To address
this issue, we present an efficient sampling-based maximum-entropy inverse
reinforcement learning (IRL) algorithm in this paper. Different from existing
IRL algorithms, by introducing an efficient continuous-domain trajectory
sampler, the proposed algorithm can directly learn the reward functions in the
continuous domain while considering the uncertainties in demonstrated
trajectories from human drivers. We evaluate the proposed algorithm on real
driving data, including both non-interactive and interactive scenarios. The
experimental results show that the proposed algorithm achieves more accurate
prediction performance with faster convergence speed and better generalization
compared to other baseline IRL algorithms.
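
The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea behind sampling-based maximum-entropy IRL, under stated assumptions: the reward is linear in trajectory features, trajectory probability is proportional to exp(reward), and the partition function is approximated over a finite set of sampled trajectories. The names `softmax_weights`, `expected_features`, and `fit_reward`, and the toy feature vectors, are hypothetical and do not reproduce the paper's continuous-domain sampler or its driving features.

```python
import math


def softmax_weights(theta, features):
    """Normalized exp(theta . f) weights over a finite set of sampled trajectories."""
    scores = [sum(t * f for t, f in zip(theta, feat)) for feat in features]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_features(theta, features):
    """Feature expectation under the sample-based maximum-entropy distribution."""
    weights = softmax_weights(theta, features)
    return [
        sum(w * feat[k] for w, feat in zip(weights, features))
        for k in range(len(theta))
    ]


def fit_reward(demo_features, sample_features, lr=0.5, steps=2000):
    """Gradient ascent on the demonstration log-likelihood.

    The gradient is (mean demonstration features) - (expected sample features).
    """
    dim = len(demo_features[0])
    demo_mean = [
        sum(f[k] for f in demo_features) / len(demo_features) for k in range(dim)
    ]
    theta = [0.0] * dim
    for _ in range(steps):
        model_mean = expected_features(theta, sample_features)
        theta = [t + lr * (d - m) for t, d, m in zip(theta, demo_mean, model_mean)]
    return theta


# Hypothetical toy data: each trajectory is summarized by two features.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # sampled trajectories
demos = [[0.1, 0.9], [0.3, 0.7]]  # demonstrated trajectories
theta = fit_reward(demos, samples)
matched = expected_features(theta, samples)
```

In this toy setup the learned weights penalize the first feature and reward the second, and the expected features of the sample set move toward the demonstration mean. In a real pipeline the sample set would come from a trajectory sampler and would typically be regenerated per scenario.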
Related papers
- Rethinking Optimal Transport in Offline Reinforcement Learning [64.56896902186126]
In offline reinforcement learning, the data is provided by various experts and some of them can be sub-optimal.
To extract an efficient policy, it is necessary to stitch the best behaviors from the dataset.
We present an algorithm that aims to find a policy that maps states to a partial distribution of the best expert actions for each given state.
arXiv Detail & Related papers (2024-10-17T22:36:43Z)
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic [0.9281671380673306]
This paper presents a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting trajectory.
Our approach runs in real time using a custom GPU-accelerated batch optimizer and a warm-start strategy learned by a Variational Autoencoder.
Our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while being competitive in driving efficiency.
arXiv Detail & Related papers (2022-12-05T12:56:42Z)
- Fast and computationally efficient generative adversarial network algorithm for unmanned aerial vehicle-based network coverage optimization [1.2853186701496802]
The challenge of dynamic traffic demand in mobile networks is tackled by moving cells based on unmanned aerial vehicles.
Considering the tremendous potential of unmanned aerial vehicles in the future, we propose a new algorithm for coverage optimization.
The proposed algorithm is implemented based on a conditional generative adversarial neural network, with a unique multilayer sum-pooling loss function.
arXiv Detail & Related papers (2022-03-25T12:13:21Z)
- Dynamic Origin-Destination Matrix Estimation in Urban Traffic Networks [0.05735035463793007]
We model the problem as a bi-level optimization problem.
In the inner level, given a tentative travel demand, we solve a dynamic traffic assignment problem to decide the routing of the users between their origins and destinations.
In the outer level, we adjust the number of trips and their origins and destinations, aiming at minimizing the discrepancy between the counters generated in the inner level and the given vehicle counts measured by sensors in the traffic network.
arXiv Detail & Related papers (2022-01-31T21:33:46Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- Model-based Decision Making with Imagination for Autonomous Parking [50.41076449007115]
The proposed algorithm consists of three parts: an imaginative model for anticipating results before parking, an improved rapidly-exploring random tree (RRT), and a path smoothing module.
Our algorithm is based on a real kinematic vehicle model, which makes it more suitable for application on real autonomous cars.
To evaluate the algorithm's effectiveness, we compare it with traditional RRT in three different parking scenarios.
arXiv Detail & Related papers (2021-08-25T18:24:34Z)
- Integrated Decision and Control: Towards Interpretable and Efficient Driving Intelligence [13.589285628074542]
We present an interpretable and efficient decision and control framework for automated vehicles.
It decomposes the driving task into multi-path planning and optimal tracking that are structured hierarchically.
Results show that our method has better online computing efficiency and driving performance including traffic efficiency and safety.
arXiv Detail & Related papers (2021-03-18T14:43:31Z)
- Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing platforms.
Our approach learns ride-based state-value function using a batch training algorithm with deep value.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z)
- Sample Efficient Interactive End-to-End Deep Learning for Self-Driving Cars with Selective Multi-Class Safe Dataset Aggregation [0.13048920509133805]
End-to-end imitation learning is a popular method for computing self-driving car policies.
The standard approach relies on collecting pairs of inputs (camera images) and outputs (steering angle, etc.) from an expert policy and fitting a deep neural network to this data to learn the driving policy.
arXiv Detail & Related papers (2020-07-29T08:38:00Z)
- DADA: Differentiable Automatic Data Augmentation [58.560309490774976]
We propose Differentiable Automatic Data Augmentation (DADA) which dramatically reduces the cost.
We conduct extensive experiments on CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets.
Results show our DADA is at least one order of magnitude faster than the state-of-the-art while achieving very comparable accuracy.
arXiv Detail & Related papers (2020-03-08T13:23:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.