Related papers: Sample Efficient Interactive End-to-End Deep Learning for Self-Driving Cars with Selective Multi-Class Safe Dataset Aggregation

Sample Efficient Interactive End-to-End Deep Learning for Self-Driving Cars with Selective Multi-Class Safe Dataset Aggregation

URL: http://arxiv.org/abs/2007.14671v1
Date: Wed, 29 Jul 2020 08:38:00 GMT
Title: Sample Efficient Interactive End-to-End Deep Learning for Self-Driving Cars with Selective Multi-Class Safe Dataset Aggregation
Authors: Yunus Bicer, Ali Alizadeh, Nazim Kemal Ure, Ahmetcan Erdogan, and Orkun Kizilirmak
Abstract summary: End-to-end imitation learning is a popular method for computing self-driving car policies. Standard approach relies on collecting pairs of inputs (camera images) and outputs (steering angle, etc.) from an expert policy and fitting a deep neural network to this data to learn the driving policy.
Score: 0.13048920509133805
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The objective of this paper is to develop a sample efficient end-to-end deep learning method for self-driving cars, where we attempt to increase the value of the information extracted from samples, through careful analysis obtained from each call to expert driver\'s policy. End-to-end imitation learning is a popular method for computing self-driving car policies. The standard approach relies on collecting pairs of inputs (camera images) and outputs (steering angle, etc.) from an expert policy and fitting a deep neural network to this data to learn the driving policy. Although this approach had some successful demonstrations in the past, learning a good policy might require a lot of samples from the expert driver, which might be resource-consuming. In this work, we develop a novel framework based on the Safe Dateset Aggregation (safe DAgger) approach, where the current learned policy is automatically segmented into different trajectory classes, and the algorithm identifies trajectory segments or classes with the weak performance at each step. Once the trajectory segments with weak performance identified, the sampling algorithm focuses on calling the expert policy only on these segments, which improves the convergence rate. The presented simulation results show that the proposed approach can yield significantly better performance compared to the standard Safe DAgger algorithm while using the same amount of samples from the expert.

Related papers

Inconsistency-based Active Learning for LiDAR Object Detection [1.623951368574041]
Deep learning models for object detection in autonomous driving have recently achieved impressive performance gains. Current models require increasingly large datasets for training. Active learning is a promising approach that has been extensively researched in the image domain.
arXiv Detail & Related papers (2025-05-01T13:29:56Z)
Dense Policy: Bidirectional Autoregressive Learning of Actions [51.60428100831717]
This paper introduces a bidirectionally expanded learning approach, termed Dense Policy, to establish a new paradigm for autoregressive policies in action prediction. It employs a lightweight encoder-only architecture to iteratively unfold the action sequence from an initial single frame into the target sequence in a coarse-to-fine manner. Experiments validate that our dense policy has superior autoregressive learning capabilities and can surpass existing holistic generative policies.
arXiv Detail & Related papers (2025-03-17T14:28:08Z)
Learning to Drive by Imitating Surrounding Vehicles [0.6612847014373572]
Imitation learning is a promising approach for training autonomous vehicles to navigate complex traffic environments. We propose a data augmentation strategy that enhances imitation learning by leveraging the observed trajectories of nearby vehicles. We evaluate our approach using the state-of-the-art learning-based planning method PLUTO on the nuPlan dataset and demonstrate that our augmentation method leads to improved performance in complex driving scenarios.
arXiv Detail & Related papers (2025-03-08T00:40:47Z)
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers. Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy. We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments. Our approach enhances LiDAR-based detection models using spatial quantized historical features. Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation. We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving [100.3848723827869]
We present an effective multi-task framework, VE-Prompt, which introduces visual exemplars via task-specific prompting. Specifically, we generate visual exemplars based on bounding boxes and color-based markers, which provide accurate visual appearances of target categories. We bridge transformer-based encoders and convolutional layers for efficient and accurate unified perception in autonomous driving.
arXiv Detail & Related papers (2023-03-03T08:54:06Z)
A Semi-supervised Approach for Activity Recognition from Indoor Trajectory Data [0.822021749810331]
We consider the task of classifying the activities of moving objects from their noisy indoor trajectory data in a collaborative manufacturing environment. We present a semi-supervised machine learning approach that first applies an information theoretic criterion to partition a long trajectory into a set of segments. The segments are then labelled automatically based on a constrained hierarchical clustering method.
arXiv Detail & Related papers (2023-01-09T01:20:50Z)
Pushing the Limits of Learning-based Traversability Analysis for Autonomous Driving on CPU [1.841057463340778]
This paper proposes and evaluates a real-time machine learning-based Traversability Analysis method. We show that integrating a new set of geometric and visual features and focusing on important implementation details enables a noticeable boost in performance and reliability. The proposed approach has been compared with state-of-the-art Deep Learning approaches on a public dataset of outdoor driving scenarios.
arXiv Detail & Related papers (2022-06-07T07:57:34Z)
Learning Interactive Driving Policies via Data-driven Simulation [125.97811179463542]
Data-driven simulators promise high data-efficiency for driving policy learning. Small underlying datasets often lack interesting and challenging edge cases for learning interactive driving. We propose a simulation method that uses in-painted ado vehicles for learning robust driving policies.
arXiv Detail & Related papers (2021-11-23T20:14:02Z)
Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning with Application to Autonomous Driving [35.44498286245894]
We present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm in this paper. We evaluate the proposed algorithm on real driving data, including both non-interactive and interactive scenarios.
arXiv Detail & Related papers (2020-06-22T01:41:13Z)
Fast Template Matching and Update for Video Object Tracking and Segmentation [56.465510428878]
The main task we aim to tackle is the multi-instance semi-supervised video object segmentation across a sequence of frames. The challenges lie in the selection of the matching method to predict the result as well as to decide whether to update the target template. We propose a novel approach which utilizes reinforcement learning to make these two decisions at the same time.
arXiv Detail & Related papers (2020-04-16T08:58:45Z)
Lane-Merging Using Policy-based Reinforcement Learning and Post-Optimization [0.0]
We combine policy-based reinforcement learning with local optimization to foster and synthesize the best of the two methodologies. We evaluate the proposed method using lane-change scenarios with a varying number of vehicles.
arXiv Detail & Related papers (2020-03-06T12:57:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.