Learning Interpretable End-to-End Vision-Based Motion Planning for
Autonomous Driving with Optical Flow Distillation
- URL: http://arxiv.org/abs/2104.12861v1
- Date: Sun, 18 Apr 2021 13:51:25 GMT
- Title: Learning Interpretable End-to-End Vision-Based Motion Planning for
Autonomous Driving with Optical Flow Distillation
- Authors: Hengli Wang, Peide Cai, Yuxiang Sun, Lujia Wang, Ming Liu
- Abstract summary: IVMP is an interpretable end-to-end vision-based motion planning approach for autonomous driving.
We develop an optical flow distillation paradigm, which can effectively enhance the network while still maintaining its real-time performance.
Our IVMP significantly outperforms the state-of-the-art approaches in imitating human drivers with a much higher success rate.
- Score: 11.638798976654327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep-learning based approaches have achieved impressive performance
for autonomous driving. However, end-to-end vision-based methods typically have
limited interpretability, making the behaviors of the deep networks difficult
to explain. Hence, their potential applications could be limited in practice.
To address this problem, we propose an interpretable end-to-end vision-based
motion planning approach for autonomous driving, referred to as IVMP. Given a
set of past surrounding-view images, our IVMP first predicts future egocentric
semantic maps in bird's-eye-view space, which are then employed to plan
trajectories for self-driving vehicles. The predicted future semantic maps not
only provide useful interpretable information, but also allow our motion
planning module to handle objects with low probability, thus improving the
safety of autonomous driving. Moreover, we also develop an optical flow
distillation paradigm, which can effectively enhance the network while still
maintaining its real-time performance. Extensive experiments on the nuScenes
dataset and closed-loop simulation show that our IVMP significantly outperforms
the state-of-the-art approaches in imitating human drivers with a much higher
success rate. Our project page is available at
https://sites.google.com/view/ivmp.
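The optical flow distillation paradigm described above trains a flow-aware teacher whose knowledge is transferred to a flow-free student, so the student keeps real-time inference speed. A minimal sketch of such a combined objective (illustrative NumPy only; the loss weighting and feature shapes are assumptions, not the paper's exact formulation):

```python
import numpy as np

def distillation_loss(student_feat, teacher_feat, student_pred, target,
                      alpha=0.5):
    """Combine a task loss with a feature-mimicking distillation term.

    The teacher (which sees optical flow at training time) produces
    features the student learns to imitate; at test time only the
    student runs, without needing optical flow as input.
    """
    task = np.mean((student_pred - target) ** 2)         # task loss (L2 here)
    mimic = np.mean((student_feat - teacher_feat) ** 2)  # feature mimicking
    return task + alpha * mimic

# Toy example with random features and predictions.
rng = np.random.default_rng(0)
s_feat, t_feat = rng.normal(size=8), rng.normal(size=8)
pred, target = rng.normal(size=4), rng.normal(size=4)
loss = distillation_loss(s_feat, t_feat, pred, target)
```

Here `alpha` balances imitating the teacher's internal features against solving the task directly; both terms vanish when the student matches the teacher and the target exactly.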
Related papers
- Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels.
Our framework, LAW, uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
arXiv Detail & Related papers (2024-06-12T17:59:21Z)
- A Cognitive-Based Trajectory Prediction Approach for Autonomous Driving [21.130543517747995]
This paper introduces the Human-Like Trajectory Prediction (HLTP) model, which adopts a teacher-student knowledge distillation framework.
The "teacher" model mimics the visual processing of the human brain, particularly the functions of the occipital and temporal lobes.
The "student" model focuses on real-time interaction and decision-making, capturing essential perceptual cues for accurate prediction.
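The teacher-student transfer described in this entry is commonly implemented by matching temperature-softened output distributions. A generic sketch of the standard knowledge-distillation loss (not this paper's specific architecture or objective):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 as is conventional."""
    p = softmax(teacher_logits, T)    # soft targets from the teacher
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

loss = kd_loss([1.0, 0.2, -0.5], [0.9, 0.3, -0.4])
```

The soft targets carry more information than hard labels (relative similarities between classes), which is what lets the lighter student recover much of the teacher's behavior.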
arXiv Detail & Related papers (2024-02-29T15:22:26Z)
- Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving [56.381918362410175]
Drive-WM is the first driving world model compatible with existing end-to-end planning models.
Our model generates high-fidelity multiview videos in driving scenes.
arXiv Detail & Related papers (2023-11-29T18:59:47Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
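The implicit-representation idea can be illustrated with a toy query function: rather than rasterizing a dense grid, the network is evaluated only at the continuous (x, y, t) points the planner cares about. A sketch with an untrained random MLP standing in for the learned field (the real model is far larger and trained on sensor data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny random MLP standing in for the trained implicit network.
W1, b1 = rng.normal(size=(3, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def occupancy(points):
    """Query occupancy probability at continuous (x, y, t) points.

    One forward pass per batch of query points; no dense grid is built.
    """
    h = np.maximum(points @ W1 + b1, 0.0)         # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid occupancy

queries = np.array([[1.0, 2.0, 0.5],    # (x, y, t) ahead of the ego vehicle
                    [4.0, -1.0, 1.0]])
probs = occupancy(queries)              # shape (2, 1), values in (0, 1)
```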
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- KARNet: Kalman Filter Augmented Recurrent Neural Network for Learning World Models in Autonomous Driving Tasks [11.489187712465325]
We present a Kalman filter augmented recurrent neural network architecture to learn the latent representation of the traffic flow using front camera images only.
Results show that incorporating an explicit model of the vehicle (states estimated using Kalman filtering) in the end-to-end learning significantly increases performance.
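The explicit vehicle model referred to here is a Kalman filter. A minimal 1-D constant-velocity example of the predict/update cycle (illustrative noise values, not KARNet's actual configuration):

```python
import numpy as np

def kalman_step(x, P, z, dt=0.1, q=1e-3, r=0.25):
    """One predict/update step of a 1-D constant-velocity Kalman filter.

    x: state [position, velocity]; P: 2x2 covariance; z: noisy position
    measurement; q/r: process/measurement noise (illustrative values).
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])              # we only measure position
    # predict
    x = F @ x
    P = F @ P @ F.T + q * np.eye(2)
    # update
    S = H @ P @ H.T + r                     # innovation covariance
    K = P @ H.T / S                         # Kalman gain
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track a target moving at 1 m/s from noisy position readings.
rng = np.random.default_rng(1)
x, P = np.zeros(2), np.eye(2)
for k in range(200):
    true_pos = 0.1 * k * 1.0
    x, P = kalman_step(x, P, true_pos + rng.normal(scale=0.5))
```

The filter recovers both position and the unobserved velocity, which is the kind of explicit state estimate the paper feeds into end-to-end learning.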
arXiv Detail & Related papers (2023-05-24T02:27:34Z)
- ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning [132.20119288212376]
We propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously.
To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system.
arXiv Detail & Related papers (2022-07-15T16:57:43Z)
- Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving [104.32241082170044]
We study a new task, safety-aware motion prediction with unseen vehicles for autonomous driving.
Unlike the existing trajectory prediction task for seen vehicles, we aim at predicting an occupancy map.
Our approach is the first one that can predict the existence of unseen vehicles in most cases.
arXiv Detail & Related papers (2021-09-03T13:33:33Z)
- End-to-End Interactive Prediction and Planning with Optical Flow Distillation for Autonomous Driving [16.340715765227475]
We propose an end-to-end interactive neural motion planner (INMP) for autonomous driving in this paper.
Our INMP first generates a feature map in bird's-eye-view space, which is then processed to detect other agents and perform interactive prediction and planning jointly.
Also, we adopt an optical flow distillation paradigm, which can effectively improve the network performance while still maintaining its real-time inference speed.
arXiv Detail & Related papers (2021-04-18T14:05:18Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules.
In this paper, we propose to incorporate these structured priors through loss functions.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
- PLOP: Probabilistic poLynomial Objects trajectory Planning for autonomous driving [8.105493956485583]
We use a conditional imitation learning algorithm to predict trajectories for ego vehicle and its neighbors.
Our approach is computationally efficient and relies only on on-board sensors.
We evaluate our method offline on the publicly available dataset nuScenes.
arXiv Detail & Related papers (2020-03-09T16:55:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.