Valeo4Cast: A Modular Approach to End-to-End Forecasting
- URL: http://arxiv.org/abs/2406.08113v1
- Date: Wed, 12 Jun 2024 11:50:51 GMT
- Title: Valeo4Cast: A Modular Approach to End-to-End Forecasting
- Authors: Yihong Xu, Éloi Zablocki, Alexandre Boulch, Gilles Puy, Mickael Chen, Florent Bartoccioni, Nermin Samet, Oriane Siméoni, Spyros Gidaris, Tuan-Hung Vu, Andrei Bursuc, Eduardo Valle, Renaud Marlet, Matthieu Cord
- Abstract summary: We individually build and train detection, tracking, and forecasting modules.
We then only use consecutive finetuning steps to integrate the modules better and alleviate compounding errors.
Our solution ranks first in the Argoverse 2 end-to-end Forecasting Challenge held at CVPR 2024 Workshop on Autonomous Driving.
- Score: 93.86257326005726
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motion forecasting is crucial in autonomous driving systems to anticipate the future trajectories of surrounding agents such as pedestrians, vehicles, and traffic signals. In end-to-end forecasting, the model must jointly detect from sensor data (cameras or LiDARs) the position and past trajectories of the different elements of the scene and predict their future location. We depart from the current trend of tackling this task via end-to-end training from perception to forecasting and we use a modular approach instead. Following a recent study, we individually build and train detection, tracking, and forecasting modules. We then only use consecutive finetuning steps to integrate the modules better and alleviate compounding errors. Our study reveals that this simple yet effective approach significantly improves performance on the end-to-end forecasting benchmark. Consequently, our solution ranks first in the Argoverse 2 end-to-end Forecasting Challenge held at CVPR 2024 Workshop on Autonomous Driving (WAD), with 63.82 mAPf. We surpass forecasting results by +17.1 points over last year's winner and by +13.3 points over this year's runner-up. This remarkable performance in forecasting can be explained by our modular paradigm, which integrates finetuning strategies and significantly outperforms the end-to-end-trained counterparts.
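The modular idea in the abstract (separately built detection, tracking, and forecasting stages chained at inference time) can be illustrated with a minimal sketch. This is not the authors' implementation: `Track`, `run_pipeline`, and the three toy stages are hypothetical names introduced here, the real modules are learned models, and the consecutive finetuning steps are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Track:
    track_id: int
    history: List[Point]  # past positions, oldest first


# Each stage is a plain callable, so it can be built and swapped independently.
Detector = Callable[[Sequence[Point]], List[Point]]    # one frame -> detections
Tracker = Callable[[Sequence[List[Point]]], List[Track]]  # all detections -> tracks
Forecaster = Callable[[Track, int], List[Point]]       # track, horizon -> future


def run_pipeline(frames, detect: Detector, track: Tracker,
                 forecast: Forecaster, horizon: int) -> Dict[int, List[Point]]:
    """Chain the three stages: detect per frame, link into tracks, forecast."""
    detections = [detect(frame) for frame in frames]
    tracks = track(detections)
    return {t.track_id: forecast(t, horizon) for t in tracks}


# --- Toy stand-ins (assumptions for illustration only) ---

def toy_detect(frame: Sequence[Point]) -> List[Point]:
    # Pretend the "sensor frame" already contains object centers.
    return list(frame)


def toy_track(detections: Sequence[List[Point]]) -> List[Track]:
    # Naive association: the i-th detection of every frame is the same object.
    if not detections:
        return []
    n = min(len(d) for d in detections)
    return [Track(i, [d[i] for d in detections]) for i in range(n)]


def constant_velocity(track: Track, horizon: int) -> List[Point]:
    # Extrapolate the last observed displacement; needs two past positions.
    (x0, y0), (x1, y1) = track.history[-2], track.history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]
```

Because each stage only depends on the output type of the previous one, any stage can be replaced or retrained on the outputs of the stage before it without touching the others, which is the property a modular design relies on.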
Related papers
- Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels.
Our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
arXiv Detail & Related papers (2024-06-12T17:59:21Z)
- Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive? [93.10694819127608]
We propose a unified evaluation pipeline for forecasting methods with real-world perception inputs.
Our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data.
arXiv Detail & Related papers (2023-06-15T17:03:14Z)
- Control-Aware Prediction Objectives for Autonomous Driving [78.19515972466063]
We present control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner be differentiable.
We propose two types of importance weights that weight the predictive likelihood: one using an attention model between agents, and another based on control variation when exchanging predicted trajectories for ground truth trajectories.
arXiv Detail & Related papers (2022-04-28T07:37:21Z)
- Forecasting from LiDAR via Future Object Detection [47.11167997187244]
We propose an end-to-end approach for detection and motion forecasting based on raw sensor measurement.
By linking future and current locations in a many-to-one manner, our approach is able to reason about multiple futures.
arXiv Detail & Related papers (2022-03-30T13:40:28Z)
- Sliding Sequential CVAE with Time Variant Socially-aware Rethinking for Trajectory Prediction [13.105275905781632]
Pedestrian trajectory prediction is a key technology in many applications such as video surveillance, social robot navigation, and autonomous driving.
This work proposes a novel trajectory prediction method called CSR, which consists of a cascaded conditional variational autoencoder (CVAE) module and a socially-aware regression module.
Experimental results demonstrate that the proposed method improves over state-of-the-art methods on the Stanford Drone dataset.
arXiv Detail & Related papers (2021-10-28T10:56:21Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well-defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
- PnPNet: End-to-End Perception and Prediction with Tracking in the Loop [82.97006521937101]
We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles.
We propose PnPNet, an end-to-end model that takes sensor data as input and outputs, at each time step, object tracks and their future trajectories.
arXiv Detail & Related papers (2020-05-29T17:57:25Z)