Shared Cross-Modal Trajectory Prediction for Autonomous Driving
- URL: http://arxiv.org/abs/2004.00202v3
- Date: Fri, 11 Jun 2021 21:24:38 GMT
- Title: Shared Cross-Modal Trajectory Prediction for Autonomous Driving
- Authors: Chiho Choi, Joon Hee Choi, Srikanth Malla, Jiachen Li
- Abstract summary: We propose a Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities.
An extensive evaluation is conducted to show the efficacy of the proposed framework using two benchmark driving datasets.
- Score: 24.07872495811019
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Predicting future trajectories of traffic agents in highly interactive
environments is an essential and challenging problem for the safe operation of
autonomous driving systems. On the basis of the fact that self-driving vehicles
are equipped with various types of sensors (e.g., LiDAR scanner, RGB camera,
radar, etc.), we propose a Cross-Modal Embedding framework that aims to benefit
from the use of multiple input modalities. At training time, our model learns
to embed a set of complementary features in a shared latent space by jointly
optimizing the objective functions across different types of input data. At
test time, a single input modality (e.g., LiDAR data) is required to generate
predictions from the input perspective (i.e., in the LiDAR space), while taking
advantage of the model trained with multiple sensor modalities. An extensive
evaluation is conducted to show the efficacy of the proposed framework using
two benchmark driving datasets.
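To make the training scheme concrete, below is a minimal sketch of the shared-embedding idea as described in the abstract: modality-specific encoders map complementary features into one shared latent space, a joint objective couples the per-modality prediction losses with a latent-alignment term, and inference runs from a single modality. The module names, feature dimensions, and the alignment loss are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of shared cross-modal embedding (assumed architecture,
# not the paper's actual model): each modality gets its own encoder into
# a shared latent space; one decoder predicts trajectories from that space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Encodes one sensor modality (e.g., flattened LiDAR or camera features)."""
    def __init__(self, in_dim: int, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class TrajectoryDecoder(nn.Module):
    """Decodes a shared latent code into a future trajectory (horizon x 2)."""
    def __init__(self, latent_dim: int = 64, horizon: int = 12):
        super().__init__()
        self.horizon = horizon
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, horizon * 2)
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z).view(-1, self.horizon, 2)

# Hypothetical feature dimensions for two modalities.
encoders = {"lidar": ModalityEncoder(256), "camera": ModalityEncoder(512)}
decoder = TrajectoryDecoder()
params = list(decoder.parameters()) + [p for e in encoders.values() for p in e.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

def training_step(batch_feats: dict, gt_traj: torch.Tensor) -> torch.Tensor:
    """Jointly optimize per-modality prediction losses plus latent alignment."""
    latents = {m: encoders[m](x) for m, x in batch_feats.items()}
    pred_loss = sum(F.mse_loss(decoder(z), gt_traj) for z in latents.values())
    # Pull the two modalities' embeddings together in the shared space
    # (an assumed alignment term standing in for the joint objective).
    align_loss = F.mse_loss(latents["lidar"], latents["camera"])
    loss = pred_loss + align_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss

# At test time a single modality suffices, e.g. LiDAR only:
# pred = decoder(encoders["lidar"](lidar_feats))
```

The key design point is that both modalities share one decoder, so whichever encoder is available at test time produces a latent code the decoder already knows how to read.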
Related papers
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving [38.28159034562901]
Reason2Drive is a benchmark dataset with over 600K video-text pairs.
We characterize the autonomous driving process as a sequential combination of perception, prediction, and reasoning steps.
We introduce a novel aggregated evaluation metric to assess chain-based reasoning performance in autonomous systems.
arXiv Detail & Related papers (2023-12-06T18:32:33Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- End-to-end Autonomous Driving: Challenges and Frontiers [45.391430626264764]
We provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving.
We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others.
We discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework.
arXiv Detail & Related papers (2023-06-29T14:17:24Z)
- Domain Knowledge Driven Pseudo Labels for Interpretable Goal-Conditioned Interactive Trajectory Prediction [29.701029725302586]
We study the joint trajectory prediction problem with the goal-conditioned framework.
We introduce a conditional variational autoencoder (CVAE) based model to explicitly encode different interaction modes into the latent space.
We propose a novel approach to avoid KL vanishing and induce an interpretable interactive latent space with pseudo labels (see the sketch after this list).
arXiv Detail & Related papers (2022-03-28T21:41:21Z)
- Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction [71.97877759413272]
Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions.
Recent methods have achieved strong performance using Multi-Choice Learning objectives such as winner-takes-all (WTA) or best-of-many.
Our work addresses two key challenges in trajectory prediction: learning multimodal outputs, and improving predictions by imposing constraints derived from driving knowledge.
arXiv Detail & Related papers (2021-04-16T17:58:56Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic, requiring no human intervention.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Shared Cross-Modal Trajectory Prediction for Autonomous Driving [24.07872495811019]
We propose a Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities.
An extensive evaluation is conducted to show the efficacy of the proposed framework using two benchmark driving datasets.
arXiv Detail & Related papers (2020-11-15T07:18:50Z)
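As a companion to the "Domain Knowledge Driven Pseudo Labels" entry above, here is a minimal CVAE sketch in which the latent code is additionally supervised with pseudo interaction-mode labels so it stays informative rather than collapsing onto the prior (one common reading of "avoiding KL vanishing"). The dimensions, the pseudo-label head, and the loss weights are hypothetical illustrations, not that paper's implementation.

```python
# Minimal CVAE sketch with pseudo-label supervision on the latent
# (assumed formulation, not the cited paper's actual model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractionCVAE(nn.Module):
    def __init__(self, obs_dim=32, traj_dim=24, z_dim=8, n_modes=3):
        super().__init__()
        # q(z | observation, future): posterior over the interaction latent.
        self.post = nn.Linear(obs_dim + traj_dim, 2 * z_dim)
        # p(y | observation, z): decode a flattened future trajectory.
        self.dec = nn.Linear(obs_dim + z_dim, traj_dim)
        # Hypothetical head mapping z to pseudo interaction modes
        # (e.g., yield / pass / ignore, labeled by domain-knowledge rules).
        self.mode_head = nn.Linear(z_dim, n_modes)

    def forward(self, obs, fut):
        mu, logvar = self.post(torch.cat([obs, fut], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(torch.cat([obs, z], -1)), mu, logvar, self.mode_head(z)

def loss_fn(model, obs, fut, pseudo_mode, beta=0.5):
    recon, mu, logvar, mode_logits = model(obs, fut)
    rec = F.mse_loss(recon, fut)
    # KL divergence to a standard normal prior N(0, I).
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Classifying z against pseudo labels keeps the latent informative even
    # when the KL term would otherwise push the posterior onto the prior.
    mode = F.cross_entropy(mode_logits, pseudo_mode)
    return rec + beta * kl + mode

# At test time, sample z ~ N(0, I) and decode: model.dec(torch.cat([obs, z], -1))
```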
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.