Pedestrian 3D Bounding Box Prediction
- URL: http://arxiv.org/abs/2206.14195v1
- Date: Tue, 28 Jun 2022 17:59:45 GMT
- Title: Pedestrian 3D Bounding Box Prediction
- Authors: Saeed Saadatnejad, Yi Zhou Ju, Alexandre Alahi
- Abstract summary: We focus on 3D bounding boxes, which are reasonable estimates of humans without modeling complex motion details for autonomous vehicles.
We suggest this new problem and present a simple yet effective model for pedestrians' 3D bounding box prediction.
This method follows an encoder-decoder architecture based on recurrent neural networks.
- Score: 83.7135926821794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety is still the main issue of autonomous driving, and in order to be
globally deployed, autonomous vehicles need to predict pedestrians' motions sufficiently in
advance. While there is a lot of research on coarse-grained (human center
prediction) and fine-grained predictions (human body keypoints prediction), we
focus on 3D bounding boxes, which are reasonable estimates of humans without
modeling complex motion details for autonomous vehicles. This gives the
flexibility to predict in longer horizons in real-world settings. We suggest
this new problem and present a simple yet effective model for pedestrians' 3D
bounding box prediction. This method follows an encoder-decoder architecture
based on recurrent neural networks, and our experiments show its effectiveness
in both the synthetic (JTA) and real-world (NuScenes) datasets. The learned
representation has useful information to enhance the performance of other
tasks, such as action anticipation. Our code is available online:
https://github.com/vita-epfl/bounding-box-prediction
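The encoder-decoder idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes a box encoded as six numbers (center x, y, z and size w, h, l), a vanilla tanh RNN cell in place of whatever recurrent cell the paper uses, untrained random weights, and a decoder that predicts per-step offsets. The names `BoxSeq2Seq`, `BOX_DIM`, and `HIDDEN` are hypothetical.

```python
import math
import random

BOX_DIM = 6   # assumed box encoding: center (x, y, z) + size (w, h, l)
HIDDEN = 8    # hypothetical hidden-state size


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(w_in, w_h, x, h):
    """One vanilla (Elman) RNN step: h' = tanh(W_in x + W_h h)."""
    a = matvec(w_in, x)
    b = matvec(w_h, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]


class BoxSeq2Seq:
    """Hypothetical encoder-decoder over sequences of 3D boxes."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.enc_in = make_matrix(HIDDEN, BOX_DIM, rng)
        self.enc_h = make_matrix(HIDDEN, HIDDEN, rng)
        self.dec_in = make_matrix(HIDDEN, BOX_DIM, rng)
        self.dec_h = make_matrix(HIDDEN, HIDDEN, rng)
        self.out = make_matrix(BOX_DIM, HIDDEN, rng)

    def predict(self, observed, horizon):
        # Encoder: fold the observed boxes into a single hidden state.
        h = [0.0] * HIDDEN
        for box in observed:
            h = rnn_step(self.enc_in, self.enc_h, box, h)
        # Decoder: roll out autoregressively, feeding each prediction back in.
        box = list(observed[-1])
        future = []
        for _ in range(horizon):
            h = rnn_step(self.dec_in, self.dec_h, box, h)
            delta = matvec(self.out, h)  # predicted per-step offset
            box = [b + d for b, d in zip(box, delta)]
            future.append(box)
        return future


# Four observed boxes of a pedestrian moving along x (made-up values).
observed = [[0.5 * t, 0.0, 0.9, 0.6, 1.8, 0.6] for t in range(4)]
model = BoxSeq2Seq(seed=0)
future = model.predict(observed, horizon=3)
print(len(future), len(future[0]))  # prints "3 6"
```

Because the weights are random, the predicted boxes are meaningless; the sketch only shows the data flow from an observed box sequence, through an encoded hidden state, to a rolled-out future box sequence.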
Related papers
- Humanoid Locomotion as Next Token Prediction [84.21335675130021]
Our model is a causal transformer trained via autoregressive prediction of sensorimotor trajectories.
We show that our model enables a full-sized humanoid to walk in San Francisco zero-shot.
Our model can transfer to the real world even when trained on only 27 hours of walking data, and can generalize commands not seen during training like walking backward.
arXiv Detail & Related papers (2024-02-29T18:57:37Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction [63.3021778885906]
3D bounding boxes are a widespread intermediate representation in many computer vision applications.
We propose methods for leveraging our autoregressive model to make high confidence predictions and meaningful uncertainty measures.
We release a simulated dataset, COB-3D, which highlights new types of ambiguity that arise in real-world robotics applications.
arXiv Detail & Related papers (2022-10-13T23:57:40Z)
- SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving [63.87272273293804]
We propose a new LiDAR prediction framework based on generative models, namely Variational Recurrent Neural Networks (VRNNs).
Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames.
We present a sparse version of VRNNs and an effective self-supervised training method that does not require any labels.
arXiv Detail & Related papers (2021-02-19T11:56:44Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Pedestrian Intention Prediction: A Multi-task Perspective [83.7135926821794]
In order to be globally deployed, autonomous cars must guarantee the safety of pedestrians.
This work tries to solve this problem by jointly predicting the intention and visual states of pedestrians.
The method is a recurrent neural network in a multi-task learning approach.
arXiv Detail & Related papers (2020-10-20T13:42:31Z)
- A Real-Time Predictive Pedestrian Collision Warning Service for Cooperative Intelligent Transportation Systems Using 3D Pose Estimation [10.652350454373531]
We propose a real-time predictive pedestrian collision warning service (P2CWS) for two tasks: pedestrian orientation recognition (100.53 FPS) and intention prediction (35.76 FPS).
Our framework obtains satisfying generalization over multiple sites because of the proposed site-independent features.
The proposed vision framework realizes 89.3% accuracy in the behavior recognition task on the TUD dataset without any training process.
arXiv Detail & Related papers (2020-09-23T00:55:12Z)