Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting
- URL: http://arxiv.org/abs/2302.13130v3
- Date: Sun, 30 Apr 2023 20:19:38 GMT
- Title: Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting
- Authors: Tarasha Khurana, Peiyun Hu, David Held, Deva Ramanan
- Abstract summary: One promising self-supervised task is 3D point cloud forecasting from unannotated LiDAR sequences.
We show that this task requires algorithms to implicitly capture (1) sensor extrinsics (i.e., the egomotion of the autonomous vehicle), (2) sensor intrinsics (i.e., the sampling pattern specific to the particular LiDAR sensor), and (3) the shape and motion of other objects in the scene.
We render point cloud data from 4D occupancy predictions given sensor extrinsics and intrinsics, allowing one to train and test occupancy algorithms with unannotated LiDAR sequences.
- Score: 58.45661235893729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting how the world can evolve in the future is crucial for motion
planning in autonomous systems. Classical methods are limited because they rely
on costly human annotations (semantic class labels, bounding boxes, tracks, or
HD maps of cities) to plan motion, and are therefore difficult to scale to
large unlabeled datasets. One promising self-supervised
task is 3D point cloud forecasting from unannotated LiDAR sequences. We show
that this task requires algorithms to implicitly capture (1) sensor extrinsics
(i.e., the egomotion of the autonomous vehicle), (2) sensor intrinsics (i.e.,
the sampling pattern specific to the particular LiDAR sensor), and (3) the
shape and motion of other objects in the scene. But autonomous systems should
make predictions about the world and not their sensors. To this end, we factor
out (1) and (2) by recasting the task as one of spacetime (4D) occupancy
forecasting. But because it is expensive to obtain ground-truth 4D occupancy,
we render point cloud data from 4D occupancy predictions given sensor
extrinsics and intrinsics, allowing one to train and test occupancy algorithms
with unannotated LiDAR sequences. This also allows one to evaluate and compare
point cloud forecasting algorithms across diverse datasets, sensors, and
vehicles.
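The rendering step described in the abstract has a natural volume-rendering formulation: march each LiDAR ray through the predicted occupancy grid and compute the expected depth at which it terminates. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation; the function name, grid parameters, and defaults are all assumptions.

```python
# Hypothetical sketch of rendering LiDAR depths from a predicted occupancy
# grid via differentiable raycasting; not the authors' implementation.
import torch

def render_expected_depth(occ, origin, dirs, grid_min, voxel,
                          n_steps=128, step=0.5):
    """occ:      (X, Y, Z) occupancy probabilities in [0, 1] for one future
                 timestep of a spacetime (4D) prediction.
       origin:   (3,) sensor location in the world frame (extrinsics).
       dirs:     (R, 3) unit ray directions (intrinsics: the LiDAR
                 sampling pattern).
       grid_min: (3,) world coordinates of the grid's minimum corner.
       Returns:  (R,) expected depth per ray; the rendered point cloud
                 is then origin + depth[:, None] * dirs."""
    dev, dt = occ.device, occ.dtype
    d = torch.arange(1, n_steps + 1, device=dev, dtype=dt) * step  # (S,) depths
    pts = origin + d[None, :, None] * dirs[:, None, :]             # (R, S, 3)
    idx = ((pts - grid_min) / voxel).long()                        # voxel indices
    shape = torch.tensor(occ.shape, device=dev)
    inside = ((idx >= 0) & (idx < shape)).all(-1)                  # (R, S) mask
    idx = idx.clamp_min(0).minimum(shape - 1)                      # safe lookup
    o = occ[idx[..., 0], idx[..., 1], idx[..., 2]] * inside        # (R, S)
    # w_i: probability the ray passes samples 0..i-1 freely, then stops at i.
    free = torch.cumprod(1.0 - o, dim=1)
    w = torch.cat([o[:, :1], o[:, 1:] * free[:, :-1]], dim=1)      # (R, S)
    return (w * d).sum(1) / w.sum(1).clamp_min(1e-6)               # expected depth
```

Because every operation above is differentiable, the rendered depths can be penalized against observed LiDAR ranges (e.g., with an L1 loss), which is what allows occupancy algorithms to be trained and evaluated with unannotated LiDAR sequences.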
Related papers
- Robust 3D Semantic Occupancy Prediction with Calibration-free Spatial Transformation [32.50849425431012]
For autonomous cars equipped with multiple cameras and LiDAR, it is critical to aggregate multi-sensor information into a unified 3D space for accurate and robust predictions.
Recent methods are mainly built on the 2D-to-3D transformation that relies on sensor calibration to project the 2D image information into the 3D space.
In this work, we propose a calibration-free spatial transformation based on vanilla attention to implicitly model the spatial correspondence.
arXiv Detail & Related papers (2024-11-19T02:40:42Z)
- TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction [0.0]
Trajectory prediction is a critical component of autonomous driving systems.
This paper introduces TrajectoryNAS, a neural architecture search method for trajectory prediction from point cloud data.
arXiv Detail & Related papers (2024-03-18T11:48:41Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [64.7364925689825]
Argoverse 2 (AV2) is a collection of three datasets for perception and forecasting research in the self-driving domain.
The Lidar dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose.
The Motion Forecasting dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene.
arXiv Detail & Related papers (2023-01-02T00:36:22Z)
- Receding Moving Object Segmentation in 3D LiDAR Data Using Sparse 4D Convolutions [33.538055872850514]
We tackle the problem of distinguishing 3D LiDAR points that belong to currently moving objects, such as walking pedestrians or driving cars, from points obtained from non-moving objects, such as walls and parked cars.
Our approach takes a sequence of observed LiDAR scans and turns them into a voxelized sparse 4D point cloud (see the voxelization sketch after this list).
We apply computationally efficient sparse 4D convolutions to jointly extract spatial and temporal features and predict moving-object confidence scores for all points in the sequence.
arXiv Detail & Related papers (2022-06-08T18:51:14Z)
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
We extend DS-Net to 4D panoptic LiDAR segmentation via temporally unified instance clustering on aligned LiDAR frames.
DS-Net achieves superior accuracy over current state-of-the-art methods in both tasks.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
- IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor and dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting [106.3504366501894]
Self-driving vehicles and robotic manipulation systems often forecast future object poses by first detecting and tracking objects.
This detect-then-forecast pipeline is expensive to scale, as pose forecasting algorithms typically require labeled sequences of object poses.
We propose to first forecast 3D sensor data and then detect/track objects on the predicted point cloud sequences to obtain future poses.
This makes it less expensive to scale pose forecasting, as the sensor data forecasting task requires no labels.
arXiv Detail & Related papers (2020-03-18T17:54:28Z)
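As referenced in the sparse 4D convolutions entry above, the voxelization step that turns a LiDAR sequence into a sparse spacetime grid can be sketched compactly. The snippet below is a hypothetical illustration, not the paper's code; the function name and voxel size are assumptions, and the per-voxel point count stands in as a simple input feature.

```python
# Hypothetical sketch of voxelizing a LiDAR sequence into a sparse 4D grid;
# not the paper's code.
import torch

def voxelize_4d(scans, voxel=0.1):
    """scans: list of T ego-motion-aligned LiDAR scans, each an (N_t, 3)
       float tensor in a common world frame.
       Returns unique (x, y, z, t) voxel coordinates as an (M, 4) long
       tensor plus per-voxel point counts as an (M, 1) feature."""
    coords = []
    for t, pts in enumerate(scans):
        xyz = torch.floor(pts / voxel).long()                     # spatial bins
        tt = torch.full((xyz.shape[0], 1), t,
                        dtype=torch.long, device=pts.device)      # time index
        coords.append(torch.cat([xyz, tt], dim=1))                # (N_t, 4)
    coords = torch.cat(coords, dim=0)
    # Deduplicate so each occupied spacetime voxel appears exactly once.
    uniq, counts = torch.unique(coords, dim=0, return_counts=True)
    return uniq, counts.to(torch.float32).unsqueeze(1)
```

The resulting coordinate/feature pair is the kind of input a sparse-convolution library (e.g., MinkowskiEngine with dimension=4, after prepending a batch index to the coordinates) consumes; a sparse 4D convolution then applies its spacetime kernel only at occupied voxels, which keeps the computation tractable.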
This list is automatically generated from the titles and abstracts of the papers on this site.