Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud
Forecasting for Sequential Pose Forecasting
- URL: http://arxiv.org/abs/2003.08376v3
- Date: Sat, 7 Nov 2020 02:48:22 GMT
- Title: Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud
Forecasting for Sequential Pose Forecasting
- Authors: Xinshuo Weng and Jianren Wang and Sergey Levine and Kris Kitani and
Nicholas Rhinehart
- Abstract summary: Self-driving vehicles and robotic manipulation systems often forecast future object poses by first detecting and tracking objects.
This detect-then-forecast pipeline is expensive to scale, as pose forecasting algorithms typically require labeled sequences of object poses.
We propose to first forecast 3D sensor data and then detect/track objects on the predicted point cloud sequences to obtain future poses.
This makes it less expensive to scale pose forecasting, as the sensor data forecasting task requires no labels.
- Score: 106.3504366501894
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many autonomous systems forecast aspects of the future in order to aid
decision-making. For example, self-driving vehicles and robotic manipulation
systems often forecast future object poses by first detecting and tracking
objects. However, this detect-then-forecast pipeline is expensive to scale, as
pose forecasting algorithms typically require labeled sequences of object
poses, which are costly to obtain in 3D space. Can we scale performance without
requiring additional labels? We hypothesize yes, and propose inverting the
detect-then-forecast pipeline. Instead of detecting, tracking and then
forecasting the objects, we propose to first forecast 3D sensor data (e.g.,
point clouds with $100$k points) and then detect/track objects on the predicted
point cloud sequences to obtain future poses, i.e., a forecast-then-detect
pipeline. This inversion makes it less expensive to scale pose forecasting, as
the sensor data forecasting task requires no labels. Part of this work's focus
is on the challenging first step -- Sequential Pointcloud Forecasting (SPF),
for which we also propose an effective approach, SPFNet. To compare our
forecast-then-detect pipeline relative to the detect-then-forecast pipeline, we
propose an evaluation procedure and two metrics. Through experiments on a
robotic manipulation dataset and two driving datasets, we show that SPFNet is
effective for the SPF task, our forecast-then-detect pipeline outperforms the
detect-then-forecast approaches to which we compared, and that pose forecasting
performance improves with the addition of unlabeled data.
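The control flow of the inverted pipeline described in the abstract can be sketched in a few lines. This is only a toy illustration under loud assumptions: the forecaster below is a hypothetical constant-displacement model standing in for SPFNet, the "detector" simply returns the cloud centroid as an object position, and every function name is invented for this sketch. None of it is the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Cloud = List[Point]


def mean_point(cloud: Cloud) -> Point:
    """Average of all points in a cloud."""
    n = len(cloud)
    return (
        sum(p[0] for p in cloud) / n,
        sum(p[1] for p in cloud) / n,
        sum(p[2] for p in cloud) / n,
    )


def forecast_clouds(past: List[Cloud], horizon: int) -> List[Cloud]:
    """Hypothetical stand-in for a learned point cloud forecaster.

    Shifts the last observed cloud by the mean displacement between the
    last two observed frames. It uses raw sensor data only, no pose labels.
    """
    prev_mean, last_mean = mean_point(past[-2]), mean_point(past[-1])
    step = tuple(b - a for a, b in zip(prev_mean, last_mean))
    return [
        [(x + k * step[0], y + k * step[1], z + k * step[2])
         for (x, y, z) in past[-1]]
        for k in range(1, horizon + 1)
    ]


def detect(cloud: Cloud) -> Point:
    """Hypothetical stand-in for a 3D detector: one object at the centroid."""
    return mean_point(cloud)


def forecast_then_detect(past: List[Cloud], horizon: int) -> List[Point]:
    """Forecast future sensor data first, then detect on the predictions."""
    return [detect(cloud) for cloud in forecast_clouds(past, horizon)]


# Two observed frames of a two-point "object" moving +1 along x per frame.
past_frames = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
future_positions = forecast_then_detect(past_frames, horizon=2)
print(future_positions)  # [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
```

The point of the ordering is that only `detect` would need labeled data in a real system; the forecasting step can be trained on unlabeled sensor sequences, which is the scaling argument the abstract makes.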
Related papers
- DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state-of-the-art on Argoverse 2 Sensor and Waymo Open Dataset.
arXiv Detail & Related papers (2024-06-06T18:12:04Z)
- Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection [9.053936905556204]
We propose a model called DAP (Detection After Prediction), consisting of a two-branch network.
The features predicting the current objects from branch (i) are fused into branch (ii) to transfer predictive knowledge.
Our model can be used plug-and-play, showing consistent performance gain.
arXiv Detail & Related papers (2024-04-02T02:20:47Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z)
- Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting [58.45661235893729]
One promising self-supervised task is 3D point cloud forecasting from unannotated LiDAR sequences.
We show that this task requires algorithms to implicitly capture (1) sensor extrinsics (i.e., the egomotion of the autonomous vehicle), (2) sensor intrinsics (i.e., the sampling pattern specific to the particular LiDAR sensor), and (3) the shape and motion of other objects in the scene.
We render point cloud data from 4D occupancy predictions given sensor extrinsics and intrinsics, allowing one to train and test occupancy algorithms with unannotated LiDAR sequences.
arXiv Detail & Related papers (2023-02-25T18:12:37Z)
- Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking with Transformer [62.68401838976208]
3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template.
Motivated by the success of transformers, we propose Point Tracking TRansformer (PTTR), which efficiently predicts high-quality 3D tracking results.
arXiv Detail & Related papers (2022-08-10T08:36:46Z)
- Forecasting from LiDAR via Future Object Detection [47.11167997187244]
We propose an end-to-end approach for detection and motion forecasting based on raw sensor measurement.
By linking future and current locations in a many-to-one manner, our approach is able to reason about multiple futures.
arXiv Detail & Related papers (2022-03-30T13:40:28Z)
- FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data [31.29499654765994]
Real-time forecasting can be conducted in two steps: first, we specify the part of data to be focused on and the measure to be predicted by slicing, dicing, and aggregating the data.
A natural idea is to utilize sampling to obtain approximate aggregations in real time as the input to train the forecasting model.
We introduce a new sampling scheme, called GSW sampling, and analyze error bounds for estimating aggregations using GSW samples.
arXiv Detail & Related papers (2021-01-09T06:23:13Z)
- STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction [24.855059537779294]
We present a novel end-to-end two-stage network: Spatio-Temporal-Interactive Network (STINet).
In addition to 3D geometry of pedestrians, we model temporal information for each of the pedestrians.
Our method predicts both current and past locations in the first stage, so that each pedestrian can be linked across frames.
arXiv Detail & Related papers (2020-05-08T18:43:01Z)