SpOT: Spatiotemporal Modeling for 3D Object Tracking
- URL: http://arxiv.org/abs/2207.05856v1
- Date: Tue, 12 Jul 2022 21:45:49 GMT
- Title: SpOT: Spatiotemporal Modeling for 3D Object Tracking
- Authors: Colton Stearns, Davis Rempe, Jie Li, Rares Ambrus, Sergey Zakharov,
Vitor Guizilini, Yanchao Yang, Leonidas J Guibas
- Abstract summary: 3D multi-object tracking aims to consistently identify all mobile entities through time.
Current 3D tracking methods rely on abstracted information and limited history.
We develop a holistic representation of scenes that leverages both spatial and temporal information.
- Score: 68.12017780034044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D multi-object tracking aims to uniquely and consistently identify all
mobile entities through time. Despite the rich spatiotemporal information
available in this setting, current 3D tracking methods primarily rely on
abstracted information and limited history, e.g. single-frame object bounding
boxes. In this work, we develop a holistic representation of traffic scenes
that leverages both spatial and temporal information of the actors in the
scene. Specifically, we reformulate tracking as a spatiotemporal problem by
representing tracked objects as sequences of time-stamped points and bounding
boxes over a long temporal history. At each timestamp, we improve the location
and motion estimates of our tracked objects through learned refinement over the
full sequence of object history. By considering time and space jointly, our
representation naturally encodes fundamental physical priors such as object
permanence and consistency across time. Our spatiotemporal tracking framework
achieves state-of-the-art performance on the Waymo and nuScenes benchmarks.
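The abstract's core idea, representing each tracked object as a sequence of time-stamped boxes over a long history rather than a single-frame state, can be illustrated with a minimal sketch. This is an illustrative reconstruction, not the paper's implementation: the class names, fields, and the constant-velocity placeholder (standing in for the paper's learned refinement over the full object history) are all assumptions.

```python
# Hypothetical sketch of a SpOT-style track state: each tracked object keeps
# a bounded history of time-stamped bounding boxes (the paper also stores
# object points). Names and fields are illustrative, not from the paper.
from collections import deque
from dataclasses import dataclass, field

@dataclass
class TimedBox:
    t: float        # timestamp in seconds
    center: tuple   # (x, y, z) box center
    size: tuple     # (length, width, height)
    yaw: float      # heading angle in radians

@dataclass
class Track:
    track_id: int
    # Long temporal history; maxlen bounds memory per track.
    history: deque = field(default_factory=lambda: deque(maxlen=16))

    def update(self, box: TimedBox) -> None:
        """Append the newest detection; in the paper, a learned module would
        then refine location and motion over the full stored sequence."""
        self.history.append(box)

    def constant_velocity_estimate(self, t_query: float):
        """Placeholder motion estimate from the last two boxes, a stand-in
        for the paper's learned sequence refinement."""
        if len(self.history) < 2:
            return self.history[-1].center if self.history else None
        a, b = self.history[-2], self.history[-1]
        dt = b.t - a.t
        velocity = tuple((bc - ac) / dt for ac, bc in zip(a.center, b.center))
        return tuple(c + v * (t_query - b.t) for c, v in zip(b.center, velocity))
```

Keeping time and space together in one sequence is what lets the representation encode priors like object permanence: a track persists (and can still be extrapolated) across frames where the detector misses the object.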
Related papers
- Delving into Motion-Aware Matching for Monocular 3D Object Tracking [81.68608983602581]
We find that the motion cues of objects across different time frames are critical in 3D multi-object tracking.
We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate our MoMA-M3T achieves competitive performance against state-of-the-art methods.
arXiv Detail & Related papers (2023-08-22T17:53:58Z)
- PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking [90.29143475328506]
We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework.
Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion.
We animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos.
arXiv Detail & Related papers (2023-07-27T17:58:11Z)
- TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses [51.60422927416087]
3D multi-object tracking (MOT) is vital for many applications including autonomous driving vehicles and service robots.
We present TrajectoryFormer, a novel point-cloud-based 3D MOT framework.
arXiv Detail & Related papers (2023-06-09T13:31:50Z)
- 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation [0.0]
3D-FCT is a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking.
Our proposed method is evaluated on the KITTI tracking dataset where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
arXiv Detail & Related papers (2021-10-06T06:36:29Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Tracking from Patterns: Learning Corresponding Patterns in Point Clouds for 3D Object Tracking [34.40019455462043]
We propose to learn 3D object correspondences from temporal point cloud data and infer the motion information from correspondence patterns.
Our method exceeds existing 3D tracking methods on both the KITTI and the larger-scale nuScenes datasets.
arXiv Detail & Related papers (2020-10-20T06:07:20Z)
- A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in the human skeleton by modeling local and global spatial information via attention mechanisms.
Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper body estimation, and achieves competitive performance on 2D-to-3D video pose estimation.
arXiv Detail & Related papers (2020-03-11T14:54:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.