PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point
Tracking
- URL: http://arxiv.org/abs/2307.15055v1
- Date: Thu, 27 Jul 2023 17:58:11 GMT
- Title: PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point
Tracking
- Authors: Yang Zheng and Adam W. Harley and Bokui Shen and Gordon Wetzstein and
Leonidas J. Guibas
- Abstract summary: We introduce PointOdyssey, a large-scale synthetic dataset and data generation framework.
Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion.
We animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos.
- Score: 90.29143475328506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce PointOdyssey, a large-scale synthetic dataset and data
generation framework for the training and evaluation of long-term fine-grained
tracking algorithms. Our goal is to advance the state-of-the-art by placing
emphasis on long videos with naturalistic motion. Toward the goal of
naturalism, we animate deformable characters using real-world motion capture
data, we build 3D scenes to match the motion capture environments, and we
render camera viewpoints using trajectories mined via structure-from-motion on
real videos. We create combinatorial diversity by randomizing character
appearance, motion profiles, materials, lighting, 3D assets, and atmospheric
effects. Our dataset currently includes 104 videos, averaging 2,000 frames
long, with orders of magnitude more correspondence annotations than prior work.
We show that existing methods can be trained from scratch on our dataset and
outperform their published variants. Finally, we introduce modifications to the
PIPs point tracking method, greatly widening its temporal receptive field,
which improves its performance on PointOdyssey as well as on two real-world
benchmarks. Our data and code are publicly available at:
https://pointodyssey.com
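
As a concrete illustration of what long-term fine-grained tracking supervision looks like, here is a minimal sketch of storing and scoring long-range point-track annotations. The array names, shapes, and the 8-pixel threshold are assumptions made for illustration, not PointOdyssey's actual file format or official metric.

```python
import numpy as np

# Hypothetical annotation layout for one long video (names and shapes are
# assumed for illustration, not PointOdyssey's actual format):
#   trajs:  (T, N, 2) float32 -- (x, y) pixel position of each of N tracked
#                                points in each of T frames
#   visibs: (T, N)    bool    -- per-frame visibility (False when occluded)
T, N = 2000, 256
rng = np.random.default_rng(0)
trajs = rng.uniform(0.0, 512.0, size=(T, N, 2)).astype(np.float32)
visibs = rng.random((T, N)) > 0.2

def accuracy_at_threshold(pred, gt, vis, thresh=8.0):
    """Fraction of visible ground-truth points whose predicted position is
    within `thresh` pixels, pooled over all frames and tracks."""
    err = np.linalg.norm(pred - gt, axis=-1)   # (T, N) per-point pixel error
    correct = (err < thresh) & vis             # only score visible points
    return correct.sum() / max(vis.sum(), 1)

# Score a noisy copy of the ground truth as a stand-in "prediction".
pred = trajs + rng.normal(0.0, 4.0, size=trajs.shape).astype(np.float32)
print(f"accuracy@8px = {accuracy_at_threshold(pred, trajs, visibs):.3f}")
```

At roughly 2,000 frames per video, a metric of this shape stresses whether a tracker survives occlusions and re-appearances, which is the long-range regime the dataset targets.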
Related papers
- TAPVid-3D: A Benchmark for Tracking Any Point in 3D [63.060421798990845] (2024-07-08)
We introduce a new benchmark, TAPVid-3D, for evaluating the task of Tracking Any Point in 3D.
This benchmark will serve as a guidepost to improve our ability to understand precise 3D motion and surface deformation from monocular video.
- Fast Encoder-Based 3D from Casual Videos via Point Track Processing [22.563073026889324] (2024-04-10)
We present TracksTo4D, a learning-based approach that infers 3D structure and camera positions from the dynamic content of casual videos (a classical baseline for this tracks-to-3D task is sketched after this list).
TracksTo4D is trained in an unsupervised way on a dataset of casual videos.
Experiments show that TracksTo4D can reconstruct a temporal point cloud and the camera positions of the underlying video with accuracy comparable to state-of-the-art methods.
- Tracking by 3D Model Estimation of Unknown Objects in Videos [122.56499878291916] (2023-04-13)
We argue that per-frame 2D representations are limited and instead propose to guide and improve 2D tracking with an explicit object representation.
Our representation tackles a complex long-term dense correspondence problem between all 3D points on the object for all video frames.
The proposed optimization minimizes a novel loss function to estimate the best 3D shape, texture, and 6DoF pose.
- TAP-Vid: A Benchmark for Tracking Any Point in a Video [84.94877216665793] (2022-11-07)
We formalize the problem of tracking arbitrary physical points on surfaces over longer video clips, naming it tracking any point (TAP); an interface sketch of this formulation appears after this list.
We introduce a companion benchmark, TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks and synthetic videos with perfect ground-truth point tracks.
We propose a simple end-to-end point tracking model, TAP-Net, showing that it outperforms all prior methods on our benchmark when trained on synthetic data.
- SpOT: Spatiotemporal Modeling for 3D Object Tracking [68.12017780034044] (2022-07-12)
3D multi-object tracking aims to consistently identify all mobile entities through time.
Current 3D tracking methods rely on abstracted information and limited history.
We develop a holistic representation of scenes that leverages both spatial and temporal information.
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191] (2021-03-12)
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
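
For the tracks-to-3D problem that TracksTo4D addresses, the classical rigid-scene point of reference is Tomasi-Kanade factorization, which recovers camera motion and 3D structure from a matrix of 2D point tracks via a rank-3 decomposition. The sketch below is that textbook baseline, not the paper's learned method, and assumes noise-free, fully visible tracks of a rigid scene.

```python
import numpy as np

def factorize_tracks(tracks):
    """Tomasi-Kanade orthographic factorization (classical baseline, not
    TracksTo4D itself). tracks: (T, N, 2) array of N point tracks over T
    frames. Returns motion (2T, 3) and structure (3, N) with
    W ~= motion @ structure, up to an affine ambiguity (the metric-upgrade
    step is omitted for brevity)."""
    # Stack x-rows then y-rows into the 2T x N measurement matrix and
    # center each row to remove per-frame translation.
    W = np.concatenate([tracks[..., 0], tracks[..., 1]], axis=0)
    W = W - W.mean(axis=1, keepdims=True)
    # A rank-3 truncated SVD splits W into camera motion and 3D structure.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    motion = U[:, :3] * np.sqrt(s[:3])
    structure = np.sqrt(s[:3])[:, None] * Vt[:3]
    return motion, structure

# Demo on noise-free synthetic rigid data: 40 points, 12 orthographic views.
rng = np.random.default_rng(1)
S = rng.normal(size=(3, 40))
tracks = np.empty((12, 40, 2))
for t in range(12):
    R = np.linalg.qr(rng.normal(size=(3, 3)))[0][:2]  # random 2x3 projection
    tracks[t] = (R @ S).T
motion, structure = factorize_tracks(tracks)
W = np.concatenate([tracks[..., 0], tracks[..., 1]], axis=0)
W = W - W.mean(axis=1, keepdims=True)
print("max rank-3 residual:", np.abs(motion @ structure - W).max())  # ~0
```

TracksTo4D's contribution, by contrast, is to learn this mapping from data, which is what lets it handle the non-rigid, dynamic content of casual videos.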
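The TAP formalization in the TAP-Vid entry above reduces to a simple interface: a query is a (frame, x, y) triple, and the tracker must output a position and a visibility flag for every frame of the clip. Below is a minimal sketch of that contract; the type and function names are invented for illustration and are not TAP-Net's actual API.

```python
from dataclasses import dataclass

@dataclass
class TapQuery:
    frame: int   # frame index in which the query point is specified
    x: float     # pixel coordinates of the point in that frame
    y: float

@dataclass
class TapPrediction:
    xys: list[tuple[float, float]]  # predicted (x, y) for every frame
    visible: list[bool]             # predicted per-frame visibility flag

def track_any_point(video, query: TapQuery) -> TapPrediction:
    """Trivial stand-in tracker: a real model (e.g. TAP-Net) regresses the
    point's motion; this stub just repeats the query position everywhere."""
    n = len(video)
    return TapPrediction(xys=[(query.x, query.y)] * n, visible=[True] * n)

# Example: query pixel (64, 48) in frame 0 of a 10-frame clip.
pred = track_any_point([None] * 10, TapQuery(frame=0, x=64.0, y=48.0))
print(len(pred.xys), pred.visible[0])
```

Because the benchmark pairs human-annotated tracks on real videos with perfect ground-truth tracks on synthetic ones, the same interface can be scored in both settings.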