DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos
- URL: http://arxiv.org/abs/2312.09523v1
- Date: Fri, 15 Dec 2023 04:06:52 GMT
- Title: DriveTrack: A Benchmark for Long-Range Point Tracking in Real-World Videos
- Authors: Arjun Balasingam, Joseph Chandler, Chenning Li, Zhoutong Zhang, Hari Balakrishnan
- Abstract summary: DriveTrack is a new benchmark and data generation framework for keypoint tracking in real-world videos.
We release a dataset consisting of 1 billion point tracks across 24 hours of video, which is seven orders of magnitude greater than prior real-world benchmarks.
We show that fine-tuning keypoint trackers on DriveTrack improves accuracy on real-world scenes by up to 7%.
- Score: 9.304179915575114
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents DriveTrack, a new benchmark and data generation framework
for long-range keypoint tracking in real-world videos. DriveTrack is motivated
by the observation that the accuracy of state-of-the-art trackers depends
strongly on visual attributes around the selected keypoints, such as texture
and lighting. The problem is that these artifacts are especially pronounced in
real-world videos, but these trackers are unable to train on such scenes due to
a dearth of annotations. DriveTrack bridges this gap by building a framework to
automatically annotate point tracks on autonomous driving datasets. We release
a dataset consisting of 1 billion point tracks across 24 hours of video, which
is seven orders of magnitude greater than prior real-world benchmarks and on
par with the scale of synthetic benchmarks. DriveTrack unlocks new use cases
for point tracking in real-world videos. First, we show that fine-tuning
keypoint trackers on DriveTrack improves accuracy on real-world scenes by up to
7%. Second, we analyze the sensitivity of trackers to visual artifacts in real
scenes and motivate the idea of running assistive keypoint selectors alongside
trackers.
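The accuracy numbers above presumably come from standard point-tracking metrics. As a concrete reference, here is a minimal sketch of the average position accuracy used by TAP-style benchmarks: the fraction of visible points whose predicted location lands within a set of pixel thresholds, averaged over thresholds. Array names and shapes are illustrative assumptions, not DriveTrack's actual evaluation API.

```python
import numpy as np

def average_position_accuracy(pred, gt, visible, thresholds=(1, 2, 4, 8, 16)):
    """Fraction of visible points within each pixel threshold of ground
    truth, averaged over thresholds (the position-accuracy metric used by
    TAP-style point-tracking benchmarks).

    pred, gt: (num_tracks, num_frames, 2) arrays of (x, y) pixel positions.
    visible:  (num_tracks, num_frames) boolean ground-truth visibility mask.
    """
    err = np.linalg.norm(pred - gt, axis=-1)          # per-point pixel error
    accs = [(err[visible] < t).mean() for t in thresholds]
    return float(np.mean(accs))

# Illustrative usage with random data (shapes are assumptions):
rng = np.random.default_rng(0)
gt = rng.uniform(0, 256, size=(100, 50, 2))
pred = gt + rng.normal(0, 3, size=gt.shape)           # tracker with ~3 px noise
visible = rng.random((100, 50)) > 0.2
print(average_position_accuracy(pred, gt, visible))
```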
Related papers
- PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking [90.29143475328506]
We introduce PointOdyssey, a large-scale synthetic dataset and data generation framework.
Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion.
We animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos.
arXiv Detail & Related papers (2023-07-27T17:58:11Z)
- Tracking Anything in High Quality [63.63653185865726]
HQTrack is a framework for High Quality Tracking anything in videos.
It consists of a video multi-object segmenter (VMOS) and a mask refiner (MR).
arXiv Detail & Related papers (2023-07-26T06:19:46Z)
- CoTracker: It is Better to Track Together [70.63040730154984]
CoTracker is a transformer-based model that tracks a large number of 2D points in long video sequences.
We show that joint tracking significantly improves tracking accuracy and robustness, and allows CoTracker to track occluded points and points outside of the camera view.
arXiv Detail & Related papers (2023-07-14T21:13:04Z)
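For a sense of how a tracker like CoTracker is invoked in practice, below is a hedged sketch of the PyTorch Hub usage advertised in the project's repository. The entry-point name, call signature, and output shapes are written from recollection of the public README and should be verified against facebookresearch/co-tracker before use.

```python
import torch

# Assumption: the co-tracker repo exposes a torch.hub entry point named
# "cotracker2"; verify against github.com/facebookresearch/co-tracker.
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker2")

# Dummy clip: (batch, frames, channels, height, width), float in [0, 255].
video = torch.zeros(1, 48, 3, 384, 512)

# Track a regular grid of points jointly across the whole clip.
pred_tracks, pred_visibility = cotracker(video, grid_size=10)
print(pred_tracks.shape)       # expected: (1, 48, grid_size**2, 2)
print(pred_visibility.shape)   # expected: (1, 48, grid_size**2)
```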
- TAP-Vid: A Benchmark for Tracking Any Point in a Video [84.94877216665793]
We formalize the problem of tracking arbitrary physical points on surfaces over longer video clips, naming it tracking any point (TAP).
We introduce a companion benchmark, TAP-Vid, which is composed of both real-world videos with accurate human annotations of point tracks, and synthetic videos with perfect ground-truth point tracks.
We propose a simple end-to-end point tracking model TAP-Net, showing that it outperforms all prior methods on our benchmark when trained on synthetic data.
arXiv Detail & Related papers (2022-11-07T17:57:02Z)
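TAP-Vid's headline metric folds position and occlusion accuracy into an average Jaccard score. The sketch below is a minimal rendition of that idea, assuming prediction and ground-truth arrays as inputs; it is written in the spirit of the benchmark rather than copied from its evaluation code.

```python
import numpy as np

def average_jaccard(pred_xy, pred_vis, gt_xy, gt_vis, thresholds=(1, 2, 4, 8, 16)):
    """Occlusion-aware average Jaccard in the spirit of TAP-Vid.

    pred_xy, gt_xy:   (tracks, frames, 2) pixel positions.
    pred_vis, gt_vis: (tracks, frames) boolean visibility.
    """
    err = np.linalg.norm(pred_xy - gt_xy, axis=-1)
    scores = []
    for t in thresholds:
        within = err < t
        tp = (pred_vis & gt_vis & within).sum()     # visible, predicted visible, close enough
        fp = (pred_vis & ~(gt_vis & within)).sum()  # predicted visible but wrong
        fn = (gt_vis & ~(pred_vis & within)).sum()  # visible but missed
        scores.append(tp / max(tp + fp + fn, 1))
    return float(np.mean(scores))
```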
- LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search [104.84999119090887]
We present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers.
Comprehensive experiments show that LightTrack is effective: it finds trackers that outperform handcrafted SOTA trackers such as SiamRPN++ and Ocean.
arXiv Detail & Related papers (2021-04-29T17:55:24Z)
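As a rough illustration of the one-shot search idea (not LightTrack's actual code), the sketch below enumerates backbone candidates from a toy search space and keeps those under a parameter budget; in a real one-shot pipeline each surviving candidate would then be scored with weights inherited from a shared supernet rather than trained from scratch.

```python
from itertools import product

# Toy search space (an assumption for illustration): per-stage depth,
# channel width, and kernel size choices.
DEPTHS, WIDTHS, KERNELS = (2, 3, 4), (32, 48, 64), (3, 5, 7)

def approx_params(depth, width, kernel):
    # Crude parameter estimate for a stack of conv blocks.
    return depth * (width * kernel * kernel + width * width)

budget = 20_000  # parameter budget for a "lightweight" tracker backbone
candidates = [
    cfg for cfg in product(DEPTHS, WIDTHS, KERNELS)
    if approx_params(*cfg) <= budget
]
print(f"{len(candidates)} candidate backbones fit the budget")
# One-shot NAS would now rank `candidates` using supernet-inherited weights
# instead of training each architecture from scratch.
```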
- Benchmarking Deep Trackers on Aerial Videos [5.414308305392762]
In this paper, we compare ten deep-learning-based trackers on four aerial datasets.
We choose top-performing trackers utilizing different approaches, specifically tracking by detection, discriminative correlation filters, Siamese networks, and reinforcement learning.
Our findings indicate that the trackers perform significantly worse in aerial datasets compared to standard ground level videos.
arXiv Detail & Related papers (2021-03-24T01:45:19Z)
- Tracking Objects as Points [83.9217787335878]
We present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art.
Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame.
CenterTrack is simple, online (no peeking into the future), and real-time.
arXiv Detail & Related papers (2020-04-02T17:58:40Z)
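CenterTrack's tracking step reduces to associating detected object centers across adjacent frames (the model also predicts a per-object center offset to compensate for motion). Below is a minimal greedy association sketch in that spirit; the function name and distance gate are illustrative assumptions.

```python
import numpy as np

def greedy_associate(prev_centers, curr_centers, max_dist=50.0):
    """Greedily match current detections to previous ones by center distance.

    prev_centers, curr_centers: (N, 2) and (M, 2) arrays of (x, y) centers.
    Returns a list of (curr_idx, prev_idx) matches; unmatched current
    detections start new tracks.
    """
    if len(prev_centers) == 0 or len(curr_centers) == 0:
        return []
    dists = np.linalg.norm(curr_centers[:, None] - prev_centers[None, :], axis=-1)
    matches, used_curr, used_prev = [], set(), set()
    # Visit candidate pairs from closest to farthest.
    for ci, pi in sorted(np.ndindex(*dists.shape), key=lambda ij: dists[ij]):
        if dists[ci, pi] > max_dist:
            break  # remaining pairs are even farther apart
        if ci in used_curr or pi in used_prev:
            continue
        matches.append((ci, pi))
        used_curr.add(ci)
        used_prev.add(pi)
    return matches

prev = np.array([[10.0, 10.0], [100.0, 50.0]])
curr = np.array([[12.0, 11.0], [200.0, 200.0]])
print(greedy_associate(prev, curr))  # [(0, 0)]; detection 1 starts a new track
```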