Learning to Track Objects from Unlabeled Videos
- URL: http://arxiv.org/abs/2108.12711v1
- Date: Sat, 28 Aug 2021 22:10:06 GMT
- Title: Learning to Track Objects from Unlabeled Videos
- Authors: Jilai Zheng, Chao Ma, Houwen Peng and Xiaokang Yang
- Abstract summary: In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.
To narrow the gap between unsupervised trackers and supervised counterparts, we propose an effective unsupervised learning approach composed of three stages.
Experiments show that the proposed USOT, learned from unlabeled videos, outperforms state-of-the-art unsupervised trackers by large margins.
- Score: 63.149201681380305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose to learn an Unsupervised Single Object Tracker
(USOT) from scratch. We identify that three major challenges, i.e., moving
object discovery, rich temporal variation exploitation, and online update, are
the central causes of the performance bottleneck of existing unsupervised
trackers. To narrow the gap between unsupervised trackers and supervised
counterparts, we propose an effective unsupervised learning approach composed
of three stages. First, we sample sequentially moving objects with unsupervised
optical flow and dynamic programming, instead of random cropping. Second, we
train a naive Siamese tracker from scratch using single-frame pairs. Third, we
continue training the tracker with a novel cycle memory learning scheme, which
is conducted in longer temporal spans and also enables our tracker to update
online. Extensive experiments show that the proposed USOT learned from
unlabeled videos outperforms state-of-the-art unsupervised trackers
by large margins, and performs on par with recent supervised deep trackers. Code is
available at https://github.com/VISION-SJTU/USOT.
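The first stage in the abstract, picking a moving object across frames with dynamic programming, can be illustrated with a generic Viterbi-style sketch. This is not the authors' implementation: the abstract does not give their formulation, so the candidate boxes, the per-box motion scores (imagined here as coming from an optical-flow map), the IoU smoothness term, and the names `iou`, `select_box_sequence`, and `smooth_weight` are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_box_sequence(
    candidates: List[List[Tuple[Box, float]]],
    smooth_weight: float = 1.0,
) -> List[int]:
    """Pick one candidate index per frame.

    Maximizes the sum of per-box motion scores plus
    smooth_weight * IoU between boxes chosen in consecutive frames.
    The scoring function is a hypothetical choice, not taken from the paper.
    """
    if not candidates or any(len(frame) == 0 for frame in candidates):
        raise ValueError("every frame needs at least one candidate box")

    # best[i]: best total score of any path ending at candidate i of the current frame
    best = [score for _, score in candidates[0]]
    back: List[List[int]] = []  # back[t-1][i]: best predecessor of candidate i in frame t

    for t in range(1, len(candidates)):
        new_best, pointers = [], []
        for box, score in candidates[t]:
            options = [
                best[j] + smooth_weight * iou(prev_box, box)
                for j, (prev_box, _) in enumerate(candidates[t - 1])
            ]
            j_star = max(range(len(options)), key=options.__getitem__)
            new_best.append(options[j_star] + score)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Backtrack from the best final candidate to recover the whole path.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path


# Toy input: three frames, two hypothetical candidates each, as (box, motion score).
frames = [
    [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.5)],
    [((52, 52, 62, 62), 0.6), ((1, 1, 11, 11), 0.8)],
    [((2, 2, 12, 12), 0.7), ((80, 80, 90, 90), 0.75)],
]
smooth_path = select_box_sequence(frames)
greedy_path = select_box_sequence(frames, smooth_weight=0.0)
```

With the smoothness term enabled, the selected path follows the box that drifts slowly from the top-left corner. With `smooth_weight=0.0` the selection reduces to taking the highest-scoring box in each frame independently, which jumps to an unrelated box in the last frame.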
Related papers
- Tracking with Human-Intent Reasoning [64.69229729784008]
This work proposes a new tracking task -- Instruction Tracking.
It involves providing implicit tracking instructions that require the trackers to perform tracking automatically in video frames.
TrackGPT is capable of performing complex reasoning-based tracking.
arXiv Detail & Related papers (2023-12-29T03:22:18Z)
- An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z)
- AttTrack: Online Deep Attention Transfer for Multi-object Tracking [4.5116674432168615]
Multi-object tracking (MOT) is a vital component of intelligent video analytics applications such as surveillance and autonomous driving.
In this paper, we aim to accelerate MOT by transferring the knowledge from high-level features of a complex network (teacher) to a lightweight network (student) at both training and inference times.
The proposed AttTrack framework has three key components: 1) cross-model feature learning to align intermediate representations from the teacher and student models, 2) interleaving the execution of the two models at inference time, and 3) incorporating the updated predictions from the teacher model as prior knowledge to assist the student model.
arXiv Detail & Related papers (2022-10-16T22:15:31Z)
- Unsupervised Learning of Accurate Siamese Tracking [68.58171095173056]
We present a novel unsupervised tracking framework, in which we can learn temporal correspondence both on the classification branch and regression branch.
Our tracker outperforms preceding unsupervised methods by a substantial margin, performing on par with supervised methods on large-scale datasets such as TrackingNet and LaSOT.
arXiv Detail & Related papers (2022-04-04T13:39:43Z)
- Unsupervised Deep Representation Learning for Real-Time Tracking [137.69689503237893]
We propose an unsupervised learning method for visual tracking.
The motivation of our unsupervised learning is that a robust tracker should be effective in bidirectional tracking.
We build our framework on a Siamese correlation filter network, and propose a multi-frame validation scheme and a cost-sensitive loss to facilitate unsupervised learning.
arXiv Detail & Related papers (2020-07-22T08:23:12Z)
- Tracking-by-Trackers with a Distilled and Reinforced Model [24.210580784051277]
A compact student model is trained via the marriage of knowledge distillation and reinforcement learning.
The proposed algorithms compete with real-time state-of-the-art trackers.
arXiv Detail & Related papers (2020-07-08T13:24:04Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt a "wider" residual network ResNeXt as its feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)