GUSOT: Green and Unsupervised Single Object Tracking for Long Video
Sequences
- URL: http://arxiv.org/abs/2207.07629v1
- Date: Fri, 15 Jul 2022 17:42:49 GMT
- Title: GUSOT: Green and Unsupervised Single Object Tracking for Long Video
Sequences
- Authors: Zhiruo Zhou, Hongyu Fu, Suya You, C.-C. Jay Kuo
- Abstract summary: A green unsupervised single-object tracker, called GUSOT, aims at object tracking for long videos under a resource-constrained environment.
GUSOT adds two new modules: 1) lost object recovery, and 2) color-saliency-based shape proposal.
We conduct experiments on the large-scale dataset LaSOT with long video sequences, and show that GUSOT offers a lightweight high-performance tracking solution.
- Score: 31.600045405499472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep trackers, both supervised and unsupervised, have become
popular in recent years. Yet, they demand high computational complexity and
a high memory cost. This work proposes GUSOT, a green unsupervised
single-object tracker aimed at tracking in long videos under
resource-constrained environments. Built upon a
baseline tracker, UHP-SOT++, which works well for short-term tracking, GUSOT
adds two new modules: 1) lost object recovery, and 2)
color-saliency-based shape proposal. They help resolve the tracking loss
problem and offer a more flexible object proposal, respectively. Thus, they
enable GUSOT to achieve higher tracking accuracy in the long run. We conduct
experiments on the large-scale dataset LaSOT with long video sequences, and
show that GUSOT offers a lightweight high-performance tracking solution that
finds applications in mobile and edge computing platforms.
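The abstract does not spell out how the color-saliency-based shape proposal is computed. As a hedged sketch only (the function name and the border-as-background heuristic are my assumptions, not the paper's method), one lightweight way to score how much each pixel's color stands out from the surrounding background is:

```python
import numpy as np

def color_saliency(patch, border=4):
    """Score each pixel by its color distance from the patch border.

    patch: (H, W, 3) float array around the tracked object.
    border: width of the outer ring treated as background.
    Returns an (H, W) saliency map normalized to [0, 1].
    """
    # Collect border pixels as a rough background color model.
    ring = np.concatenate([
        patch[:border].reshape(-1, 3),
        patch[-border:].reshape(-1, 3),
        patch[:, :border].reshape(-1, 3),
        patch[:, -border:].reshape(-1, 3),
    ])
    bg_mean = ring.mean(axis=0)
    # Saliency = Euclidean color distance from the background mean.
    dist = np.linalg.norm(patch - bg_mean, axis=2)
    return dist / (dist.max() + 1e-8)
```

Thresholding such a map would yield a free-form object mask, which is more flexible than an axis-aligned box proposal.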
Related papers
- Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
- Unsupervised Green Object Tracker (GOT) without Offline Pre-training [35.60210259607753]
We propose a new single-object tracking method, called the green object tracker (GOT).
GOT offers tracking accuracy competitive with state-of-the-art unsupervised trackers, which demand heavy offline pre-training, at a lower cost.
GOT has a tiny model size (3k parameters) and low inference complexity (around 58M FLOPs per frame).
arXiv Detail & Related papers (2023-09-16T19:00:56Z)
- OmniTracker: Unifying Object Tracking by Tracking-with-Detection [119.51012668709502]
OmniTracker is presented to resolve all the tracking tasks with a fully shared network architecture, model weights, and inference pipeline.
Experiments on 7 tracking datasets, including LaSOT, TrackingNet, DAVIS16-17, MOT17, MOTS20, and YTVIS19, demonstrate that OmniTracker achieves on-par or even better results than both task-specific and unified tracking models.
arXiv Detail & Related papers (2023-03-21T17:59:57Z)
- Beyond SOT: Tracking Multiple Generic Objects at Once [141.36900362724975]
Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video.
We introduce a new large-scale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence.
Our approach achieves highly competitive results on single-object GOT datasets, setting a new state of the art on TrackingNet with a success rate AUC of 84.4%.
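The success rate AUC quoted here is the standard success-plot metric used by TrackingNet-style benchmarks: the fraction of frames whose predicted box overlaps ground truth above a threshold, averaged over thresholds. A minimal sketch (function name mine) of the typical computation:

```python
import numpy as np

def success_auc(ious, thresholds=np.linspace(0, 1, 21)):
    """Success-plot AUC over per-frame IoU scores.

    ious: iterable of IoU values between predicted and ground-truth
    boxes, one per frame. For each threshold we take the fraction of
    frames whose IoU exceeds it; the AUC is the mean of those rates.
    """
    ious = np.asarray(ious, dtype=float)
    success = [(ious > t).mean() for t in thresholds]
    return float(np.mean(success))
```

Exact threshold grids vary slightly between benchmarks, so this is illustrative rather than the official evaluation code.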
arXiv Detail & Related papers (2022-12-22T17:59:19Z)
- SOMPT22: A Surveillance Oriented Multi-Pedestrian Tracking Dataset [5.962184741057505]
We introduce the SOMPT22 dataset, a new set for multi-person tracking with annotated short videos captured from static cameras mounted on poles 6-8 meters high, positioned for city surveillance.
We analyze MOT trackers, classified as one-shot and two-stage, with respect to how they use detection and re-ID networks on this new dataset.
The experimental results on our new dataset indicate that state-of-the-art trackers are still far from high efficiency, and that single-shot trackers are good candidates for unifying fast execution with competitive accuracy.
arXiv Detail & Related papers (2022-08-04T11:09:19Z)
- Single Object Tracking Research: A Survey [44.24280758718638]
This paper presents the rationale and representative works of the two most popular tracking frameworks of the past ten years.
We present deep-learning-based tracking methods categorized by network structure.
We also introduce classical strategies for handling the challenges in the tracking problem.
arXiv Detail & Related papers (2022-04-25T02:59:15Z)
- Learning to Track Objects from Unlabeled Videos [63.149201681380305]
In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.
To narrow the gap between unsupervised trackers and supervised counterparts, we propose an effective unsupervised learning approach composed of three stages.
Experiments show that the proposed USOT, learned from unlabeled videos, outperforms state-of-the-art unsupervised trackers by large margins.
arXiv Detail & Related papers (2021-08-28T22:10:06Z)
- TDIOT: Target-driven Inference for Deep Video Object Tracking [0.2457872341625575]
In this work, we adopt the pre-trained Mask R-CNN deep object detector as the baseline.
We introduce a novel inference architecture placed on top of FPN-ResNet101 backbone of Mask R-CNN to jointly perform detection and tracking.
The proposed single object tracker, TDIOT, applies an appearance similarity-based temporal matching for data association.
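The summary does not give TDIOT's exact matching rule; as an illustration only (function names, feature vectors, and the threshold are hypothetical), appearance-similarity data association for a single-object tracker amounts to picking the detection whose feature best matches the track, or declaring the track lost:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two appearance feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def associate(track_feat, det_feats, thresh=0.5):
    """Pick the detection whose appearance best matches the track.

    Returns the index of the best detection, or None if no similarity
    exceeds `thresh` (the target is treated as lost for this frame).
    """
    sims = [cosine_sim(track_feat, d) for d in det_feats]
    best = int(np.argmax(sims))
    return best if sims[best] >= thresh else None
```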
arXiv Detail & Related papers (2021-03-19T20:45:06Z)
- Track to Detect and Segment: An Online Multi-Object Tracker [81.15608245513208]
TraDeS is an online joint detection and tracking model, exploiting tracking clues to assist detection end-to-end.
TraDeS infers object tracking offset by a cost volume, which is used to propagate previous object features.
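A cost volume scores every candidate displacement between consecutive feature maps; the offset is read off where the correlation peaks. A toy numpy sketch of the idea (not TraDeS's actual learned formulation, which operates per-location on deep features):

```python
import numpy as np

def cost_volume_offset(prev_feat, curr_feat, max_disp=2):
    """Estimate a single (dy, dx) offset via a dense cost volume.

    prev_feat, curr_feat: (H, W, C) feature maps from consecutive
    frames. For each candidate displacement within +/- max_disp we
    correlate the shifted current map with the previous map and
    return the displacement with the highest total correlation.
    """
    best_score, best_offset = -np.inf, (0, 0)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # Undo the candidate motion, then measure agreement.
            shifted = np.roll(curr_feat, (-dy, -dx), axis=(0, 1))
            score = float((prev_feat * shifted).sum())
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

The recovered offset can then propagate previous object features to their expected locations in the current frame, as the summary describes.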
arXiv Detail & Related papers (2021-03-16T02:34:06Z)
- DEFT: Detection Embeddings for Tracking [3.326320568999945]
We propose an efficient joint detection and tracking model named DEFT.
Our approach relies on an appearance-based object matching network jointly learned with an underlying object detection network.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards.
arXiv Detail & Related papers (2021-02-03T20:00:44Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt a "wider" residual network ResNeXt as its feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.