OmniTracker: Unifying Object Tracking by Tracking-with-Detection
- URL: http://arxiv.org/abs/2303.12079v1
- Date: Tue, 21 Mar 2023 17:59:57 GMT
- Title: OmniTracker: Unifying Object Tracking by Tracking-with-Detection
- Authors: Junke Wang and Dongdong Chen and Zuxuan Wu and Chong Luo and Xiyang Dai and Lu Yuan and Yu-Gang Jiang
- Abstract summary: OmniTracker is presented to resolve all the tracking tasks with a fully shared network architecture, model weights, and inference pipeline.
Experiments on 7 tracking datasets, including LaSOT, TrackingNet, DAVIS16-17, MOT17, MOTS20, and YTVIS19, demonstrate that OmniTracker achieves on-par or even better results than both task-specific and unified tracking models.
- Score: 119.51012668709502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object tracking (OT) aims to estimate the positions of target objects in a
video sequence. Depending on whether the initial states of target objects are
specified by provided annotations in the first frame or the categories, OT
could be classified as instance tracking (e.g., SOT and VOS) and category
tracking (e.g., MOT, MOTS, and VIS) tasks. Combining the advantages of the best
practices developed in both communities, we propose a novel
tracking-with-detection paradigm, where tracking supplements appearance priors
for detection and detection provides tracking with candidate bounding boxes for
association. Equipped with such a design, a unified tracking model,
OmniTracker, is further presented to resolve all the tracking tasks with a
fully shared network architecture, model weights, and inference pipeline.
Extensive experiments on 7 tracking datasets, including LaSOT, TrackingNet,
DAVIS16-17, MOT17, MOTS20, and YTVIS19, demonstrate that OmniTracker achieves
on-par or even better results than both task-specific and unified tracking
models.
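To make the association half of this paradigm concrete, here is a minimal sketch of matching existing tracks to candidate bounding boxes produced by a detector. It is not OmniTracker's implementation: the greedy IoU matching, the `associate` and `iou` helpers, the `(x1, y1, x2, y2)` box layout, and the `min_iou` threshold are all assumptions chosen for illustration, and the appearance priors mentioned in the abstract are left out.

```python
from typing import Dict, List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    candidates: List[Box],
    min_iou: float = 0.3,  # hypothetical threshold, not taken from the paper
) -> Dict[int, int]:
    """Greedily match track ids to candidate indices by descending IoU."""
    pairs = sorted(
        (
            (iou(box, cand), track_id, j)
            for track_id, box in tracks.items()
            for j, cand in enumerate(candidates)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, track_id, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if track_id in matches or j in used:
            continue  # each track and each candidate is matched at most once
        matches[track_id] = j
        used.add(j)
    return matches


# Two tracks from the previous frame and three detector candidates.
tracks = {1: (0.0, 0.0, 10.0, 10.0), 2: (20.0, 20.0, 30.0, 30.0)}
candidates = [(21.0, 21.0, 31.0, 31.0), (1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)]
matches = associate(tracks, candidates)
```

In this example the third candidate overlaps no existing track, so it stays unmatched; a category tracker would typically treat such a box as a possible new object. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop.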
Related papers
- Track Anything Rapter(TAR) [0.0]
Track Anything Rapter (TAR) is designed to detect, segment, and track objects of interest based on user-provided multimodal queries.
TAR utilizes cutting-edge pre-trained models like DINO, CLIP, and SAM to estimate the relative pose of the queried object.
We showcase how the integration of these foundational models with a custom high-level control algorithm results in a highly stable and precise tracking system.
arXiv Detail & Related papers (2024-05-19T19:51:41Z)
- Tracking with Human-Intent Reasoning [64.69229729784008]
This work proposes a new tracking task -- Instruction Tracking.
It involves providing implicit tracking instructions that require the trackers to perform tracking automatically in video frames.
TrackGPT is capable of performing complex reasoning-based tracking.
arXiv Detail & Related papers (2023-12-29T03:22:18Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
arXiv Detail & Related papers (2022-03-29T01:38:49Z)
- Track to Detect and Segment: An Online Multi-Object Tracker [81.15608245513208]
TraDeS is an online joint detection and tracking model, exploiting tracking clues to assist detection end-to-end.
TraDeS infers object tracking offset by a cost volume, which is used to propagate previous object features.
arXiv Detail & Related papers (2021-03-16T02:34:06Z)
- Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking [102.31092931373232]
We propose a simple online model named Chained-Tracker (CTracker), which naturally integrates all three subtasks (object detection, feature extraction, and data association) into an end-to-end solution.
Its two major novelties, a chained structure and paired attentive regression, make CTracker simple, fast, and effective.
arXiv Detail & Related papers (2020-07-29T02:38:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.