Tracking by Instance Detection: A Meta-Learning Approach
- URL: http://arxiv.org/abs/2004.00830v1
- Date: Thu, 2 Apr 2020 05:55:06 GMT
- Title: Tracking by Instance Detection: A Meta-Learning Approach
- Authors: Guangting Wang, Chong Luo, Xiaoyan Sun, Zhiwei Xiong and Wenjun Zeng
- Abstract summary: We propose a principled three-step approach to build a high-performance tracker.
We build two trackers, named Retina-MAML and FCOS-MAML, based on two modern detectors RetinaNet and FCOS.
Both trackers run in real-time at 40 FPS.
- Score: 99.66119903655711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the tracking problem as a special type of object detection
problem, which we call instance detection. With proper initialization, a
detector can be quickly converted into a tracker by learning the new instance
from a single image. We find that model-agnostic meta-learning (MAML) offers a
strategy to initialize the detector that satisfies our needs. We propose a
principled three-step approach to build a high-performance tracker. First, pick
any modern object detector trained with gradient descent. Second, conduct
offline training (or initialization) with MAML. Third, perform domain
adaptation using the initial frame. We follow this procedure to build two
trackers, named Retina-MAML and FCOS-MAML, based on two modern detectors
RetinaNet and FCOS. Evaluations on four benchmarks show that both trackers are
competitive against state-of-the-art trackers. On OTB-100, Retina-MAML achieves
the highest-ever AUC of 0.712. On TrackingNet, FCOS-MAML ranks first on the
leaderboard with an AUC of 0.757 and a normalized precision of 0.822. Both
trackers run in real-time at 40 FPS.
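
The three-step recipe in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: a scalar regressor `y = w * x` stands in for the detector (step 1), `maml_train` plays the role of offline MAML training (step 2), and `adapt` plays the role of adaptation on the initial frame (step 3). All names, the task distribution, and the learning rates are assumptions chosen for illustration.

```python
import random


def grad_and_hess(w, xs, ys):
    """Gradient and second derivative of the mean squared error of y = w * x."""
    n = len(xs)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    hess = sum(2 * x * x for x in xs) / n
    return grad, hess


def loss(w, xs, ys):
    """Mean squared error of the toy model on one data set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, inner_lr=0.5, steps=1):
    """Step 3 analogue: a few gradient steps on the 'initial frame' data."""
    for _ in range(steps):
        g, _ = grad_and_hess(w, xs, ys)
        w -= inner_lr * g
    return w


def sample_task(rng, n=5):
    """Hypothetical task: regress y = slope * x for a randomly drawn slope."""
    slope = rng.uniform(-2.0, 2.0)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(2 * n)]
    ys = [slope * x for x in xs]
    # The first half is the support set, the second half the query set.
    return (xs[:n], ys[:n]), (xs[n:], ys[n:])


def maml_train(rng, meta_steps=500, inner_lr=0.5, outer_lr=0.05, w0=3.0):
    """Step 2 analogue: learn an initialization that adapts well in one step."""
    w = w0
    for _ in range(meta_steps):
        (sx, sy), (qx, qy) = sample_task(rng)
        # Inner loop: one gradient step on the support set.
        g_s, h_s = grad_and_hess(w, sx, sy)
        w_adapted = w - inner_lr * g_s
        # Outer loop: query loss at the adapted weight, differentiated
        # through the inner step (d w_adapted / d w = 1 - inner_lr * h_s).
        g_q, _ = grad_and_hess(w_adapted, qx, qy)
        w -= outer_lr * g_q * (1.0 - inner_lr * h_s)
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    w_init = maml_train(rng)
    (sx, sy), (qx, qy) = sample_task(rng)
    w_task = adapt(w_init, sx, sy)
    print("query loss before adaptation:", loss(w_init, qx, qy))
    print("query loss after adaptation: ", loss(w_task, qx, qy))
```

Because the toy model is a single scalar, the second-order term of the MAML gradient can be written by hand; a real detector would rely on automatic differentiation for this.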
Related papers
- HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision [34.7347336548199]
In camera-based 3D multi-object tracking (MOT), the prevailing methods follow the tracking-by-query-propagation paradigm.
We present HSTrack, a novel plug-and-play method designed to co-facilitate multi-task learning for detection and tracking.
arXiv Detail & Related papers (2024-11-11T08:18:49Z)
- Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking [27.74953961900086]
Existing end-to-end Multi-Object Tracking (e2e-MOT) methods have not surpassed non-end-to-end tracking-by-detection methods.
We present Co-MOT, a simple and effective method to facilitate e2e-MOT by a novel coopetition label assignment with a shadow concept.
arXiv Detail & Related papers (2023-05-22T05:18:34Z)
- MixFormer: End-to-End Tracking with Iterative Mixed Attention [47.78513247048846]
We present a compact tracking framework, termed MixFormer, built upon transformers.
We propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.
Our MixFormer trackers set a new state-of-the-art performance on seven tracking benchmarks.
arXiv Detail & Related papers (2023-02-06T14:38:09Z)
- StrongSORT: Make DeepSORT Great Again [19.099510933467148]
We revisit the classic tracker DeepSORT and upgrade it from various aspects, i.e., detection, embedding and association.
The resulting tracker, called StrongSORT, sets new HOTA and IDF1 records on MOT17 and MOT20.
We present two lightweight and plug-and-play algorithms to further refine the tracking results.
arXiv Detail & Related papers (2022-02-28T02:37:19Z)
- ByteTrack: Multi-Object Tracking by Associating Every Detection Box [51.93588012109943]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
Most methods obtain identities by associating detection boxes whose scores are higher than a threshold.
We present a simple, effective and generic association method, called BYTE, tracking BY associaTing every detection box instead of only the high score ones.
arXiv Detail & Related papers (2021-10-13T17:01:26Z)
- Learning to Track Objects from Unlabeled Videos [63.149201681380305]
In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.
To narrow the gap between unsupervised trackers and supervised counterparts, we propose an effective unsupervised learning approach composed of three stages.
Experiments show that the proposed USOT, learned from unlabeled videos, outperforms state-of-the-art unsupervised trackers by large margins.
arXiv Detail & Related papers (2021-08-28T22:10:06Z)
- Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy [63.91005999481061]
A practical long-term tracker typically contains three key properties, i.e. an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism.
We propose a two-task tracking framework (named DMTrack) to achieve distractor-aware fast tracking via Dynamic convolutions (d-convs) and Multiple object tracking (MOT) philosophy.
Our tracker achieves state-of-the-art performance on the LaSOT, OxUvA, TLP, VOT2018LT and VOT 2019LT benchmarks and runs in real-time (3x faster
arXiv Detail & Related papers (2021-04-25T00:59:53Z)
- Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking [102.31092931373232]
We propose a simple online model named Chained-Tracker (CTracker), which naturally integrates all the three subtasks into an end-to-end solution.
The two major novelties, chained structure and paired attentive regression, make CTracker simple, fast and effective.
arXiv Detail & Related papers (2020-07-29T02:38:49Z)