Predicting the Best of N Visual Trackers
- URL: http://arxiv.org/abs/2407.15707v1
- Date: Mon, 22 Jul 2024 15:17:09 GMT
- Title: Predicting the Best of N Visual Trackers
- Authors: Basit Alawode, Sajid Javed, Arif Mahmood, Jiri Matas
- Abstract summary: No single tracker remains the best performer across all tracking attributes and datasets.
To bridge this gap, we predict the "Best of the N Trackers", called the BofN meta-tracker.
We also introduce a frame-level BofN meta-tracker which keeps predicting the best performer at regular temporal intervals.
- Score: 34.93745058337489
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We observe that the performance of SOTA visual trackers varies surprisingly strongly across different video attributes and datasets. No single tracker remains the best performer across all tracking attributes and datasets. To bridge this gap, for a given video sequence, we predict the "Best of the N Trackers", called the BofN meta-tracker. At its core, a Tracking Performance Prediction Network (TP2N) selects the predicted best-performing visual tracker for the given video sequence using only a few initial frames. We also introduce a frame-level BofN meta-tracker which keeps predicting the best performer at regular temporal intervals. The TP2N is based on the self-supervised learning architectures MoCo v2, SwAV, Barlow Twins (BT), and DINO; experiments show that DINO with a ViT-S backbone performs best. The video-level BofN meta-tracker outperforms, by a large margin, existing SOTA trackers on nine standard benchmarks - LaSOT, TrackingNet, GOT-10K, VOT2019, VOT2021, VOT2022, UAV123, OTB100, and WebUAV-3M. Further improvement is achieved by the frame-level BofN meta-tracker, which effectively handles variations in tracking scenarios within long sequences. For instance, on GOT-10k, the BofN meta-tracker's average overlap (AO) is 88.7% and 91.1% in the video- and frame-level settings, respectively, while the best-performing individual tracker, RTS, achieves 85.20% AO. On VOT2022, the BofN expected average overlap (EAO) is 67.88% and 70.98% in the video- and frame-level settings, compared to 64.12% for the best-performing ARTrack. This work also presents an extensive evaluation of competitive tracking methods on all commonly used benchmarks, following their protocols. The code, the trained models, and the results will soon be made publicly available on https://github.com/BasitAlawode/Best_of_N_Trackers.
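To make the selection mechanism concrete, here is a minimal sketch of how a BofN-style meta-tracker could be wired together. Everything here is illustrative: `tp2n` stands in for the trained TP2N network, and the tracker objects and their `init`/`update` methods are hypothetical conventions, not the authors' released code.
```python
# Minimal sketch of a BofN-style meta-tracker loop (illustrative only, not the
# authors' released code). `trackers` maps names to objects exposing
# init(frame, box) and update(frame) -> box; `tp2n` stands in for the trained
# Tracking Performance Prediction Network and returns a tracker name.

def bofn_track(frames, init_box, trackers, tp2n, warmup=5, reselect_every=None):
    """Run the predicted-best tracker; optionally re-predict every K frames."""
    # Video-level setting: pick the tracker once from a few initial frames.
    best = tp2n(frames[:warmup])
    active = trackers[best]
    active.init(frames[0], init_box)

    boxes = [init_box]
    for t, frame in enumerate(frames[1:], start=1):
        # Frame-level setting: re-predict the best tracker at regular intervals.
        if reselect_every and t % reselect_every == 0:
            new_best = tp2n(frames[max(0, t - warmup):t])
            if new_best != best:
                best, active = new_best, trackers[new_best]
                active.init(frame, boxes[-1])  # hand over the last estimate
        boxes.append(active.update(frame))
    return boxes
```
With `reselect_every=None` this reduces to the video-level setting; setting it to a regular interval gives the frame-level behavior described in the abstract.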
Related papers
- LiteTrack: Layer Pruning with Asynchronous Feature Extraction for Lightweight and Efficient Visual Tracking [4.179339279095506]
LiteTrack is an efficient transformer-based tracking model optimized for high-speed operations across various devices.
It achieves a more favorable trade-off between accuracy and efficiency than other lightweight trackers.
LiteTrack-B9 reaches competitive 72.2% AO on GOT-10k and 82.4% AUC on TrackingNet, and operates at 171 fps on an NVIDIA 2080Ti GPU.
arXiv Detail & Related papers (2023-09-17T12:01:03Z)
- CoTracker: It is Better to Track Together [70.63040730154984]
CoTracker is a transformer-based model that tracks a large number of 2D points in long video sequences.
We show that joint tracking significantly improves tracking accuracy and robustness, and allows CoTracker to track occluded points and points outside of the camera view.
arXiv Detail & Related papers (2023-07-14T21:13:04Z)
- VariabilityTrack: Multi-Object Tracking with Variable Speed Object Movement [1.6385815610837167]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
We propose a variable-speed Kalman filter algorithm based on environmental feedback and improve the matching process. A generic sketch of a speed-adaptive Kalman filter follows this entry.
arXiv Detail & Related papers (2022-03-12T12:39:41Z)
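As a generic illustration of the idea above (not VariabilityTrack's actual algorithm), the sketch below scales a constant-velocity Kalman filter's process noise with the current speed estimate, so the motion model is trusted less for fast-moving objects:
```python
import numpy as np

# Generic 1D constant-velocity Kalman filter whose process noise grows with
# the current speed estimate: one common way to model "variable speed" motion.
# Illustration only; not the VariabilityTrack algorithm itself.

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition for [position, velocity]
H = np.array([[1.0, 0.0]])               # we observe position only

def step(x, P, z, r=1.0, q_base=1e-2):
    # Inflate process noise when the estimated speed is high, so the filter
    # trusts the constant-velocity model less for fast-moving objects.
    q = q_base * (1.0 + abs(x[1]))
    Q = q * np.array([[0.25, 0.5], [0.5, 1.0]])

    x = F @ x                             # predict state
    P = F @ P @ F.T + Q                   # predict covariance

    S = H @ P @ H.T + r                   # innovation covariance
    K = P @ H.T / S                       # Kalman gain
    x = x + (K * (z - H @ x)).ravel()     # correct with measurement z
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Usage: filter a noisy, accelerating position signal.
x, P = np.array([0.0, 0.0]), np.eye(2)
for z in [0.9, 2.1, 3.4, 5.8, 9.1]:
    x, P = step(x, P, z)
```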
- ByteTrack: Multi-Object Tracking by Associating Every Detection Box [51.93588012109943]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in videos.
Most methods obtain identities by associating detection boxes whose scores are higher than a threshold.
We present a simple, effective and generic association method, called BYTE, tracking BY associaTing every detection box instead of only the high-score ones. A toy version of this two-stage association follows this entry.
arXiv Detail & Related papers (2021-10-13T17:01:26Z)
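BYTE's core idea is a two-stage association: match tracks to high-score detections first, then try to recover the remaining tracks with low-score boxes instead of discarding them. A toy greedy-IoU version under those assumptions (not the official ByteTrack implementation):
```python
# Toy sketch of BYTE-style two-stage association (not the official ByteTrack
# code). Tracks and detections are axis-aligned boxes (x1, y1, x2, y2).

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def greedy_match(tracks, dets, thresh):
    """Greedily pair each track with its best unclaimed detection above thresh."""
    pairs, used = [], set()
    for ti, t in enumerate(tracks):
        best, best_iou = None, thresh
        for di, d in enumerate(dets):
            if di not in used and iou(t, d) > best_iou:
                best, best_iou = di, iou(t, d)
        if best is not None:
            used.add(best)
            pairs.append((ti, best))
    unmatched = [ti for ti in range(len(tracks)) if ti not in {p[0] for p in pairs}]
    return pairs, unmatched

def byte_associate(tracks, dets, scores, hi=0.6, lo=0.1):
    hi_dets = [d for d, s in zip(dets, scores) if s >= hi]
    lo_dets = [d for d, s in zip(dets, scores) if lo <= s < hi]
    # First stage: associate tracks with high-score detections.
    first, unmatched = greedy_match(tracks, hi_dets, thresh=0.3)
    # Second stage: recover still-unmatched tracks with low-score detections
    # instead of discarding every box below the high threshold.
    second, _ = greedy_match([tracks[i] for i in unmatched], lo_dets, thresh=0.5)
    # Note: indices in `second` refer to the unmatched sub-list and to lo_dets.
    return first, second
```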
- STMTrack: Template-free Visual Tracking with Space-time Memory Networks [42.06375415765325]
Existing trackers with template-updating mechanisms rely on time-consuming numerical optimization and complex hand-designed strategies to achieve competitive performance.
We propose a novel tracking framework built on top of a space-time memory network that can make full use of historical information related to the target.
Specifically, a novel memory mechanism is introduced that stores the historical information of the target to guide the tracker to focus on the most informative regions in the current frame. A minimal attention-based sketch of such a memory read follows this entry.
arXiv Detail & Related papers (2021-04-01T08:10:56Z)
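The memory mechanism described in the entry above amounts to letting current-frame features attend over stored target features. A minimal cross-attention "memory read" in plain NumPy, purely illustrative of the idea and not STMTrack's architecture:
```python
import numpy as np

# Minimal sketch of a space-time memory read as cross-attention: the current
# frame's features (queries) attend over stored historical target features
# (keys/values). Purely illustrative of the idea, not STMTrack's architecture.

def memory_read(query, mem_keys, mem_vals):
    """query: (N, d); mem_keys, mem_vals: (M, d). Returns an (N, d) readout."""
    d = query.shape[-1]
    logits = query @ mem_keys.T / np.sqrt(d)            # (N, M) similarities
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)             # softmax over memory
    return attn @ mem_vals                              # weighted value sum

# Each processed frame can append its target features to mem_keys/mem_vals,
# so the read focuses the tracker on regions that best match target history.
```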
- LaSOT: A High-quality Large-scale Single Object Tracking Benchmark [67.96196486540497]
We present LaSOT, a high-quality Large-scale Single Object Tracking benchmark.
LaSOT contains a diverse selection of 85 object classes and offers 1,550 videos totaling more than 3.87 million frames.
Each video frame is carefully and manually annotated with a bounding box, making LaSOT, to our knowledge, the largest densely annotated tracking benchmark.
arXiv Detail & Related papers (2020-09-08T00:31:56Z)
- Tracking Objects as Points [83.9217787335878]
We present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art.
Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame.
CenterTrack is simple, online (no peeking into the future), and real-time. A toy sketch of its center-based association follows this entry.
arXiv Detail & Related papers (2020-04-02T17:58:40Z)
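CenterTrack's association step matches current detections to the previous frame's detections by center distance after applying the network's predicted displacements. A toy version of that matching (hypothetical inputs; the sign convention and greedy order are simplified):
```python
# Toy sketch of CenterTrack-style association: current detections are matched
# to the previous frame's detections by center distance after applying the
# network's predicted displacement offsets. Hypothetical inputs, simplified
# greedy order; not the released CenterTrack code.

def associate_centers(prev_centers, curr_centers, offsets, max_dist=50.0):
    """Shift each current center by its predicted displacement to estimate its
    prior position, then pair it with the nearest unclaimed previous center."""
    matches, used = {}, set()
    for ci, (cx, cy) in enumerate(curr_centers):
        ox, oy = offsets[ci]              # predicted displacement to prior frame
        px, py = cx - ox, cy - oy         # estimated position in previous frame
        best, best_d = None, max_dist
        for pi, (qx, qy) in enumerate(prev_centers):
            d = ((px - qx) ** 2 + (py - qy) ** 2) ** 0.5
            if pi not in used and d < best_d:
                best, best_d = pi, d
        if best is not None:
            used.add(best)
            matches[ci] = best            # current detection inherits prior identity
    return matches
```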
- High-Performance Long-Term Tracking with Meta-Updater [75.80564183653274]
Long-term visual tracking has drawn increasing attention because it is much closer to practical applications than short-term tracking.
Most top-ranked long-term trackers adopt offline-trained Siamese architectures, so they cannot benefit from the great progress of short-term trackers with online updates.
We propose a novel offline-trained Meta-Updater to address an important but unsolved problem: Is the tracker ready for updating in the current frame? A minimal stand-in for such an update gate follows this entry.
arXiv Detail & Related papers (2020-04-01T09:29:23Z)
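The meta-updater makes a per-frame binary decision: is it safe to update the template now? The real method learns this from sequential geometric, discriminative, and appearance cues with a cascaded LSTM; the stand-in below only illustrates the gate with a thresholded confidence history:
```python
from collections import deque

# Minimal stand-in for a meta-updater gate answering "is the tracker ready to
# update its template in the current frame?". The real method trains a
# cascaded LSTM over geometric, discriminative, and appearance cues; this
# thresholded confidence history only illustrates the decision being made.

class MetaUpdaterGate:
    def __init__(self, window=20, min_conf=0.5):
        self.history = deque(maxlen=window)
        self.min_conf = min_conf

    def ready_to_update(self, confidence):
        """Allow a template update only when recent tracking confidence has
        been consistently high, so unreliable frames do not contaminate it."""
        self.history.append(confidence)
        if len(self.history) < self.history.maxlen:
            return False                  # not enough evidence yet
        return min(self.history) >= self.min_conf
```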
This list is automatically generated from the titles and abstracts of the papers on this site.