Mobile Vision Transformer-based Visual Object Tracking
- URL: http://arxiv.org/abs/2309.05829v1
- Date: Mon, 11 Sep 2023 21:16:41 GMT
- Title: Mobile Vision Transformer-based Visual Object Tracking
- Authors: Goutam Yelluru Gopal, Maria A. Amer
- Abstract summary: We propose a lightweight, accurate, and fast tracking algorithm using MobileViT as the backbone for the first time.
Our method outperforms the popular DiMP-50 tracker despite having 4.7 times fewer model parameters and running at 2.8 times its speed on a GPU.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The introduction of robust backbones, such as Vision Transformers, has
improved the performance of object tracking algorithms in recent years.
However, these state-of-the-art trackers are computationally expensive since
they have a large number of model parameters and rely on specialized hardware
(e.g., GPU) for faster inference. On the other hand, recent lightweight
trackers are fast but are less accurate, especially on large-scale datasets. We
propose a lightweight, accurate, and fast tracking algorithm using Mobile
Vision Transformers (MobileViT) as the backbone for the first time. We also
present a novel approach of fusing the template and search region
representations in the MobileViT backbone, thereby generating superior feature
encoding for target localization. The experimental results show that our
MobileViT-based Tracker, MVT, surpasses the performance of recent lightweight
trackers on the large-scale datasets GOT10k and TrackingNet while running at a high
inference speed. In addition, our method outperforms the popular DiMP-50
tracker despite having 4.7 times fewer model parameters and running at 2.8
times its speed on a GPU. The tracker code and models are available at
https://github.com/goutamyg/MVT
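
The abstract states that template and search region representations are fused inside the backbone but does not say how. As a rough illustration only, the sketch below assumes one common fusion scheme: concatenate the two token sets and apply self-attention, so every search token can attend to template tokens and vice versa. This is not taken from the MVT code. The function names (`self_attention`, `fuse`), the identity projections, and the toy token values are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections.

    Assumption: a real backbone would use learned projections and
    multiple heads; they are omitted here to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    mixed = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        mixed.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return mixed


def fuse(template_tokens, search_tokens):
    """Hypothetical fusion: attend over the concatenated token sequence,
    then split the result back into template and search parts."""
    joint = template_tokens + search_tokens
    mixed = self_attention(joint)
    split = len(template_tokens)
    return mixed[:split], mixed[split:]


# Toy 2-dimensional tokens, chosen arbitrarily for illustration.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]
fused_template, fused_search = fuse(template, search)
```

Because the attention weights are computed over the joint sequence, each fused search token is a weighted mix of both search and template tokens, which is the general idea behind fusing the two representations early rather than in a separate head.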