Separable Self and Mixed Attention Transformers for Efficient Object
Tracking
- URL: http://arxiv.org/abs/2309.03979v1
- Date: Thu, 7 Sep 2023 19:23:02 GMT
- Authors: Goutam Yelluru Gopal, Maria A. Amer
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The deployment of transformers for visual object tracking has shown
state-of-the-art results on several benchmarks. However, the transformer-based
models are under-utilized for Siamese lightweight tracking due to the
computational complexity of their attention blocks. This paper proposes an
efficient self and mixed attention transformer-based architecture for
lightweight tracking. The proposed backbone utilizes the separable mixed
attention transformers to fuse the template and search regions during feature
extraction to generate superior feature encoding. Our prediction head performs
global contextual modeling of the encoded features by leveraging efficient
self-attention blocks for robust target state estimation. With these
contributions, the proposed lightweight tracker deploys a transformer-based
backbone and head module concurrently for the first time. Our ablation study
testifies to the effectiveness of the proposed combination of backbone and head
modules. Simulations show that our Separable Self and Mixed Attention-based
Tracker, SMAT, surpasses the performance of related lightweight trackers on
GOT10k, TrackingNet, LaSOT, NfS30, UAV123, and AVisT datasets, while running at
37 fps on CPU and 158 fps on GPU with only 3.8M parameters. For example, it
significantly surpasses the closely related trackers E.T.Track and
MixFormerV2-S on GOT10k-test by a margin of 7.9% and 5.8%, respectively, in the
AO metric. The tracker code and models are available at
https://github.com/goutamyg/SMAT
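The two attention mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the exact projections, normalization, and separable variant used by SMAT are assumptions here. Mixed attention fuses template and search tokens by attending over their concatenation; the separable self-attention shown follows a common linear-complexity formulation in which a single learned projection scores tokens to build one global context vector.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mixed_attention(template, search, w_q, w_k, w_v):
    """Attend jointly over concatenated template and search tokens,
    fusing the two regions in a single attention pass."""
    tokens = np.concatenate([template, search], axis=0)  # (n_t + n_s, d)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))       # joint weights
    fused = attn @ v
    n_t = template.shape[0]
    return fused[:n_t], fused[n_t:]                      # split back per region

def separable_self_attention(x, w_i, w_k, w_v):
    """Linear-complexity self-attention (assumed variant): one learned
    projection scores tokens; the weighted sum of keys forms a single
    context vector that modulates every value token."""
    scores = softmax(x @ w_i, axis=0)                    # (n, 1) token weights
    context = (scores * (x @ w_k)).sum(axis=0)           # (d,) global context
    return (x @ w_v) * context                           # (n, d) contextualized

rng = np.random.default_rng(0)
d = 8
template = rng.standard_normal((4, d))   # 4 hypothetical template tokens
search = rng.standard_normal((16, d))    # 16 hypothetical search tokens
w_q, w_k, w_v = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
t_out, s_out = mixed_attention(template, search, w_q, w_k, w_v)
head_out = separable_self_attention(s_out, 0.1 * rng.standard_normal((d, 1)),
                                    w_k, w_v)
print(t_out.shape, s_out.shape, head_out.shape)  # (4, 8) (16, 8) (16, 8)
```

Note the cost difference: mixed attention is quadratic in the total token count, while the separable form avoids the full attention matrix entirely, which is what makes such blocks attractive for lightweight tracking.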