SparseTT: Visual Tracking with Sparse Transformers
- URL: http://arxiv.org/abs/2205.03776v1
- Date: Sun, 8 May 2022 04:00:28 GMT
- Title: SparseTT: Visual Tracking with Sparse Transformers
- Authors: Zhihong Fu, Zehua Fu, Qingjie Liu, Wenrui Cai, Yunhong Wang
- Abstract summary: The self-attention mechanism designed to model long-range dependencies is the key to the success of Transformers.
However, self-attention lacks focus on the most relevant information in the search regions. In this paper, we relieve this issue with a sparse attention mechanism that focuses on the most relevant information in the search regions.
We also introduce a double-head predictor to boost the accuracy of foreground-background classification and target bounding-box regression.
- Score: 43.1666514605021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have been successfully applied to the visual tracking task and
significantly promote tracking performance. The self-attention mechanism
designed to model long-range dependencies is the key to the success of
Transformers. However, self-attention lacks focus on the most relevant
information in the search regions, making it easily distracted by the
background. In this paper, we relieve this issue with a sparse attention
mechanism that focuses on the most relevant information in the search regions,
which enables much more accurate tracking. Furthermore, we introduce a
double-head predictor to boost the accuracy of foreground-background
classification and target bounding-box regression, which further improves
tracking performance. Extensive experiments show that, without bells and
whistles, our method significantly outperforms state-of-the-art approaches on
LaSOT, GOT-10k, TrackingNet, and UAV123, while running at 40 FPS. Notably, the
training time of our method is reduced by 75% compared to that of TransT. The
source code and models are available at https://github.com/fzh0917/SparseTT.
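
The abstract does not spell out the sparse attention formulation; a common way to realize "focusing on the most relevant information" is top-k attention, where each query attends only to its k highest-scoring keys and the rest are masked out. Below is a minimal PyTorch sketch under that assumption; the function name, `topk` value, and tensor shapes are illustrative, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, topk=32):
    """Top-k sparse attention sketch: each query attends only to its
    topk highest-scoring keys; all other positions are masked to -inf
    before the softmax, so they receive (near-)zero attention weight.

    q, k, v: (batch, num_tokens, dim) tensors.
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5      # (B, Nq, Nk)
    topk = min(topk, scores.size(-1))
    _, idx = scores.topk(topk, dim=-1)               # indices of k largest scores per query
    mask = torch.full_like(scores, float('-inf'))
    mask.scatter_(-1, idx, 0.0)                      # keep top-k positions, mask the rest
    attn = F.softmax(scores + mask, dim=-1)
    return attn @ v                                  # (B, Nq, dim)

# Example: a search-region feature map flattened to 400 tokens.
q = torch.randn(1, 400, 256)
k = torch.randn(1, 400, 256)
v = torch.randn(1, 400, 256)
out = topk_sparse_attention(q, k, v, topk=32)
print(out.shape)  # torch.Size([1, 400, 256])
```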
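The double-head predictor is likewise described only at a high level: two separate branches, one for foreground-background classification and one for bounding-box regression, rather than a single shared head. A hypothetical layout is sketched below; the channel counts, layer choices, and the 4-channel box-offset output are all assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class DoubleHeadPredictor(nn.Module):
    """Hypothetical double-head predictor: decoupled branches for
    foreground-background classification and box regression."""

    def __init__(self, in_channels=256):
        super().__init__()
        # Classification branch: per-position foreground/background logit.
        self.cls_head = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 1, 1),
        )
        # Regression branch: per-position box offsets (4 channels).
        self.reg_head = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 4, 1),
        )

    def forward(self, x):
        return self.cls_head(x), self.reg_head(x)

feat = torch.randn(1, 256, 20, 20)   # search-region feature map
cls_logits, box_deltas = DoubleHeadPredictor()(feat)
print(cls_logits.shape, box_deltas.shape)  # (1, 1, 20, 20) (1, 4, 20, 20)
```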