LiteTrack: Layer Pruning with Asynchronous Feature Extraction for
Lightweight and Efficient Visual Tracking
- URL: http://arxiv.org/abs/2309.09249v1
- Date: Sun, 17 Sep 2023 12:01:03 GMT
- Title: LiteTrack: Layer Pruning with Asynchronous Feature Extraction for
Lightweight and Efficient Visual Tracking
- Authors: Qingmao Wei, Bi Zeng, Jianqi Liu, Li He, Guotian Zeng
- Abstract summary: LiteTrack is an efficient transformer-based tracking model optimized for high-speed operations across various devices.
It achieves a more favorable trade-off between accuracy and efficiency than other lightweight trackers.
LiteTrack-B9 reaches a competitive 72.2% AO on GOT-10k and 82.4% AUC on TrackingNet, and operates at 171 fps on an NVIDIA 2080Ti GPU.
- Score: 4.179339279095506
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent advancements in transformer-based visual trackers have led to
significant progress, attributed to their strong modeling capabilities.
However, as performance improves, running latency correspondingly increases,
presenting a challenge for real-time robotics applications, especially on edge
devices with computational constraints. In response to this, we introduce
LiteTrack, an efficient transformer-based tracking model optimized for
high-speed operations across various devices. It achieves a more favorable
trade-off between accuracy and efficiency than other lightweight trackers.
The main innovations of LiteTrack encompass: 1) asynchronous feature extraction
and interaction between the template and search region for better feature
fusion and cutting redundant computation, and 2) pruning encoder layers from a
heavy tracker to refine the balance between performance and speed. As an
example, our fastest variant, LiteTrack-B4, achieves 65.2% AO on the GOT-10k
benchmark, surpassing all preceding efficient trackers, while running over 100
fps with ONNX on the Jetson Orin NX edge device. Moreover, our LiteTrack-B9
reaches a competitive 72.2% AO on GOT-10k and 82.4% AUC on TrackingNet, and
operates at 171 fps on an NVIDIA 2080Ti GPU. The code and demo materials will
be available at https://github.com/TsingWei/LiteTrack.
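The two innovations described in the abstract can be illustrated with a toy sketch. This is NOT the actual LiteTrack architecture (see the repository above for that); it is a hypothetical NumPy simplification in which attention is replaced by per-layer linear maps. It shows the two ideas structurally: the template is encoded once and cached across frames (asynchronous extraction), only a final joint layer mixes template and search tokens (late interaction), and `keep_layers` truncates a deeper encoder (layer pruning).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ToyTracker:
    """Toy illustration (not the real LiteTrack) of two ideas:
    - asynchronous extraction: template features are computed once and
      cached, so per-frame cost covers only the search-region tokens;
    - layer pruning: keep_layers truncates a deeper stack of encoder layers.
    """

    def __init__(self, dim=8, total_layers=12, keep_layers=4, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix per encoder layer; pruning keeps the first few.
        all_layers = [rng.standard_normal((dim, dim)) * 0.1
                      for _ in range(total_layers)]
        self.layers = all_layers[:keep_layers]
        self.template_feat = None

    def set_template(self, template_tokens):
        # Run the template through the early layers ONCE and cache the result.
        x = template_tokens
        for w in self.layers[:-1]:
            x = relu(x @ w)
        self.template_feat = x  # reused for every subsequent frame

    def track(self, search_tokens):
        # Per frame: only search tokens pass through the early layers...
        x = search_tokens
        for w in self.layers[:-1]:
            x = relu(x @ w)
        # ...then one joint layer lets template and search features interact.
        joint = np.concatenate([self.template_feat, x], axis=0)
        return relu(joint @ self.layers[-1])

tracker = ToyTracker()
tracker.set_template(np.ones((4, 8)))   # 4 template tokens, encoded once
out = tracker.track(np.ones((16, 8)))   # 16 search tokens, per frame
print(out.shape)                        # (20, 8): joint template+search features
```

The per-frame saving comes from `set_template` running outside the tracking loop; the pruning saving comes from `len(self.layers)` being smaller than `total_layers`.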
Related papers
- Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z)
- Mobile Vision Transformer-based Visual Object Tracking [3.9160947065896803]
We propose a lightweight, accurate, and fast tracking algorithm using MobileViT as the backbone for the first time.
Our method outperforms the popular DiMP-50 tracker despite having 4.7 times fewer model parameters and running at 2.8 times its speed on a GPU.
arXiv Detail & Related papers (2023-09-11T21:16:41Z)
- Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking [69.89887818921825]
HiT is a new family of efficient tracking models that can run at high speed on different devices.
HiT achieves 64.6% AUC on the LaSOT benchmark, surpassing all previous efficient trackers.
arXiv Detail & Related papers (2023-08-14T02:51:34Z)
- Efficient Visual Tracking via Hierarchical Cross-Attention Transformer [82.92565582642847]
We present an efficient tracking method via a hierarchical cross-attention transformer named HCAT.
Our model runs about 195 fps on GPU, 45 fps on CPU, and 55 fps on the NVIDIA Jetson AGX Xavier edge AI platform.
arXiv Detail & Related papers (2022-03-25T09:45:27Z)
- Efficient Visual Tracking with Exemplar Transformers [98.62550635320514]
We introduce the Exemplar Transformer, an efficient transformer for real-time visual object tracking.
E.T.Track, our visual tracker that incorporates Exemplar Transformer layers, runs at 47 fps on a CPU.
This is up to 8 times faster than other transformer-based models.
arXiv Detail & Related papers (2021-12-17T18:57:54Z)
- SwinTrack: A Simple and Strong Baseline for Transformer Tracking [81.65306568735335]
We propose a fully attention-based Transformer tracking algorithm, Swin-Transformer Tracker (SwinTrack).
SwinTrack uses Transformer for both feature extraction and feature fusion, allowing full interactions between the target object and the search region for tracking.
In our thorough experiments, SwinTrack sets a new record with 0.717 SUC on LaSOT, surpassing STARK by 4.6% while still running at 45 FPS.
arXiv Detail & Related papers (2021-12-02T05:56:03Z)
- LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search [104.84999119090887]
We present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers.
Comprehensive experiments show that our LightTrack is effective.
It can find trackers that achieve superior performance compared to handcrafted SOTA trackers, such as SiamRPN++ and Ocean.
arXiv Detail & Related papers (2021-04-29T17:55:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.