Lightweight Full-Convolutional Siamese Tracker
- URL: http://arxiv.org/abs/2310.05392v3
- Date: Fri, 12 Jan 2024 12:34:04 GMT
- Title: Lightweight Full-Convolutional Siamese Tracker
- Authors: Yunfeng Li, Bo Wang, Xueyi Wu, Zhuoyan Liu, Ye Li
- Abstract summary: This paper proposes a lightweight full-convolutional Siamese tracker called LightFC.
LightFC employs a novel efficient cross-correlation module and a novel efficient rep-center head.
Experiments show that LightFC achieves the optimal balance among performance, parameters, FLOPs, and FPS.
- Score: 4.903759699116597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although single object trackers have achieved advanced performance,
their large-scale models hinder deployment on resource-limited platforms.
Moreover, existing lightweight trackers balance only two or three of the four
criteria of parameters, performance, FLOPs, and FPS. To achieve the optimal
balance among all four, this paper proposes a lightweight full-convolutional
Siamese tracker called LightFC. LightFC employs a novel efficient
cross-correlation module (ECM) and a novel efficient rep-center head (ERH) to
improve the feature representation of the convolutional tracking pipeline. The
ECM uses an attention-like module design: it performs spatial and channel
linear fusion of the fused features and enhances their nonlinearity. It also
draws on successful elements of current lightweight trackers by introducing
skip connections and reusing search-area features. The ERH reparameterizes the
feature-dimension stage of the standard center head and introduces channel
attention to optimize the bottleneck of key feature flows. Comprehensive
experiments show that LightFC achieves the optimal balance among performance,
parameters, FLOPs, and FPS. LightFC outperforms MixFormerV2-S in precision
score on LaSOT and TNL2K by 3.7% and 6.5%, respectively, while using 5x fewer
parameters and 4.6x fewer FLOPs. In addition, LightFC runs 2x faster than
MixFormerV2-S on CPUs. A higher-performance version, LightFC-vit, is also
proposed by swapping in a more powerful backbone network. The code and raw
results can be found at https://github.com/LiYunfengLYF/LightFC.
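The abstract's two building blocks, a Siamese cross-correlation followed by an attention-like channel gate (the general pattern behind the ECM) and structural reparameterization of parallel convolution branches (the idea behind the ERH), can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; all shapes and function names here are hypothetical, and the gate shown is a generic squeeze-and-excitation style stand-in.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain single-channel 2D correlation (valid padding, stride 1)."""
    Hx, Wx = x.shape
    Hk, Wk = k.shape
    out = np.zeros((Hx - Hk + 1, Wx - Wk + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + Hk, j:j + Wk] * k)
    return out

def depthwise_xcorr(search, template):
    """Depthwise cross-correlation: each search-region channel is
    correlated with its matching template channel, as in Siamese trackers."""
    return np.stack([conv2d_valid(s, t) for s, t in zip(search, template)])

def channel_attention(x):
    """SE-style gate: per-channel global average pool -> sigmoid -> rescale."""
    gate = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))
    return x * gate[:, None, None]

rng = np.random.default_rng(0)
search = rng.standard_normal((4, 8, 8))    # 4-channel search-region features
template = rng.standard_normal((4, 3, 3))  # matching template features
fused = channel_attention(depthwise_xcorr(search, template))
print(fused.shape)  # (4, 6, 6)

# Structural reparameterization in one line of algebra: a parallel
# 3x3 + 1x1 branch pair collapses into a single 3x3 kernel at inference
# time, because convolution is linear in the kernel.
k3 = rng.standard_normal((3, 3))
k1 = rng.standard_normal((1, 1))
merged = k3 + np.pad(k1, 1)                # embed the 1x1 at the 3x3 center
x = rng.standard_normal((8, 8))
two_branch = conv2d_valid(x, k3) + conv2d_valid(x, k1)[1:-1, 1:-1]
assert np.allclose(two_branch, conv2d_valid(x, merged))
```

The final assertion checks the core reparameterization identity: the merged 3x3 kernel reproduces the two-branch output exactly, so the extra branch adds representational capacity during training at no inference-time cost.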
Related papers
- RepNeXt: A Fast Multi-Scale CNN using Structural Reparameterization [8.346566205092433]
Lightweight Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) are favored for their parameter efficiency and low latency.
This study investigates the complementary advantages of CNNs and ViTs to develop a versatile vision backbone tailored for resource-constrained applications.
arXiv Detail & Related papers (2024-06-23T04:11:12Z)
- Lightweight Salient Object Detection in Optical Remote-Sensing Images via Semantic Matching and Edge Alignment [61.45639694373033]
We propose SeaNet, a novel lightweight network for salient object detection in optical remote sensing images (ORSI-SOD), based on semantic matching and edge alignment.
Specifically, SeaNet includes a lightweight MobileNet-V2 for feature extraction, a dynamic semantic matching module (DSMM) for high-level features, and a portable decoder for inference.
arXiv Detail & Related papers (2023-01-07T04:33:51Z)
- Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net), which expand network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
- SlimFL: Federated Learning with Superposition Coding over Slimmable Neural Networks [56.68149211499535]
Federated learning (FL) is a key enabler for efficient communication and computing leveraging devices' distributed computing capabilities.
This paper proposes a novel learning framework that integrates FL with width-adjustable slimmable neural networks (SNNs).
We propose a communication- and energy-efficient SNN-based FL scheme (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models.
arXiv Detail & Related papers (2022-03-26T15:06:13Z)
- FEAR: Fast, Efficient, Accurate and Robust Visual Tracker [2.544539499281093]
We present FEAR, a novel, fast, efficient, accurate, and robust Siamese visual tracker.
The FEAR-XS tracker is 2.4x smaller and 4.3x faster than LightTrack [62], with superior accuracy.
arXiv Detail & Related papers (2021-12-15T08:28:55Z)
- Neural Calibration for Scalable Beamforming in FDD Massive MIMO with Implicit Channel Estimation [10.775558382613077]
Channel estimation and beamforming play critical roles in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems.
We propose a deep learning-based approach that directly optimizes the beamformers at the base station according to the received uplink pilots.
A neural calibration method is proposed to improve the scalability of the end-to-end design.
arXiv Detail & Related papers (2021-08-03T14:26:14Z)
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often occupy a large number of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
- MicroNet: Towards Image Recognition with Extremely Low FLOPs [117.96848315180407]
MicroNet is an efficient convolutional neural network using extremely low computational cost.
A family of MicroNets achieve a significant performance gain over the state-of-the-art in the low FLOP regime.
For instance, MicroNet-M1 achieves 61.1% top-1 accuracy on ImageNet classification with 12 MFLOPs, outperforming MobileNetV3 by 11.3%.
arXiv Detail & Related papers (2020-11-24T18:59:39Z)
- Federated Learning via Intelligent Reflecting Surface [30.935389187215474]
Over-the-air computation (AirComp) based federated learning (FL) is capable of achieving fast model aggregation by exploiting the waveform superposition property of multiple-access channels.
In this paper, we propose a two-step optimization framework to achieve fast yet reliable model aggregation for AirComp-based FL.
Simulation results demonstrate that our proposed framework and the deployment of an IRS achieve a lower training loss and higher FL prediction accuracy than the baseline algorithms.
arXiv Detail & Related papers (2020-11-10T11:29:57Z)
- Optimization-driven Machine Learning for Intelligent Reflecting Surfaces Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surfaces (IRSs) have been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches for improving performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.