CFTrack: Enhancing Lightweight Visual Tracking through Contrastive Learning and Feature Matching
- URL: http://arxiv.org/abs/2502.19705v1
- Date: Thu, 27 Feb 2025 02:46:00 GMT
- Title: CFTrack: Enhancing Lightweight Visual Tracking through Contrastive Learning and Feature Matching
- Authors: Juntao Liang, Jun Hou, Weijun Zhang, Yong Wang
- Abstract summary: CFTrack is a lightweight tracker that integrates contrastive learning and feature matching to enhance discriminative feature representations. We show that CFTrack surpasses many state-of-the-art lightweight trackers, operating at 136 frames per second on the NVIDIA Jetson NX platform.
- Score: 7.205438642578179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving both efficiency and strong discriminative ability in lightweight visual tracking is a challenge, especially on mobile and edge devices with limited computational resources. Conventional lightweight trackers often struggle with robustness under occlusion and interference, while deep trackers, when compressed to meet resource constraints, suffer from performance degradation. To address these issues, we introduce CFTrack, a lightweight tracker that integrates contrastive learning and feature matching to enhance discriminative feature representations. CFTrack dynamically assesses target similarity during prediction through a novel contrastive feature matching module optimized with an adaptive contrastive loss, thereby improving tracking accuracy. Extensive experiments on LaSOT, OTB100, and UAV123 show that CFTrack surpasses many state-of-the-art lightweight trackers, operating at 136 frames per second on the NVIDIA Jetson NX platform. Results on the HOOT dataset further demonstrate CFTrack's strong discriminative ability under heavy occlusion.
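The abstract describes a contrastive feature matching module trained with an adaptive contrastive loss to score target similarity during prediction. The paper does not give the exact formulation, so the following is only a minimal sketch of the general idea, using a standard InfoNCE-style loss over cosine similarities; all names and shapes here are illustrative assumptions, not CFTrack's actual implementation.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def contrastive_matching_loss(template, positives, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative stand-in, not the
    paper's adaptive loss): pull positive candidate features toward the
    template embedding, push negative (distractor) features away.
    Shapes: template (1, d), positives (p, d), negatives (n, d)."""
    pos = np.exp(cosine_sim(template, positives) / temperature)  # (1, p)
    neg = np.exp(cosine_sim(template, negatives) / temperature)  # (1, n)
    return float(-np.log(pos.sum() / (pos.sum() + neg.sum())))
```

When the positives already align with the template and the negatives point away, the loss is near zero; ambiguous distractors drive it up, which is the discriminative pressure the abstract attributes to the matching module.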
Related papers
- Layer-Guided UAV Tracking: Enhancing Efficiency and Occlusion Robustness [12.719243469290346]
LGTrack is a unified UAV tracking framework that integrates dynamic layer selection, efficient feature enhancement, and robust representation learning. Experiments on three datasets demonstrate LGTrack's state-of-the-art real-time speed (258.7 FPS on UAVDT).
arXiv Detail & Related papers (2026-02-14T07:02:25Z) - What You Have is What You Track: Adaptive and Robust Multimodal Tracking [72.92244578461869]
We present the first comprehensive study on tracker performance with temporally incomplete multimodal data. Our model achieves SOTA performance across 9 benchmarks, excelling in both conventional complete and missing modality settings.
arXiv Detail & Related papers (2025-07-08T11:40:21Z) - Efficient Motion Prompt Learning for Robust Visual Tracking [58.59714916705317]
We propose a lightweight and plug-and-play motion prompt tracking method. It can be easily integrated into existing vision-based trackers to build a joint tracking framework. Experiments on seven tracking benchmarks demonstrate that the proposed motion module significantly improves the robustness of vision-based trackers.
arXiv Detail & Related papers (2025-05-22T07:22:58Z) - Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach [32.91982063297922]
We propose a novel Slow-Fast Tracking paradigm that flexibly adapts to different operational requirements, termed SFTrack. The proposed framework supports two complementary modes, i.e., a high-precision slow tracker for scenarios with sufficient computational resources, and an efficient fast tracker tailored for latency-aware, resource-constrained environments. Our framework first performs graph-based representation learning from high-temporal-resolution event streams, and then integrates the learned graph-structured information into two FlashAttention-based vision backbones.
arXiv Detail & Related papers (2025-05-19T09:37:23Z) - DARTer: Dynamic Adaptive Representation Tracker for Nighttime UAV Tracking [1.515687944002438]
Nighttime UAV tracking presents significant challenges due to extreme illumination variations and viewpoint changes. DARTer (Dynamic Adaptive Representation Tracker) is an end-to-end tracking framework designed for nighttime UAV scenarios.
arXiv Detail & Related papers (2025-05-01T05:24:14Z) - LiteTracker: Leveraging Temporal Causality for Accurate Low-latency Tissue Tracking [84.52765560227917]
LiteTracker is a low-latency method for tissue tracking in endoscopic video streams.
LiteTracker builds on a state-of-the-art long-term point tracking method, and introduces a set of training-free runtime optimizations.
arXiv Detail & Related papers (2025-04-14T05:53:57Z) - Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z) - Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.
DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.
Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z) - Learning Disentangled Representation with Mutual Information Maximization for Real-Time UAV Tracking [1.0541541376305243]
This paper exploits disentangled representation with mutual information (DR-MIM) to improve precision and efficiency for UAV tracking.
Our DR-MIM tracker significantly outperforms state-of-the-art UAV tracking methods.
arXiv Detail & Related papers (2023-08-20T13:16:15Z) - SCTracker: Multi-object tracking with shape and confidence constraints [11.210661553388615]
This paper proposes a multi-object tracker based on shape constraint and confidence named SCTracker.
Intersection of Union distance with shape constraints is applied to calculate the cost matrix between tracks and detections.
The Kalman Filter based on the detection confidence is used to update the motion state to improve the tracking performance when the detection has low confidence.
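The SCTracker summary describes an IoU-based cost matrix with shape constraints between tracks and detections. As a rough, hedged sketch of what such a cost term could look like (the exact formulation in the paper may differ; the `shape_weight` parameter and penalty form here are assumptions for illustration):

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def shape_constrained_cost(track_box, det_box, shape_weight=0.5):
    """Cost = (1 - IoU) plus a penalty for mismatched width/height,
    so detections with a very different shape are disfavoured even
    when they overlap the track well (illustrative formulation)."""
    tw, th = track_box[2] - track_box[0], track_box[3] - track_box[1]
    dw, dh = det_box[2] - det_box[0], det_box[3] - det_box[1]
    shape_penalty = (abs(tw - dw) / max(tw, dw)
                     + abs(th - dh) / max(th, dh))
    return (1.0 - iou(track_box, det_box)) + shape_weight * shape_penalty
```

Filling a matrix of such costs for every track/detection pair and solving the assignment (e.g. with the Hungarian algorithm) yields the matching step the summary refers to.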
arXiv Detail & Related papers (2023-05-16T15:18:42Z) - DEFT: Detection Embeddings for Tracking [3.326320568999945]
We propose an efficient joint detection and tracking model named DEFT.
Our approach relies on an appearance-based object matching network jointly-learned with an underlying object detection network.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards.
arXiv Detail & Related papers (2021-02-03T20:00:44Z) - Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Deep learning-based trackers based on LSTMs (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative.
DenseLSTMs outperform Residual and regular LSTM, and offer a higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z) - Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination [202.2562153608092]
We propose a cascaded regression tracker with two sequential stages.
In the first stage, we filter out abundant easily-identified negative candidates.
In the second stage, a discrete sampling based ridge regression is designed to double-check the remaining ambiguous hard samples.
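The second stage described above re-scores ambiguous hard samples with a ridge regression. The paper's discrete-sampling variant is not specified here, but the closed-form ridge step itself can be sketched as follows (function names and the two-stage split are illustrative assumptions):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rescore_hard_samples(train_feats, train_labels, hard_feats, lam=1.0):
    """Stage two (sketch): fit a ridge regressor on the samples kept by
    stage one, then re-score the remaining ambiguous hard samples."""
    w = ridge_fit(train_feats, train_labels, lam)
    return hard_feats @ w
```

The regularizer `lam` keeps the solve well-conditioned even with few training samples, which is why ridge regression is a common choice for this kind of online double-check.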
arXiv Detail & Related papers (2020-06-18T07:48:01Z) - Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt a "wider" residual network ResNeXt as its feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z) - Rethinking Convolutional Features in Correlation Filter Based Tracking [0.0]
We revisit a hierarchical deep feature-based visual tracker and find that both the performance and efficiency of the deep tracker are limited by the poor feature quality.
After removing redundant features, our proposed tracker achieves significant improvements in both performance and efficiency.
arXiv Detail & Related papers (2019-12-30T04:39:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.