BACTrack: Building Appearance Collection for Aerial Tracking
- URL: http://arxiv.org/abs/2312.06136v1
- Date: Mon, 11 Dec 2023 05:55:59 GMT
- Title: BACTrack: Building Appearance Collection for Aerial Tracking
- Authors: Xincong Liu, Tingfa Xu, Ying Wang, Zhinong Yu, Xiaoying Yuan, Haolin
Qin, and Jianan Li
- Abstract summary: Building Appearance Collection Tracking builds a dynamic collection of target templates online and performs efficient multi-template matching to achieve robust tracking.
BACTrack achieves top performance on four challenging aerial tracking benchmarks while maintaining an impressive speed of over 87 FPS on a single GPU.
- Score: 13.785254511683966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Siamese network-based trackers have shown remarkable success in aerial
tracking. Most previous works, however, usually perform template matching only
between the initial template and the search region and thus fail to deal with
rapidly changing targets that often appear in aerial tracking. As a remedy,
this work presents Building Appearance Collection Tracking (BACTrack). This
simple yet effective tracking framework builds a dynamic collection of target
templates online and performs efficient multi-template matching to achieve
robust tracking. Specifically, BACTrack mainly comprises a Mixed-Temporal
Transformer (MTT) and an appearance discriminator. The former is responsible
for efficiently building relationships between the search region and multiple
target templates in parallel through a mixed-temporal attention mechanism. At
the same time, the appearance discriminator employs an online adaptive
template-update strategy to ensure that the collected multiple templates remain
reliable and diverse, allowing them to closely follow rapid changes in the
target's appearance and suppress background interference during tracking.
Extensive experiments show that our BACTrack achieves top performance on four
challenging aerial tracking benchmarks while maintaining an impressive speed of
over 87 FPS on a single GPU. Speed tests on embedded platforms also validate
our potential suitability for deployment on UAV platforms.
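The abstract's two core ideas, matching the search region against a collection of templates and updating that collection only with confident, diverse observations, can be illustrated with a minimal sketch. Everything below is hypothetical: the class name, feature vectors, thresholds, and the scalar cosine-similarity matching (a stand-in for the paper's mixed-temporal attention) are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class TemplateCollection:
    """Hypothetical sketch of a BACTrack-style online template collection."""

    def __init__(self, capacity=4, conf_thresh=0.6, div_thresh=0.95):
        self.capacity = capacity        # max templates kept online
        self.conf_thresh = conf_thresh  # only trust confident detections
        self.div_thresh = div_thresh    # reject near-duplicate templates
        self.templates = []             # list of feature vectors

    def match(self, search_feat):
        # Multi-template matching: softmax-weighted similarity over all
        # stored templates (a scalar stand-in for attention in parallel).
        sims = [cosine(search_feat, t) for t in self.templates]
        exps = [math.exp(s) for s in sims]
        z = sum(exps)
        return sum((e / z) * s for e, s in zip(exps, sims))

    def maybe_update(self, feat, confidence):
        # Adaptive update: keep the collection reliable (high confidence)
        # and diverse (not too similar to any existing template).
        if confidence < self.conf_thresh:
            return False
        if any(cosine(feat, t) > self.div_thresh for t in self.templates):
            return False
        if len(self.templates) >= self.capacity:
            self.templates.pop(0)       # evict the oldest template
        self.templates.append(feat)
        return True
```

The eviction-by-age and the diversity threshold are simplifications; the paper's appearance discriminator decides reliability and diversity with learned components rather than fixed cosine thresholds.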
Related papers
- STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT)
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z) - Track Anything Rapter (TAR) [0.0]
Track Anything Rapter (TAR) is designed to detect, segment, and track objects of interest based on user-provided multimodal queries.
TAR utilizes cutting-edge pre-trained models like DINO, CLIP, and SAM to estimate the relative pose of the queried object.
We showcase how the integration of these foundational models with a custom high-level control algorithm results in a highly stable and precise tracking system.
arXiv Detail & Related papers (2024-05-19T19:51:41Z) - Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers [55.46413719810273]
Rich spatio-temporal information is crucial for handling the complicated target appearance changes in visual tracking.
Our method improves the tracker's performance on six popular tracking benchmarks.
arXiv Detail & Related papers (2024-03-15T02:39:26Z) - Multi-step Temporal Modeling for UAV Tracking [14.687636301587045]
We introduce MT-Track, a streamlined and efficient multi-step temporal modeling framework for enhanced UAV tracking.
We unveil a unique temporal correlation module that dynamically assesses the interplay between the template and search region features.
We propose a mutual transformer module to refine the correlation maps of historical and current frames by modeling the temporal knowledge in the tracking sequence.
arXiv Detail & Related papers (2024-03-07T09:48:13Z) - ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking [0.5371337604556311]
Efficiently modeling spatio-temporal relations of objects is a key challenge in visual object tracking (VOT).
Existing methods track by appearance-based similarity or long-term relation modeling, resulting in rich temporal contexts between consecutive frames being easily overlooked.
In this paper we present ACTrack, a new tracking framework with additive spatio-temporal conditions. It preserves the quality and capabilities of the pre-trained backbone by freezing its parameters, and adds a trainable lightweight additive net to model temporal relations in tracking.
We design an additive Siamese convolutional network to ensure the integrity of spatial features and the temporal sequence.
arXiv Detail & Related papers (2024-02-27T07:34:08Z) - Correlation-Embedded Transformer Tracking: A Single-Branch Framework [69.0798277313574]
We propose a novel single-branch tracking framework inspired by the transformer.
Unlike the Siamese-like feature extraction, our tracker deeply embeds cross-image feature correlation in multiple layers of the feature network.
The output features can be directly used for predicting target locations without additional correlation steps.
arXiv Detail & Related papers (2024-01-23T13:20:57Z) - Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z) - Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in our UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
arXiv Detail & Related papers (2022-03-29T01:38:49Z) - Joint Feature Learning and Relation Modeling for Tracking: A One-Stream
Framework [76.70603443624012]
We propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling.
In this way, discriminative target-oriented features can be dynamically extracted by mutual guidance.
OSTrack achieves state-of-the-art performance on multiple benchmarks, in particular, it shows impressive results on the one-shot tracking benchmark GOT-10k.
arXiv Detail & Related papers (2022-03-22T18:37:11Z) - Learning Dynamic Compact Memory Embedding for Deformable Visual Object
Tracking [82.34356879078955]
We propose a compact memory embedding to enhance the discrimination of the segmentation-based deformable visual tracking method.
Our method outperforms the excellent segmentation-based trackers, i.e., D3S and SiamMask on DAVIS 2017 benchmark.
arXiv Detail & Related papers (2021-11-23T03:07:12Z) - Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted
Multicuts [11.72025865314187]
We present an unsupervised multiple object tracking approach based on minimum visual features and lifted multicuts.
We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.
arXiv Detail & Related papers (2020-02-04T09:42:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.