Optimized Information Flow for Transformer Tracking
- URL: http://arxiv.org/abs/2402.08195v1
- Date: Tue, 13 Feb 2024 03:39:15 GMT
- Title: Optimized Information Flow for Transformer Tracking
- Authors: Janani Kugarajeevan, Thanikasalam Kokul, Amirthalingam Ramanan, Subha Fernando
- Abstract summary: One-stream Transformer trackers have shown outstanding performance in challenging benchmark datasets.
We propose a novel OIFTrack framework to enhance the discriminative capability of the tracker.
- Score: 0.7199733380797579
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One-stream Transformer trackers have shown outstanding performance in
challenging benchmark datasets over the last three years, as they enable
interaction between the target template and search region tokens to extract
target-oriented features with mutual guidance. Previous approaches allow free
bidirectional information flow between template and search tokens without
investigating their influence on the tracker's discriminative capability. In
this work, we conduct a detailed study of the information flow of the tokens
and, based on the findings, propose a novel Optimized Information Flow
Tracking (OIFTrack) framework to enhance the discriminative capability of the
tracker. The proposed OIFTrack blocks the interaction from all search tokens to
target template tokens in early encoder layers, as the large number of
non-target tokens in the search region diminishes the importance of
target-specific features. In the deeper encoder layers of the proposed tracker,
search tokens are partitioned into target search tokens and non-target search
tokens, allowing bidirectional flow between target search tokens and template
tokens to capture the appearance changes of the target. In addition, since the
proposed tracker incorporates dynamic background cues, distractor objects are
successfully avoided by capturing the surrounding information of the target.
OIFTrack demonstrated outstanding performance in challenging benchmarks,
particularly excelling in the one-shot tracking benchmark GOT-10k, achieving an
average overlap of 74.6%. The code, models, and results of this work are
available at https://github.com/JananiKugaa/OIFTrack
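The information-flow rule described in the abstract can be pictured as an attention mask. The following is a minimal sketch, not the authors' implementation: it assumes tokens are ordered as template tokens followed by search tokens, and the function name `build_attention_mask` and the parameter `target_search_idx` are hypothetical. How OIFTrack actually selects target search tokens, and how it handles the dynamic background cues, is not stated in the abstract and is not modelled here.

```python
def build_attention_mask(num_template, num_search, target_search_idx=None):
    """Return mask[q][k] == True where query token q may attend to key token k.

    Assumption: tokens are ordered as [template..., search...].
    target_search_idx (hypothetical input): 0-based indices, within the search
    tokens, of the tokens treated as target search tokens. None or an empty
    list models an early layer, where no search token may influence the
    template tokens.
    """
    total = num_template + num_search
    # Absolute positions of the search tokens that template queries may read.
    allowed_keys = {num_template + i for i in (target_search_idx or [])}

    # Start from free interaction, then restrict the template rows only:
    # search queries still see every token in both kinds of layer.
    mask = [[True] * total for _ in range(total)]
    for q in range(num_template):
        for k in range(num_template, total):
            mask[q][k] = k in allowed_keys
    return mask


# Early layer: flow from search tokens to template tokens is fully blocked.
early = build_attention_mask(num_template=2, num_search=4)

# Deeper layer: only the (assumed) target search tokens 1 and 2 reach the template.
deep = build_attention_mask(num_template=2, num_search=4, target_search_idx=[1, 2])
```

Attention implementations differ on whether `True` in a boolean mask means "attend" or "mask out", so the convention should be checked before such a mask is passed to a library.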
Related papers
- Multi-object Tracking by Detection and Query: an efficient end-to-end manner [23.926668750263488]
Multi-object tracking is advancing through two dominant paradigms: traditional tracking by detection and newly emerging tracking by query.
We propose the tracking-by-detection-and-query paradigm, which is achieved by a Learnable Associator.
Compared to tracking-by-query models, LAID achieves competitive tracking accuracy with notably higher training efficiency.
arXiv Detail & Related papers (2024-11-09T14:38:08Z)
- RTracker: Recoverable Tracking via PN Tree Structured Memory [71.05904715104411]
We propose a recoverable tracking framework, RTracker, that uses a tree-structured memory to dynamically associate a tracker and a detector to enable self-recovery.
Specifically, we propose a Positive-Negative Tree-structured memory to chronologically store and maintain positive and negative target samples.
Our core idea is to use the support samples of positive and negative target categories to establish a relative distance-based criterion for a reliable assessment of target loss.
arXiv Detail & Related papers (2024-03-28T08:54:40Z)
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on the MOT17 and MOT20 datasets while reaching state-of-the-art performance on the DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z)
- Target-Aware Tracking with Long-term Context Attention [8.20858704675519]
Long-term context attention (LCA) module can perform extensive information fusion on the target and its context from long-term frames.
LCA uses the target state from the previous frame to exclude the interference of similar objects and complex backgrounds.
Our tracker achieves state-of-the-art performance on multiple benchmarks, with 71.1% AUC, 89.3% NP, and 73.0% AO on LaSOT, TrackingNet, and GOT-10k.
arXiv Detail & Related papers (2023-02-27T14:40:58Z)
- Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework [76.70603443624012]
We propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling.
In this way, discriminative target-oriented features can be dynamically extracted by mutual guidance.
OSTrack achieves state-of-the-art performance on multiple benchmarks; in particular, it shows impressive results on the one-shot tracking benchmark GOT-10k.
arXiv Detail & Related papers (2022-03-22T18:37:11Z)
- Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking [82.34356879078955]
We propose a compact memory embedding to enhance the discrimination of the segmentation-based deformable visual tracking method.
Our method outperforms the excellent segmentation-based trackers, i.e., D3S and SiamMask, on the DAVIS 2017 benchmark.
arXiv Detail & Related papers (2021-11-23T03:07:12Z)
- Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters [2.3526458707956643]
This paper presents a novel deep learning tracking algorithm.
We exploit the generalization ability of deep features to coarsely estimate target translation.
Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object.
arXiv Detail & Related papers (2020-12-23T16:43:21Z)
- Graph Attention Tracking [76.19829750144564]
We propose a simple target-aware Siamese graph attention network for general object tracking.
Experiments on challenging benchmarks including GOT-10k, UAV123, OTB-100 and LaSOT demonstrate that the proposed SiamGAT outperforms many state-of-the-art trackers.
arXiv Detail & Related papers (2020-11-23T04:26:45Z)
- Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm.
We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.