Target Transformed Regression for Accurate Tracking
- URL: http://arxiv.org/abs/2104.00403v1
- Date: Thu, 1 Apr 2021 11:25:23 GMT
- Title: Target Transformed Regression for Accurate Tracking
- Authors: Yutao Cui, Cheng Jiang, Limin Wang and Gangshan Wu
- Abstract summary: This paper repurposes a Transformer-like regression branch, termed Target Transformed Regression (TREG), for accurate anchor-free tracking.
The core of TREG is to model pair-wise relations between elements in the target template and the search region, and to use the resulting target-enhanced visual representation for accurate bounding box regression.
In addition, we devise a simple online template update mechanism that selects reliable templates, increasing robustness to appearance variations and geometric deformations of the target over time.
- Score: 30.516462193231888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate tracking is still a challenging task due to appearance
variations, pose and view changes, and geometric deformations of the target
in videos. Recent anchor-free trackers provide an efficient regression
mechanism but fail to produce precise bounding box estimates. To address
these issues, this paper repurposes a Transformer-like regression branch,
termed Target Transformed Regression (TREG), for accurate anchor-free
tracking. The core of TREG is to model pair-wise relations between elements
in the target template and the search region, and to use the resulting
target-enhanced visual representation for accurate bounding box regression.
This target-contextualized representation strengthens target-relevant
information to help locate box boundaries precisely, and copes with object
deformation to some extent thanks to its local and dense matching mechanism.
In addition, we devise a simple online template update mechanism that
selects reliable templates, increasing robustness to appearance variations
and geometric deformations of the target over time. Experimental results on
visual tracking benchmarks including VOT2018, VOT2019, OTB100, GOT10k, NFS,
UAV123, LaSOT and TrackingNet demonstrate that TREG obtains state-of-the-art
performance, achieving a success rate of 0.640 on LaSOT while running at
around 30 FPS. The code and models will be made available at
https://github.com/MCG-NJU/TREG.
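For intuition, the sketch below shows one way to realize target transformed regression in PyTorch: every search-region location attends to every template element (dense pair-wise matching), and the resulting target-enhanced features feed an anchor-free regression head that predicts distances to the four box boundaries. The module names, feature shapes, and the confidence-gated template update are illustrative assumptions, not the released TREG code.

```python
import torch
import torch.nn as nn

class TargetTransformedRegression(nn.Module):
    """Sketch: enhance search features with template features via
    cross-attention, then regress per-location box offsets (l, t, r, b)."""

    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.reg_head = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(),
            nn.Linear(dim, 4),  # distances to the four box boundaries
        )

    def forward(self, search, template):
        # search:   (B, H*W, C) flattened search-region features
        # template: (B, N, C)   flattened target-template features
        # Each search location attends to all template elements,
        # yielding a target-enhanced representation.
        enhanced, _ = self.cross_attn(query=search, key=template, value=template)
        enhanced = self.norm(search + enhanced)  # residual connection
        return self.reg_head(enhanced).relu()    # non-negative offsets

def update_template(old, new, confidence, threshold=0.9):
    """Sketch of a confidence-gated online update: only adopt the new
    template when the tracker's score for it is reliable."""
    return new if confidence > threshold else old

# Toy usage with random features (shapes are illustrative).
model = TargetTransformedRegression()
offsets = model(torch.randn(1, 22 * 22, 256), torch.randn(1, 8 * 8, 256))
print(offsets.shape)  # torch.Size([1, 484, 4])
```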
Related papers
- Stanceformer: Target-Aware Transformer for Stance Detection [59.69858080492586]
Stance Detection involves discerning the stance expressed in a text towards a specific subject or target.
Prior works have relied on existing transformer models that lack the capability to prioritize targets effectively.
We introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference.
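The summary suggests attention that is explicitly biased towards target tokens. A minimal sketch of one such scheme is below: an additive bias raises the attention scores of keys inside the target span. The bias form and the target_mask input are assumptions for illustration, not Stanceformer's exact target-awareness mechanism.

```python
import torch

def target_aware_attention(q, k, v, target_mask, boost=1.0):
    """Sketch: scaled dot-product attention with an additive bias that
    favors keys at target-token positions (target_mask = 1)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5         # (B, L, L)
    scores = scores + boost * target_mask.unsqueeze(1)  # bias target keys
    return torch.softmax(scores, dim=-1) @ v

# Toy usage: a 6-token sequence whose target span is tokens 2-3.
q = k = v = torch.randn(1, 6, 32)
mask = torch.tensor([[0., 0., 1., 1., 0., 0.]])
out = target_aware_attention(q, k, v, mask)  # (1, 6, 32)
```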
arXiv Detail & Related papers (2024-10-09T17:24:28Z)
- Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection
We introduce a novel robust linear regression estimator, which achieves favorable performance when the error vector follows an i.i.d. Gaussian-Laplacian distribution.
In addition, we extend IGDTS to a generative tracker, and apply the IGDTS-distance to measure the deviation between the sample and the model.
Experimental results on several challenging image sequences show that the proposed tracker outperforms existing trackers.
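A minimal sketch of the estimator's flavor: at each step, the samples with the largest residuals are set aside as presumed Laplacian outliers (threshold selection) and a gradient step is taken on the remaining inliers. The keep fraction, step size, and update rule are assumptions, not the exact IGDTS procedure.

```python
import numpy as np

def igdts(X, y, lr=0.01, steps=500, keep=0.8):
    """Sketch: iterative gradient descent with threshold selection for
    robust linear regression under Gaussian noise plus sparse outliers."""
    n, d = X.shape
    w = np.zeros(d)
    m = int(keep * n)
    for _ in range(steps):
        r = X @ w - y
        inliers = np.argsort(np.abs(r))[:m]      # threshold selection
        w -= lr * X[inliers].T @ r[inliers] / m  # step on inliers only
    return w

# Toy usage: Gaussian noise plus a few large Laplacian corruptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=200)
y[:20] += rng.laplace(scale=5.0, size=20)  # outlier-corrupted samples
print(igdts(X, y))  # close to [1.0, -2.0, 0.5]
```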
arXiv Detail & Related papers (2024-06-02T01:51:09Z)
- Target-Aware Tracking with Long-term Context Attention [8.20858704675519]
The long-term context attention (LCA) module performs extensive information fusion on the target and its context from long-term frames.
LCA uses the target state from the previous frame to exclude the interference of similar objects and complex backgrounds.
Our tracker achieves state-of-the-art performance on multiple benchmarks, with 71.1% AUC on LaSOT, 89.3% NP on TrackingNet, and 73.0% AO on GOT-10k.
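A minimal sketch of long-term context fusion: the current search features attend to a rolling memory of feature maps from earlier frames, so information about the target and its context accumulates over time. The memory policy and shapes are assumptions; the actual LCA module also conditions on the previous frame's target state, which this sketch omits.

```python
import torch
import torch.nn as nn

class LongTermContextAttention(nn.Module):
    """Sketch: fuse current search features with a memory of past frames."""

    def __init__(self, dim=256, heads=8, mem_frames=5):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mem_frames = mem_frames
        self.memory = []  # (B, N, C) feature maps from earlier frames

    def forward(self, search):
        if not self.memory:                      # first frame: no context yet
            return search
        context = torch.cat(self.memory, dim=1)  # long-term key/value bank
        fused, _ = self.attn(search, context, context)
        return search + fused                    # residual fusion

    def update(self, frame_feats):
        self.memory.append(frame_feats.detach())
        self.memory = self.memory[-self.mem_frames:]  # keep recent frames

# Toy usage over a short sequence.
lca = LongTermContextAttention()
for _ in range(3):
    feats = torch.randn(1, 18 * 18, 256)
    out = lca(feats)   # fused with long-term context after frame 0
    lca.update(feats)
```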
arXiv Detail & Related papers (2023-02-27T14:40:58Z)
- Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric [53.88188265943762]
We propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack) that handles both color frames and event data simultaneously.
Our proposed CEUTrack is simple, effective, and efficient, achieving over 75 FPS and new state-of-the-art performance.
arXiv Detail & Related papers (2022-11-20T16:01:31Z)
- Context-aware Visual Tracking with Joint Meta-updating [11.226947525556813]
We propose a context-aware tracking model that optimizes the tracker over the representation space, jointly meta-updating both branches by exploiting information along the whole sequence.
The proposed tracking method achieves an EAO score of 0.514 on VOT2018 at 40 FPS, demonstrating its ability to improve the accuracy and robustness of the underlying tracker with little loss of speed.
arXiv Detail & Related papers (2022-04-04T14:16:00Z)
- Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing them to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
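A minimal sketch of transformer-based target-model prediction: an encoder digests template tokens and emits the weights of a simple linear target model, which is then correlated with search features to score target presence. Reducing the model to a single pooled 1x1 filter is an assumption for brevity, not the paper's actual predictor.

```python
import torch
import torch.nn as nn

class TransformerModelPredictor(nn.Module):
    """Sketch: a transformer predicts a target model from template tokens."""

    def __init__(self, dim=256, heads=8, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)

    def forward(self, template, search):
        # template: (B, N, C) target tokens; search: (B, M, C)
        encoded = self.encoder(template)
        filt = encoded.mean(dim=1, keepdim=True)  # (B, 1, C) predicted model
        return (search * filt).sum(-1)            # (B, M) target score map

predictor = TransformerModelPredictor()
scores = predictor(torch.randn(1, 64, 256), torch.randn(1, 484, 256))
print(scores.shape)  # torch.Size([1, 484])
```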
arXiv Detail & Related papers (2022-03-21T17:59:40Z)
- Generative Target Update for Adaptive Siamese Tracking [7.662745552551165]
Siamese trackers perform similarity matching with templates (i.e., target models) to localize objects within a search region.
Several strategies have been proposed in the literature to update a template based on the tracker output, typically extracted from the target search region in the current frame.
This paper proposes a model adaptation method for Siamese trackers that uses a generative model to produce a synthetic template from the object search regions of several previous frames.
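A minimal sketch of the idea: rather than overwriting the template with the latest (possibly noisy) tracker output, a small generator fuses target crops from several previous frames into one synthetic template. The conv encoder-decoder and frame count are illustrative assumptions, not the paper's generative model.

```python
import torch
import torch.nn as nn

class TemplateGenerator(nn.Module):
    """Sketch: synthesize one template from crops of several past frames."""

    def __init__(self, frames=4, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(frames * ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )

    def forward(self, crops):
        # crops: (B, frames, C, H, W) target crops from previous frames
        b, f, c, h, w = crops.shape
        return self.net(crops.reshape(b, f * c, h, w))  # (B, C, H, W)

gen = TemplateGenerator()
template = gen(torch.randn(1, 4, 3, 64, 64))  # synthetic 64x64 template
```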
arXiv Detail & Related papers (2022-02-21T00:22:49Z)
- Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking [82.34356879078955]
We propose a compact memory embedding to enhance the discrimination of the segmentation-based deformable visual tracking method.
Our method outperforms strong segmentation-based trackers such as D3S and SiamMask on the DAVIS 2017 benchmark.
arXiv Detail & Related papers (2021-11-23T03:07:12Z)
- Transformer Tracking [76.96796612225295]
Correlation plays a critical role in the tracking field, especially in popular Siamese-based trackers.
This work presents a novel attention-based feature fusion network, which effectively combines the template and search region features solely using attention.
Experiments show that our TransT achieves very promising results on six challenging datasets.
arXiv Detail & Related papers (2021-03-29T09:06:55Z)
- Learning Global Structure Consistency for Robust Object Tracking [57.736915865309165]
This work considers the transient variations of the whole scene.
We propose an effective and efficient short-term model that learns to exploit the global structure consistency in a short time.
We empirically verify that the proposed tracker can tackle the two challenging scenarios and validate it on large-scale benchmarks.
arXiv Detail & Related papers (2020-08-26T19:12:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.