SiamTHN: Siamese Target Highlight Network for Visual Tracking
- URL: http://arxiv.org/abs/2303.12304v1
- Date: Wed, 22 Mar 2023 04:33:02 GMT
- Title: SiamTHN: Siamese Target Highlight Network for Visual Tracking
- Authors: Jiahao Bao, Kaiqiang Chen, Xian Sun, Liangjin Zhao, Wenhui Diao,
Menglong Yan
- Abstract summary: Siamese network based trackers treat each channel in the feature maps generated by the backbone network equally.
There are no structural links between the classification and regression branches in these trackers, and the two branches are optimized separately during training.
A Target Highlight Module is proposed to make the generated similarity response maps focus more on the target region.
- Score: 11.111738354621595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Siamese network based trackers have developed rapidly in the field
of visual object tracking in recent years. The majority of siamese network
based trackers now in use treat each channel in the feature maps generated by
the backbone network equally, making the similarity response map sensitive to
background interference and hence difficult to focus on the target region.
Additionally, there are no structural links between the classification and
regression branches in these trackers, and the two branches are optimized
separately during training. As a result, there is a misalignment between the
classification and regression branches, which leads to less accurate tracking
results. In this paper, a Target Highlight Module is proposed to make the
generated similarity response maps focus more on the target region. To reduce
the misalignment and produce more precise tracking results, we propose a
corrective loss to train the model. The two branches of the model are jointly
tuned with the corrective loss to produce more reliable prediction results.
Experiments on five challenging benchmark datasets show that the method
outperforms current models while running at 38 fps, demonstrating both its
effectiveness and efficiency.
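The channel-weighting idea in the abstract can be illustrated with a generic SE-style channel attention applied to template and search features before depth-wise cross-correlation. This is a minimal sketch of the general technique only, not the paper's actual Target Highlight Module; all class names, channel counts, and tensor sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelHighlight(nn.Module):
    """Generic squeeze-and-excitation channel re-weighting (illustrative;
    the paper's Target Highlight Module is not reproduced here)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # global average pool -> (b, c) weights
        return x * w.view(b, c, 1, 1)     # per-channel re-weighting

def xcorr(z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Depth-wise cross-correlation between template z and search region x."""
    b, c, h, w = x.shape
    out = F.conv2d(x.reshape(1, b * c, h, w),
                   z.reshape(b * c, 1, z.shape[2], z.shape[3]),
                   groups=b * c)
    return out.view(b, c, out.shape[2], out.shape[3])

# Re-weight features before correlation instead of treating channels equally.
highlight = ChannelHighlight(channels=256)
template = torch.randn(2, 256, 7, 7)
search = torch.randn(2, 256, 31, 31)
response = xcorr(highlight(template), highlight(search))
print(response.shape)  # torch.Size([2, 256, 25, 25])
```

Channels that the attention weights suppress contribute less to the response map, which is the mechanism by which a channel-attention scheme reduces background sensitivity.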
Related papers
- Multi-attention Associate Prediction Network for Visual Tracking [3.9628431811908533]
Classification-regression prediction networks have achieved impressive success in several modern deep trackers.
There is an inherent difference between classification and regression tasks, so they have diverse, even opposite, demands for feature matching.
We propose a multi-attention associate prediction network (MAPNet) to tackle the above problems.
arXiv Detail & Related papers (2024-03-25T03:18:58Z)
- Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework [76.70603443624012]
We propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling.
In this way, discriminative target-oriented features can be dynamically extracted by mutual guidance.
OSTrack achieves state-of-the-art performance on multiple benchmarks, in particular, it shows impressive results on the one-shot tracking benchmark GOT-10k.
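The one-stream idea (feeding template and search tokens through one shared transformer so that feature learning and relation modeling happen jointly via self-attention) can be sketched as below. The class name, embedding dimension, and depth are illustrative assumptions, not OSTrack's actual configuration.

```python
import torch
import torch.nn as nn

class OneStreamSketch(nn.Module):
    """Illustrative one-stream encoder: template and search tokens are
    concatenated into a single sequence, so every attention layer mixes
    them and the extracted search features become target-aware."""
    def __init__(self, dim: int = 64, depth: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, z_tokens: torch.Tensor,
                x_tokens: torch.Tensor) -> torch.Tensor:
        n_z = z_tokens.shape[1]
        joint = torch.cat([z_tokens, x_tokens], dim=1)  # one token stream
        joint = self.encoder(joint)                     # joint self-attention
        return joint[:, n_z:]                           # target-aware search tokens
```

Contrast this with a two-stream Siamese design, where the backbone extracts template and search features independently and they only interact in a final correlation step.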
arXiv Detail & Related papers (2022-03-22T18:37:11Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- TSG: Target-Selective Gradient Backprop for Probing CNN Visual Saliency [72.9106103283475]
We study the visual saliency, a.k.a. visual explanation, to interpret convolutional neural networks.
Inspired by those observations, we propose a novel visual saliency framework, termed Target-Selective Gradient (TSG) backprop.
The proposed TSG consists of two components, namely, TSG-Conv and TSG-FC, which rectify the gradients for convolutional layers and fully-connected layers, respectively.
arXiv Detail & Related papers (2021-10-11T12:00:20Z)
- MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking [72.65494220685525]
We propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data.
We generate dynamic modality-aware filters with two independent networks. The visible and thermal filters are then used to perform dynamic convolution on their corresponding input feature maps.
To address issues caused by heavy occlusion, fast motion, and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism.
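The dynamic-filter idea can be sketched as a small network that predicts per-sample convolution kernels from the input's global statistics and then applies them depth-wise. This is a generic dynamic-convolution sketch under assumed names and sizes, not MFGNet's actual filter-generation module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterConv(nn.Module):
    """Illustrative dynamic convolution: a linear generator predicts a
    3x3 depth-wise kernel per sample and channel, conditioned on the
    input's global average statistics."""
    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.channels, self.k = channels, k
        self.gen = nn.Linear(channels, channels * k * k)  # filter generator

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        filt = self.gen(x.mean(dim=(2, 3)))               # (b, c*k*k)
        filt = filt.view(b * c, 1, self.k, self.k)
        out = F.conv2d(x.reshape(1, b * c, h, w), filt,
                       padding=self.k // 2,
                       groups=b * c)                      # per-sample depth-wise conv
        return out.view(b, c, h, w)
```

In an RGB-T setting one would instantiate two such generators, one per modality, so the filters adapt to visible and thermal inputs independently.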
arXiv Detail & Related papers (2021-07-22T03:10:51Z)
- SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking [47.647615772027606]
We propose a novel siamese tracking algorithm called SiamRCR, addressing this problem with a simple, light and effective solution.
It builds reciprocal links between classification and regression branches, which can dynamically re-weight their losses for each positive sample.
In addition, we add a localization branch to predict the localization accuracy, so that it can work as the replacement of the regression assistance link during inference.
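The reciprocal re-weighting idea (each branch's per-sample loss is scaled by the other branch's prediction quality, linking classification and regression during training) can be sketched as follows. The function signature, tensor names, and the IoU-based regression surrogate are illustrative assumptions, not SiamRCR's actual formulation.

```python
import torch
import torch.nn.functional as F

def reciprocal_loss(cls_logits: torch.Tensor, cls_targets: torch.Tensor,
                    pred_iou: torch.Tensor, cls_conf: torch.Tensor) -> torch.Tensor:
    """Sketch of reciprocal loss re-weighting for positive samples:
    classification loss is scaled by predicted localization quality,
    and regression loss by classification confidence."""
    cls_loss = F.binary_cross_entropy_with_logits(
        cls_logits, cls_targets, reduction="none")
    reg_loss = 1.0 - pred_iou                    # per-sample IoU-loss surrogate
    # Detach the weights so each branch guides, but does not back-prop
    # through, the other branch's loss term.
    weighted_cls = (pred_iou.detach() * cls_loss).mean()
    weighted_reg = (cls_conf.detach() * reg_loss).mean()
    return weighted_cls + weighted_reg
```

With this coupling, samples that localize well get higher classification weight and vice versa, which pushes the two branches toward agreeing on the same high-quality predictions.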
arXiv Detail & Related papers (2021-05-24T12:21:25Z)
- DCF-ASN: Coarse-to-fine Real-time Visual Tracking via Discriminative Correlation Filter and Attentional Siamese Network [9.01402976480327]
Discriminative correlation filters (DCF) and siamese networks have achieved promising performance on visual tracking tasks.
We propose a coarse-to-fine tracking framework, which roughly infers the target state via an online-updating DCF module.
The proposed DCF-ASN achieves the state-of-the-art performance while exhibiting good tracking efficiency.
arXiv Detail & Related papers (2021-03-19T03:01:21Z)
- Multiple Convolutional Features in Siamese Networks for Object Tracking [13.850110645060116]
Multiple Features-Siamese Tracker (MFST) is a novel tracking algorithm exploiting several hierarchical feature maps for robust tracking.
MFST achieves high tracking accuracy while outperforming the standard siamese tracker on object tracking benchmarks.
arXiv Detail & Related papers (2021-03-01T08:02:27Z)
- Graph Attention Tracking [76.19829750144564]
We propose a simple target-aware Siamese graph attention network for general object tracking.
Experiments on challenging benchmarks including GOT-10k, UAV123, OTB-100 and LaSOT demonstrate that the proposed SiamGAT outperforms many state-of-the-art trackers.
arXiv Detail & Related papers (2020-11-23T04:26:45Z)
- DiResNet: Direction-aware Residual Network for Road Extraction in VHR Remote Sensing Images [12.081877372552606]
We present a direction-aware residual network (DiResNet) that includes three main contributions.
The proposed method has advantages in both overall accuracy and F1-score.
arXiv Detail & Related papers (2020-05-14T19:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.