Updatable Siamese Tracker with Two-stage One-shot Learning
- URL: http://arxiv.org/abs/2104.15049v1
- Date: Fri, 30 Apr 2021 15:18:41 GMT
- Title: Updatable Siamese Tracker with Two-stage One-shot Learning
- Authors: Xinglong Sun, Guangliang Han, Lihong Guo, Tingfa Xu, Jianan Li, Peixun Liu
- Abstract summary: Offline Siamese networks have achieved very promising tracking performance, especially in accuracy and efficiency.
Traditional updaters struggle to handle the irregular variations and sampling noise of objects, so adopting them to update Siamese networks is quite risky.
In this paper, we first present a two-stage one-shot learner, which can predict the local parameters of the primary classifier from object samples at diverse stages.
Then, an updatable Siamese network (SiamTOL) is proposed based on the learner, which can perform online updates by itself.
- Score: 10.13621503834501
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline Siamese networks have achieved very promising tracking performance,
especially in accuracy and efficiency. However, they often fail to track an
object in complex scenes because they cannot update online. Traditional
updaters struggle to handle the irregular variations and sampling noise of
objects, so adopting them to update Siamese networks is quite risky. In this
paper, we first present a two-stage one-shot learner, which can predict the
local parameters of the primary classifier from object samples at diverse
stages. Then, an updatable Siamese network (SiamTOL) is proposed based on the
learner, which can perform online updates by itself. Concretely, we introduce
an extra input branch to sequentially capture the latest object features, and
design a residual module to update the initial exemplar using these features.
Besides, an effective multi-aspect training loss is designed for our network
to avoid overfitting. Extensive experimental results on several popular
benchmarks, including OTB100, VOT2018, VOT2019, LaSOT, UAV123 and GOT10k,
show that the proposed tracker achieves leading performance and outperforms
other state-of-the-art methods.
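The exemplar-update idea in the abstract (a residual module that refreshes the initial exemplar with the latest object features) can be illustrated with a minimal sketch. The function name, feature shapes, and the blending factor `alpha` below are illustrative assumptions, not the authors' actual residual module:

```python
import numpy as np

def residual_update(initial_exemplar, latest_features, alpha=0.1):
    """Blend the initial exemplar with a residual computed from the
    latest object features. alpha controls update strength (assumed)."""
    residual = latest_features - initial_exemplar
    return initial_exemplar + alpha * residual

# stand-ins for template features captured by the two input branches
z0 = np.ones((4, 4))        # initial exemplar features
zt = np.full((4, 4), 2.0)   # latest object features
z_new = residual_update(z0, zt, alpha=0.1)
```

Keeping `alpha` small preserves the reliable initial exemplar while still drifting toward recent appearance, which is the intuition behind residual-style updates.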
Related papers
- Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning [12.5354658533836]
Humans possess a remarkable ability to accurately classify new, unseen images after exposure to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images from limited samples is challenging.
We propose an intra-task mutual attention method for few-shot learning that splits the support and query samples into patches.
arXiv Detail & Related papers (2024-05-06T02:02:57Z) - Correlation-Embedded Transformer Tracking: A Single-Branch Framework [69.0798277313574]
We propose a novel single-branch tracking framework inspired by the transformer.
Unlike the Siamese-like feature extraction, our tracker deeply embeds cross-image feature correlation in multiple layers of the feature network.
The output features can be directly used for predicting target locations without additional correlation steps.
arXiv Detail & Related papers (2024-01-23T13:20:57Z) - Improving Siamese Based Trackers with Light or No Training through Multiple Templates and Temporal Network [0.0]
We propose a framework with two ideas for Siamese-based trackers:
(i) extending the number of templates in a way that removes the need to retrain the network;
(ii) a lightweight temporal network with a novel architecture that attends to both local and global information.
arXiv Detail & Related papers (2022-11-24T22:07:33Z) - Self-Promoted Supervision for Few-Shot Transformer [178.52948452353834]
Self-promoted sUpervisioN (SUN) is a few-shot learning framework for vision transformers (ViTs).
SUN pretrains the ViT on the few-shot learning dataset and then uses it to generate individual location-specific supervision to guide each patch token.
Experiments show that SUN with ViTs significantly surpasses other few-shot learning frameworks using ViTs and is the first to achieve higher performance than CNN-based state-of-the-art methods.
arXiv Detail & Related papers (2022-03-14T12:53:27Z) - Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking [58.14267480293575]
We propose a simple yet effective online learning approach for few-shot online adaptation without requiring offline training.
It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before.
We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP.
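As a rough illustration of recursive least-squares-style online adaptation (a generic textbook RLS update, not the paper's RT-MDNet/DiMP integration), a weight update with a forgetting factor looks like:

```python
import numpy as np

def rls_step(w, P, x, y, lam=0.99):
    """One recursive least-squares update with forgetting factor lam.
    w: weight vector, P: inverse correlation matrix,
    x: feature vector, y: scalar target."""
    Px = P @ x
    k = Px / (lam + x @ Px)           # gain vector
    e = y - w @ x                     # prediction error
    w = w + k * e                     # weight correction
    P = (P - np.outer(k, Px)) / lam   # covariance update
    return w, P

# toy usage: learn y = 2 * x[0] from a noiseless stream
w = np.zeros(2)
P = np.eye(2) * 100.0
for t in range(200):
    x = np.array([np.sin(0.1 * t), 1.0])
    y = 2.0 * x[0]
    w, P = rls_step(w, P, x, y)
```

The forgetting factor `lam < 1` down-weights old samples, which is what gives such estimators a built-in memory-retention/decay trade-off for tracking.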
arXiv Detail & Related papers (2021-12-28T06:51:18Z) - Self-Supervised Learning for Binary Networks by Joint Classifier Training [11.612308609123566]
We propose a self-supervised learning method for binary networks.
For better training of the binary network, we propose a feature similarity loss, a dynamic balancing scheme of loss terms, and modified multi-stage training.
Our empirical validations show that BSSL outperforms self-supervised learning baselines for binary networks in various downstream tasks and outperforms supervised pretraining in certain tasks.
arXiv Detail & Related papers (2021-10-17T15:38:39Z) - Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z) - Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination [202.2562153608092]
We propose a cascaded regression tracker with two sequential stages.
In the first stage, we filter out a large number of easily identified negative candidates.
In the second stage, a discrete-sampling-based ridge regression is designed to double-check the remaining ambiguous hard samples.
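The second-stage ridge regression mentioned above has a simple closed form. This toy sketch uses hypothetical candidate features and labels, not the paper's discrete-sampling variant, to show how remaining hard candidates could be re-scored:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# hypothetical hard-candidate features and labels
# (1.0 = target-like, 0.0 = distractor)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 0.0, 1.0])
w = ridge_fit(X, y, lam=0.1)
scores = X @ w   # higher score = more target-like
```

Ridge regression is popular for this role because the regularized closed form is cheap enough to re-solve online as new candidates arrive.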
arXiv Detail & Related papers (2020-06-18T07:48:01Z) - AFAT: Adaptive Failure-Aware Tracker for Robust Visual Object Tracking [46.82222972389531]
Siamese approaches have recently achieved promising performance in visual object tracking.
The Siamese paradigm models online tracking as a one-shot learning task, which impedes online adaptation during tracking.
We propose a failure-aware system, based on convolutional and LSTM modules in the decision stage, that enables online reporting of potential tracking failures.
arXiv Detail & Related papers (2020-05-27T23:21:12Z) - Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve tracking performance, we adopt the "wider" residual network ResNeXt as the feature-extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.