MFST: Multi-Features Siamese Tracker
- URL: http://arxiv.org/abs/2103.00810v1
- Date: Mon, 1 Mar 2021 07:18:32 GMT
- Title: MFST: Multi-Features Siamese Tracker
- Authors: Zhenxi Li, Guillaume-Alexandre Bilodeau, Wassim Bouachir
- Abstract summary: Multi-Features Siamese Tracker (MFST) is a novel tracking algorithm exploiting several hierarchical feature maps for robust deep similarity tracking.
MFST achieves high tracking accuracy, while outperforming several state-of-the-art trackers, including standard Siamese trackers.
- Score: 13.850110645060116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Siamese trackers have recently achieved interesting results due to their
balance between accuracy and speed. This success is mainly due to the fact that
deep similarity networks were specifically designed to address the image
similarity problem. Therefore, they are inherently more appropriate than
classical CNNs for the tracking task. However, Siamese trackers rely on the
last convolutional layers for similarity analysis and target search, which
restricts their performance. In this paper, we argue that using a single
convolutional layer as feature representation is not the optimal choice within
the deep similarity framework, as multiple convolutional layers provide several
abstraction levels in characterizing an object. Starting from this motivation,
we present the Multi-Features Siamese Tracker (MFST), a novel tracking
algorithm exploiting several hierarchical feature maps for robust deep
similarity tracking. MFST proceeds by fusing hierarchical features to ensure a
richer and more efficient representation. Moreover, we handle appearance
variation by calibrating deep features extracted from two different CNN models.
Based on this advanced feature representation, our algorithm achieves high
tracking accuracy, while outperforming several state-of-the-art trackers,
including standard Siamese trackers. The code and trained models are available
at https://github.com/zhenxili96/MFST.
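As a rough illustration of the approach described above, the following is a minimal PyTorch-style sketch of multi-level Siamese matching, assuming a single AlexNet-like backbone, hand-picked feature levels, and fixed fusion weights; none of these choices comes from the released MFST code, which additionally calibrates features taken from two different CNN models (see the repository above).
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiLevelSiamese(nn.Module):
    """Sketch of multi-level deep-similarity matching: cross-correlate template and
    search-region features taken from several convolutional layers, then fuse the
    per-level response maps into a single score map."""

    def __init__(self, fusion_weights=(0.2, 0.3, 0.5)):
        super().__init__()
        # Illustrative AlexNet-like backbone; the actual MFST combines features
        # drawn from two different pretrained CNNs.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 64, 11, stride=4, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
            nn.Conv2d(64, 192, 5, padding=2), nn.ReLU(),
        )
        self.stage2 = nn.Sequential(
            nn.MaxPool2d(3, 2),
            nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
        )
        self.stage3 = nn.Sequential(
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
        )
        self.fusion_weights = fusion_weights

    def extract(self, x):
        """Return hierarchical feature maps (low-, mid-, high-level)."""
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        return [f1, f2, f3]

    @staticmethod
    def xcorr(z, x):
        """Cross-correlate the search features x with the template features z."""
        b, c, hz, wz = z.shape
        x = x.reshape(1, b * c, x.shape[2], x.shape[3])
        out = F.conv2d(x, z.reshape(b * c, 1, hz, wz), groups=b * c)
        # Sum over channels to obtain one similarity map per sample.
        return out.reshape(b, c, out.shape[2], out.shape[3]).sum(dim=1, keepdim=True)

    def forward(self, template, search):
        responses = [self.xcorr(z, x)
                     for z, x in zip(self.extract(template), self.extract(search))]
        # Resize every response map to the finest resolution and fuse with fixed weights.
        size = responses[0].shape[-2:]
        responses = [F.interpolate(r, size=size, mode="bilinear", align_corners=False)
                     for r in responses]
        return sum(w * r for w, r in zip(self.fusion_weights, responses))


# Hypothetical usage with SiamFC-style crop sizes (127x127 template, 255x255 search).
model = MultiLevelSiamese().eval()
with torch.no_grad():
    score_map = model(torch.rand(1, 3, 127, 127), torch.rand(1, 3, 255, 255))
print(score_map.shape)  # the peak of the fused map indicates the target location
```
At tracking time, the peak of the fused score map over a search region centered on the previous target position gives the new location estimate.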
Related papers
- Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z)
- Multi-attention Associate Prediction Network for Visual Tracking [3.9628431811908533]
Classification-regression prediction networks have achieved impressive success in several modern deep trackers.
There is an inherent difference between the classification and regression tasks, so they place diverse, even opposite, demands on feature matching.
We propose a multi-attention associate prediction network (MAPNet) to tackle the above problems.
arXiv Detail & Related papers (2024-03-25T03:18:58Z)
- Correlation-Embedded Transformer Tracking: A Single-Branch Framework [69.0798277313574]
We propose a novel single-branch tracking framework inspired by the transformer.
Unlike the Siamese-like feature extraction, our tracker deeply embeds cross-image feature correlation in multiple layers of the feature network.
The output features can be directly used for predicting target locations without additional correlation steps.
arXiv Detail & Related papers (2024-01-23T13:20:57Z)
- Improving Siamese Based Trackers with Light or No Training through Multiple Templates and Temporal Network [0.0]
We propose a framework with two ideas for Siamese-based trackers:
(i) extending the number of templates in a way that removes the need to retrain the network;
(ii) adding a lightweight temporal network with a novel architecture focusing on both local and global information.
arXiv Detail & Related papers (2022-11-24T22:07:33Z)
- Unsupervised Learning of Accurate Siamese Tracking [68.58171095173056]
We present a novel unsupervised tracking framework, in which we learn temporal correspondence on both the classification and regression branches.
Our tracker outperforms preceding unsupervised methods by a substantial margin, performing on par with supervised methods on large-scale datasets such as TrackingNet and LaSOT.
arXiv Detail & Related papers (2022-04-04T13:39:43Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- Multiple Convolutional Features in Siamese Networks for Object Tracking [13.850110645060116]
Multiple Features-Siamese Tracker (MFST) is a novel tracking algorithm exploiting several hierarchical feature maps for robust tracking.
MFST achieves high tracking accuracy, while outperforming the standard Siamese tracker on object tracking benchmarks.
arXiv Detail & Related papers (2021-03-01T08:02:27Z)
- Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters [2.3526458707956643]
This paper presents a novel deep learning tracking algorithm.
We exploit the generalization ability of deep features to coarsely estimate target translation.
Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object (a minimal correlation-filter sketch follows this list).
arXiv Detail & Related papers (2020-12-23T16:43:21Z)
- Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking [79.80401607146987]
Existing object trackers usually learn a bounding-box based template to match visual targets across frames, which cannot accurately learn a pixel-wise representation.
This paper presents a novel segmentation-based tracking architecture, which is equipped with a local-temporal memory network to learn accurate spatio-temporal correspondence.
arXiv Detail & Related papers (2020-09-21T08:12:02Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation into an unordered 2D point cloud representation.
Our method introduces a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
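Relating to the Coarse-to-Fine entry above, which refines a coarse deep-feature estimate with correlation filters, here is a minimal single-channel, MOSSE-style sketch of the correlation-filter localization step; the Gaussian label width, the regularization constant lam, and the synthetic patches are illustrative assumptions rather than that paper's implementation.
```python
import numpy as np


def gaussian_response(shape, sigma=2.0):
    """Desired correlation output: a 2D Gaussian whose peak encodes zero displacement."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2.0 * sigma ** 2))
    return np.fft.fftshift(g)  # move the peak to index (0, 0)


def train_filter(template, lam=1e-2):
    """Closed-form MOSSE filter in the Fourier domain: H* = (G . conj(F)) / (F . conj(F) + lam)."""
    F = np.fft.fft2(template)
    G = np.fft.fft2(gaussian_response(template.shape))
    return (G * np.conj(F)) / (F * np.conj(F) + lam)


def localize(H_conj, search_patch):
    """Correlate the filter with a search patch and return the displacement of the response peak."""
    Z = np.fft.fft2(search_patch)
    response = np.real(np.fft.ifft2(Z * H_conj))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    # Displacements beyond half the patch size wrap around to negative offsets.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx


# Hypothetical usage: the patches stand in for same-sized, preprocessed crops (or a single
# channel of deep features) centered on the previous target position.
rng = np.random.default_rng(0)
template = rng.standard_normal((64, 64))
search = np.roll(template, shift=(3, -5), axis=(0, 1))  # simulate a (3, -5) target shift
H_conj = train_filter(template)
print(localize(H_conj, search))  # expected: approximately (3, -5)
```
With deep features, the same closed-form solve is typically applied per feature channel and the per-channel responses summed before taking the peak.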