Pyramid Correlation based Deep Hough Voting for Visual Object Tracking
- URL: http://arxiv.org/abs/2110.07994v1
- Date: Fri, 15 Oct 2021 10:37:00 GMT
- Title: Pyramid Correlation based Deep Hough Voting for Visual Object Tracking
- Authors: Ying Wang and Tingfa Xu and Jianan Li and Shenwang Jiang and Junjie Chen
- Abstract summary: We introduce a voting-based, classification-only tracking algorithm named Pyramid Correlation based Deep Hough Voting (PCDHV).
Specifically, we construct a Pyramid Correlation module to equip the embedded feature with fine-grained local structures and global spatial contexts.
The Deep Hough Voting module then takes over, integrating long-range dependencies of pixels to perceive corners.
- Score: 16.080776515556686
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing Siamese-based trackers treat the tracking problem as a
parallel task of classification and regression. However, some studies show that
the sibling head structure can lead to suboptimal solutions during network
training. Through experiments we find that, without regression, the
performance can be equally promising as long as the network is carefully
designed to suit the training objective. We introduce a novel voting-based,
classification-only tracking algorithm named Pyramid Correlation based Deep
Hough Voting (PCDHV), which jointly locates the top-left and bottom-right
corners of the target. Specifically, we construct a Pyramid
Correlation module to equip the embedded feature with fine-grained local
structures and global spatial contexts. The Deep Hough
Voting module then takes over, integrating long-range dependencies of pixels
to perceive corners. In addition, the prevalent discretization gap is
alleviated, simply yet effectively, by increasing the spatial resolution of the
feature maps while exploiting channel-space relationships. The algorithm is general,
robust and simple. We demonstrate the effectiveness of the module through a
series of ablation experiments. Without bells and whistles, our tracker
achieves performance better than or comparable to the SOTA algorithms on three
challenging benchmarks (TrackingNet, GOT-10k and LaSOT) while running at a
real-time speed of 80 FPS. Code and models will be released.
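To make the corner-based localization and the discretization gap mentioned in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the Pyramid Correlation and Deep Hough Voting modules are not reproduced, a plain argmax stands in for corner perception, and every name (`decode_corners`, `make_heatmap`, `max_corner_error`), the cell-centre convention, and the box and stride values are hypothetical choices made for illustration. It only shows that a box can be read from two corner heatmaps, and that a finer feature map (smaller stride) bounds the quantization error more tightly.

```python
import numpy as np


def decode_corners(tl_heatmap, br_heatmap, stride):
    """Take the peak of each corner heatmap and map it back to image pixels.

    Assumption: cell i covers [i * stride, (i + 1) * stride) and is
    represented by its centre, (i + 0.5) * stride.
    """
    tl_y, tl_x = np.unravel_index(np.argmax(tl_heatmap), tl_heatmap.shape)
    br_y, br_x = np.unravel_index(np.argmax(br_heatmap), br_heatmap.shape)
    return tuple(float((v + 0.5) * stride) for v in (tl_x, tl_y, br_x, br_y))


def make_heatmap(size, x, y, stride):
    """Hypothetical ideal classifier output: one-hot at the cell holding (x, y)."""
    cells = size // stride
    heatmap = np.zeros((cells, cells), dtype=np.float32)
    heatmap[int(y // stride), int(x // stride)] = 1.0
    return heatmap


def max_corner_error(true_box, stride, size=256):
    """Largest per-coordinate error caused purely by the grid quantization."""
    x1, y1, x2, y2 = true_box
    tl = make_heatmap(size, x1, y1, stride)
    br = make_heatmap(size, x2, y2, stride)
    predicted = decode_corners(tl, br, stride)
    return max(abs(p - t) for p, t in zip(predicted, true_box))


# Illustrative box (x1, y1, x2, y2) in a 256 x 256 search image.
true_box = (37.0, 52.0, 181.0, 203.0)
coarse_error = max_corner_error(true_box, stride=16)  # 16 x 16 feature map
fine_error = max_corner_error(true_box, stride=4)     # 64 x 64 feature map
print(coarse_error, fine_error)
```

Under the cell-centre convention assumed here, the error per coordinate can never exceed half the stride, so quartering the stride quarters the worst case. This is only the geometric part of the argument; how the paper raises resolution while exploiting channel-space relationships is not shown.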
Related papers
- Hierarchical IoU Tracking based on Interval [21.555469501789577]
Multi-Object Tracking (MOT) aims to detect and associate all targets of given classes across frames.
We propose the Hierarchical IoU Tracking framework, dubbed HIT, which achieves unified hierarchical tracking by utilizing tracklet intervals as priors.
Our method achieves promising performance on four datasets, i.e., MOT17, KITTI, DanceTrack and VisDrone.
arXiv Detail & Related papers (2024-06-19T07:03:18Z)
- CBAGAN-RRT: Convolutional Block Attention Generative Adversarial Network for Sampling-Based Path Planning [0.0]
We propose a novel image-based learning algorithm (CBAGAN-RRT) using a Convolutional Block Attention Generative Adversarial Network.
The probability distribution of the paths generated from our GAN model is used to guide the sampling process for the RRT algorithm.
We train and test our network on the dataset generated in prior work (cited as zhang 2021) and demonstrate that our algorithm outperforms the previous state-of-the-art algorithms.
arXiv Detail & Related papers (2023-05-13T20:06:53Z)
- A Bayesian Detect to Track System for Robust Visual Object Tracking and Semi-Supervised Model Learning [1.7268829007643391]
We address problems in a Bayesian tracking and detection framework parameterized by neural network outputs.
We propose a particle filter-based approximate sampling algorithm for tracking object state estimation.
Based on our particle filter inference algorithm, a semi-supervised learning algorithm is used to learn the tracking network on intermittently labeled frames.
arXiv Detail & Related papers (2022-05-05T00:18:57Z)
- Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z)
- Video-based Person Re-identification without Bells and Whistles [49.51670583977911]
Video-based person re-identification (Re-ID) aims at matching the video tracklets with cropped video frames for identifying the pedestrians under different cameras.
There exists severe spatial and temporal misalignment for those cropped tracklets due to the imperfect detection and tracking results generated with obsolete methods.
We present a simple re-Detect and Link (DL) module which effectively reduces this unexpected noise by applying deep learning-based detection and tracking to the cropped tracklets.
arXiv Detail & Related papers (2021-05-22T10:17:38Z)
- Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters [2.3526458707956643]
This paper presents a novel deep learning tracking algorithm.
We exploit the generalization ability of deep features to coarsely estimate target translation.
Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object.
arXiv Detail & Related papers (2020-12-23T16:43:21Z)
- Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking [79.80401607146987]
Existing object trackers usually learn a bounding-box based template to match visual targets across frames, which cannot accurately capture a pixel-wise representation.
This paper presents a novel segmentation-based tracking architecture, which is equipped with a local-temporal memory network to learn accurate-temporal correspondence.
arXiv Detail & Related papers (2020-09-21T08:12:02Z)
- Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)
- Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.