Blending of Learning-based Tracking and Object Detection for Monocular
Camera-based Target Following
- URL: http://arxiv.org/abs/2008.09644v1
- Date: Fri, 21 Aug 2020 18:44:35 GMT
- Title: Blending of Learning-based Tracking and Object Detection for Monocular
Camera-based Target Following
- Authors: Pranoy Panda, Martin Barczyk
- Abstract summary: We present a real-time approach which fuses a generic target tracker and object detection module with a target re-identification module.
Our work focuses on improving the performance of Convolutional Recurrent Neural Network-based object trackers in cases where the object of interest belongs to the category of \emph{familiar} objects.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has recently started being applied to visual tracking of
generic objects in video streams. For the purposes of robotics applications, it
is very important for a target tracker to recover its track if it is lost due
to heavy or prolonged occlusions or motion blur of the target. We present a
real-time approach which fuses a generic target tracker and object detection
module with a target re-identification module. Our work focuses on improving
the performance of Convolutional Recurrent Neural Network-based object trackers
in cases where the object of interest belongs to the category of
\emph{familiar} objects. Our proposed approach is sufficiently lightweight to
track objects at 85-90 FPS while attaining competitive results on challenging
benchmarks.
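The abstract describes a fusion of three components: a fast learned tracker that runs every frame, plus a detector and a re-identification module that take over when the track is lost. A minimal control-flow sketch of that blending idea is below; the class and method names (`update`, `detect`, `best_match`, `reinit`) are illustrative placeholders, not the authors' actual interfaces.

```python
# Hedged sketch of the tracker/detector/re-ID blending loop: trust the
# tracker while its confidence is high; on a confidence drop (occlusion,
# motion blur), fall back to detection plus re-identification to recover.

def follow_target(frames, tracker, detector, reid, conf_min=0.5):
    """Yield the target's box per frame, or None while the target is lost."""
    lost = False
    for frame in frames:
        if not lost:
            box, conf = tracker.update(frame)
            if conf >= conf_min:
                yield box
                continue
            lost = True                    # track lost: switch to recovery
        # recovery path: detect familiar-class candidates, then pick the
        # one matching the stored target appearance
        candidates = detector.detect(frame)
        box = reid.best_match(frame, candidates)
        if box is not None:
            tracker.reinit(frame, box)     # hand the track back to the tracker
            lost = False
        yield box
```

Because the heavy detector and re-ID modules only run on the recovery path, the per-frame cost stays close to that of the lightweight tracker, which is consistent with the reported 85-90 FPS.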
Related papers
- Leveraging Object Priors for Point Tracking [25.030407197192]
Point tracking is a fundamental problem in computer vision with numerous applications in AR and robotics.
We propose a novel objectness regularization approach that guides points to be aware of object priors.
Our approach achieves state-of-the-art performance on three point tracking benchmarks.
arXiv Detail & Related papers (2024-09-09T16:48:42Z)
- Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors [16.84474849409625]
We propose a framework for consistently producing high-quality object tracks.
The key idea is to tailor a module for each dataset to intelligently decide when an object tracker is failing.
Our approach leverages self-supervised learning on unlabeled videos to learn a tailored representation for a target object.
arXiv Detail & Related papers (2024-05-06T17:06:32Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection.
We first observe that experienced human annotators annotate objects from a track-centric perspective.
We propose a high-performance offline detector in a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
- MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking [56.92165669843006]
We propose MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range.
For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.
For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection.
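The Refind Module described above links an interrupted trajectory back to a new detection using long-term motion learned from the target's history. A much-simplified sketch of that idea follows; the constant-velocity extrapolation and the fixed distance gate are our own stand-ins for the learned module, not MotionTrack's actual design.

```python
# Hedged sketch of the "refind" step: extrapolate a lost track's motion
# from its history across the occlusion gap, then relink it to the
# nearest new detection within a distance gate.

def refind(history, gap_frames, detections, gate=50.0):
    """history: list of (x, y) centers; returns index of the linked
    detection, or None if no detection falls inside the gate."""
    if len(history) < 2:
        return None                        # not enough history to extrapolate
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0              # per-frame velocity estimate
    px = x1 + vx * gap_frames              # extrapolated center after the gap
    py = y1 + vy * gap_frames
    best, best_dist = None, gate
    for j, (dx, dy) in enumerate(detections):
        dist = ((dx - px) ** 2 + (dy - py) ** 2) ** 0.5
        if dist < best_dist:
            best, best_dist = j, dist
    return best
```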
arXiv Detail & Related papers (2023-03-18T12:38:33Z)
- Real-time Multi-Object Tracking Based on Bi-directional Matching [0.0]
This study offers a bi-directional matching algorithm for multi-object tracking.
A stranded area is used in the matching algorithm to temporarily store the objects that fail to be tracked.
In the MOT17 challenge, the proposed algorithm achieves 63.4% MOTA, 55.3% IDF1, and 20.1 FPS tracking speed.
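The "stranded area" idea above can be sketched as a second pool in the association step: tracks that fail to match a detection are parked there instead of being deleted, and stranded tracks that match again are revived. The greedy IoU association below is our simplification of the paper's bi-directional matching, and all names are hypothetical.

```python
# Hedged sketch of matching with a stranded pool for temporarily lost tracks.

def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_step(active, stranded, detections, iou_min=0.3):
    """One association round. active/stranded map track id -> box.
    Returns (next_active, next_stranded, unmatched detection indices)."""
    unmatched = set(range(len(detections)))
    next_active, next_stranded = {}, {}
    for pool in (active, stranded):            # active tracks get first pick
        for tid, box in pool.items():
            best, best_iou = None, iou_min
            for j in unmatched:
                score = iou(box, detections[j])
                if score > best_iou:
                    best, best_iou = j, score
            if best is not None:
                next_active[tid] = detections[best]   # matched: (re)activate
                unmatched.discard(best)
            else:
                next_stranded[tid] = box              # lost: park, don't delete
    return next_active, next_stranded, sorted(unmatched)
```

A real implementation would also age out stranded tracks after some number of frames; that bookkeeping is omitted here for brevity.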
arXiv Detail & Related papers (2023-03-15T08:38:08Z)
- Learning to Track Object Position through Occlusion [32.458623495840904]
Occlusion is one of the most significant challenges encountered by object detectors and trackers.
We propose a tracking-by-detection approach that builds upon the success of region based video object detectors.
Our approach achieves superior results on a dataset of furniture assembly videos collected from the internet.
arXiv Detail & Related papers (2021-06-20T22:29:46Z)
- Learning Target Candidate Association to Keep Track of What Not to Track [100.80610986625693]
We propose to keep track of distractor objects in order to continue tracking the target.
To tackle the problem of lacking ground-truth correspondences between distractor objects in visual tracking, we propose a training strategy that combines partial annotations with self-supervision.
Our tracker sets a new state-of-the-art on six benchmarks, achieving an AUC score of 67.2% on LaSOT and a +6.1% absolute gain on the OxUvA long-term dataset.
arXiv Detail & Related papers (2021-03-30T17:58:02Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on KITTI, and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We also build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.