Multi-target tracking for video surveillance using deep affinity
network: a brief review
- URL: http://arxiv.org/abs/2110.15674v1
- Date: Fri, 29 Oct 2021 10:44:26 GMT
- Title: Multi-target tracking for video surveillance using deep affinity
network: a brief review
- Authors: Sanam Nisar Mangi
- Abstract summary: Multi-target tracking (MTT) for video surveillance is one of the important and challenging tasks.
Deep learning models are known to function like the human brain.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models are known to function like the human brain. Due to their
functional mechanism, they are frequently utilized to accomplish tasks that
require human intelligence. Multi-target tracking (MTT) for video surveillance
is one of the important and challenging tasks, which has attracted the
researcher's attention due to its potential applications in various domains.
Multi-target tracking tasks require locating the objects individually in each
frame, which remains a huge challenge as there are immediate changes in
appearances and extreme occlusions of objects. In addition to that, the
Multitarget tracking framework requires multiple tasks to perform i.e. target
detection, estimating trajectory, associations between frame, and
re-identification. Various methods have been suggested, and some assumptions
are made to constrain the problem in the context of a particular problem. In
this paper, the state-of-the-art MTT models, which leverage from deep learning
representational power are reviewed.
Related papers
- VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking [61.56592503861093]
This issue amalgamates the complexities of open-vocabulary object detection (OVD) and multi-object tracking (MOT)
Existing approaches to OVMOT often merge OVD and MOT methodologies as separate modules, predominantly focusing on the problem through an image-centric lens.
We propose VOVTrack, a novel method that integrates object states relevant to MOT and video-centric training to address this challenge from a video object tracking standpoint.
arXiv Detail & Related papers (2024-10-11T05:01:49Z) - Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z) - CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking
and Segmentation [31.167405688707575]
We propose a framework for instance-level visual analysis on video frames.
It can simultaneously conduct object detection, instance segmentation, and multi-object tracking.
We evaluate the proposed method extensively on KITTI MOTS and MOTS Challenge datasets.
arXiv Detail & Related papers (2023-11-02T04:32:24Z) - MotionTrack: Learning Robust Short-term and Long-term Motions for
Multi-Object Tracking [56.92165669843006]
We propose MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range.
For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.
For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection.
arXiv Detail & Related papers (2023-03-18T12:38:33Z) - Unifying Tracking and Image-Video Object Detection [54.91658924277527]
TrIVD (Tracking and Image-Video Detection) is the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
To handle the discrepancies and semantic overlaps of category labels, TrIVD formulates detection/tracking as grounding and reasons about object categories.
arXiv Detail & Related papers (2022-11-20T20:30:28Z) - Multiple Object Tracking in Recent Times: A Literature Review [0.0]
Multiple object tracking has become one of the trending problems in computer vision.
Mot is one of the critical vision tasks for different issues like occlusion in crowded scenes, similar appearance, small object detection difficulty, ID switching, etc.
We have studied more than a hundred papers published over the last three years and have tried to extract the techniques that are more focused on by researchers in recent times to solve the problems of MOT.
arXiv Detail & Related papers (2022-09-11T06:12:28Z) - Multi-modal Transformers Excel at Class-agnostic Object Detection [105.10403103027306]
We argue that existing methods lack a top-down supervision signal governed by human-understandable semantics.
We develop an efficient and flexible MViT architecture using multi-scale feature processing and deformable self-attention.
We show the significance of MViT proposals in a diverse range of applications.
arXiv Detail & Related papers (2021-11-22T18:59:29Z) - Multi-object tracking with self-supervised associating network [5.947279761429668]
We propose a novel self-supervised learning method using a lot of short videos which has no human labeling.
Despite the re-identification network is trained in a self-supervised manner, it achieves the state-of-the-art performance of MOTA 62.0% and IDF1 62.6% on the MOT17 test benchmark.
arXiv Detail & Related papers (2020-10-26T08:48:23Z) - Survey on Reliable Deep Learning-Based Person Re-Identification Models:
Are We There Yet? [19.23187114221822]
Person re-identification (PReID) is one of the most critical problems in intelligent video-surveillance (IVS)
Deep neural networks (DNNs) given their compelling performance on similar vision problems and fast execution at test time.
We present descriptions of each model along with their evaluation on a set of benchmark datasets.
arXiv Detail & Related papers (2020-04-30T16:09:16Z) - MOPT: Multi-Object Panoptic Tracking [33.77171216778909]
We introduce a novel perception task denoted as multi-object panoptic tracking (MOPT)
MOPT allows for exploiting pixel-level semantic information of 'thing' and'stuff' classes, temporal coherence, and pixel-level associations over time.
We present extensive quantitative and qualitative evaluations of both vision-based and LiDAR-based MOPT that demonstrate encouraging results.
arXiv Detail & Related papers (2020-04-17T11:45:28Z) - A Unified Object Motion and Affinity Model for Online Multi-Object
Tracking [127.5229859255719]
We propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.