Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey
- URL: http://arxiv.org/abs/2205.10766v2
- Date: Tue, 12 Mar 2024 16:29:18 GMT
- Title: Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey
- Authors: Gaoang Wang, Mingli Song, Jenq-Neng Hwang
- Abstract summary: Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories. Embedding methods play an essential role in object location estimation and temporal identity association in MOT. We first conduct a comprehensive overview with in-depth analysis for embedding methods in MOT from seven different perspectives.
- Score: 71.10448142010422
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-object tracking (MOT) aims to associate target objects across video
frames in order to obtain entire moving trajectories. With the advancement of
deep neural networks and the increasing demand for intelligent video analysis,
MOT has gained significantly increased interest in the computer vision
community. Embedding methods play an essential role in object location
estimation and temporal identity association in MOT. Unlike other computer
vision tasks, such as image classification, object detection,
re-identification, and segmentation, embedding methods in MOT have large
variations, and they have never been systematically analyzed and summarized. In
this survey, we first conduct a comprehensive overview with in-depth analysis
for embedding methods in MOT from seven different perspectives, including
patch-level embedding, single-frame embedding, cross-frame joint embedding,
correlation embedding, sequential embedding, tracklet embedding, and
cross-track relational embedding. We further summarize the existing widely used
MOT datasets and analyze the advantages of existing state-of-the-art methods
according to their embedding strategies. Finally, some critical yet
under-investigated areas and future research directions are discussed.
Related papers
- VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking [61.56592503861093]
This issue amalgamates the complexities of open-vocabulary object detection (OVD) and multi-object tracking (MOT).
Existing approaches to OVMOT often merge OVD and MOT methodologies as separate modules, predominantly focusing on the problem through an image-centric lens.
We propose VOVTrack, a novel method that integrates object states relevant to MOT and video-centric training to address this challenge from a video object tracking standpoint.
arXiv Detail & Related papers (2024-10-11T05:01:49Z)
- Transformer Network for Multi-Person Tracking and Re-Identification in Unconstrained Environment [0.6798775532273751]
Multi-object tracking (MOT) has profound applications in a variety of fields, including surveillance, sports analytics, self-driving, and cooperative robotics.
We put forward an integrated MOT method that marries object detection and identity linkage within a singular, end-to-end trainable framework.
Our system leverages a robust memory-temporal memory module that retains extensive historical observations and effectively encodes them using an attention-based aggregator.
arXiv Detail & Related papers (2023-12-19T08:15:22Z)
- 3D Multiple Object Tracking on Autonomous Driving: A Literature Review [25.568952977339]
3D multi-object tracking (3D MOT) stands as a pivotal domain within autonomous driving.
Despite its paramount significance, 3D MOT confronts a myriad of formidable challenges.
arXiv Detail & Related papers (2023-09-27T05:32:26Z)
- UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance [6.577227592760559]
UnsMOT is a novel framework that combines appearance and motion features of objects with geometric information to provide more accurate tracking.
Experimental results show remarkable performance in terms of HOTA, IDF1, and MOTA metrics in comparison with state-of-the-art methods.
arXiv Detail & Related papers (2023-09-03T04:58:12Z)
- Unifying Tracking and Image-Video Object Detection [54.91658924277527]
TrIVD (Tracking and Image-Video Detection) is the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
To handle the discrepancies and semantic overlaps of category labels, TrIVD formulates detection/tracking as grounding and reasons about object categories.
arXiv Detail & Related papers (2022-11-20T20:30:28Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview [8.442460766094674]
Object pose detection and tracking has attracted increasing attention due to its wide applications in many areas, such as autonomous driving, robotics, and augmented reality.
Among the available approaches, deep learning is the most promising and has shown better performance than the others.
This paper presents a comprehensive review of recent progress in object pose detection and tracking that belongs to the deep learning technical route.
arXiv Detail & Related papers (2021-05-29T12:59:29Z)
- RGB-D Railway Platform Monitoring and Scene Understanding for Enhanced Passenger Safety [3.4298729855744026]
This paper proposes a flexible analysis scheme to detect and track humans on a ground plane.
We consider multiple combinations within a set of RGB- and depth-based detection and tracking modalities.
Results indicate that the combined use of depth-based spatial information and learned representations yields substantially enhanced detection and tracking accuracies.
arXiv Detail & Related papers (2021-02-23T14:44:34Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
- Visual Tracking by TridentAlign and Context Embedding [71.60159881028432]
We propose novel TridentAlign and context embedding modules for Siamese network-based visual tracking methods.
The performance of the proposed tracker is comparable to that of state-of-the-art trackers, while the proposed tracker runs at real-time speed.
arXiv Detail & Related papers (2020-07-14T08:00:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.