DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate
Multi-Camera Multiple Object Tracking
- URL: http://arxiv.org/abs/2106.06856v1
- Date: Sat, 12 Jun 2021 20:22:30 GMT
- Title: DyGLIP: A Dynamic Graph Model with Link Prediction for Accurate
Multi-Camera Multiple Object Tracking
- Authors: Kha Gia Quach, Pha Nguyen, Huu Le, Thanh-Dat Truong, Chi Nhan Duong,
Minh-Triet Tran, Khoa Luu
- Abstract summary: Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer vision problem due to its emerging applicability in several real-world applications.
This work proposes a new Dynamic Graph Model with Link Prediction approach to solve the data association task.
Experimental results show that we outperform existing MC-MOT algorithms by a large margin on several practical datasets.
- Score: 25.98400206361454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer
vision problem due to its emerging applicability in several real-world
applications. Despite a large number of existing works, solving the data
association problem in any MC-MOT pipeline is arguably one of the most
challenging tasks. Developing a robust MC-MOT system, however, is still highly
challenging due to many practical issues such as inconsistent lighting
conditions, varying object movement patterns, or the trajectory occlusions of
the objects between the cameras. To address these problems, this work,
therefore, proposes a new Dynamic Graph Model with Link Prediction (DyGLIP)
approach to solve the data association task. Compared to existing methods, our
new model offers several advantages, including better feature representations
and the ability to recover from lost tracks during camera transitions.
Moreover, our model works gracefully regardless of the overlapping ratios
between the cameras. Experimental results show that we outperform existing
MC-MOT algorithms by a large margin on several practical datasets. Notably, our
model works favorably on online settings but can be extended to an incremental
approach for large-scale datasets.
Related papers
- GMT: A Robust Global Association Model for Multi-Target Multi-Camera Tracking [13.305411087116635]
We propose a global online MTMC tracking model that addresses the dependency on the first tracking stage in two-step methods and enhances cross-camera matching.
Specifically, we propose a transformer-based global MTMC association module to explore target associations across different cameras and frames.
To accommodate high scene diversity and complex lighting condition variations, we have established the VisionTrack dataset.
arXiv Detail & Related papers (2024-07-01T06:39:14Z)
- Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion [18.138433117711177]
We propose a novel multimodal hybrid tracker (MMHT) that utilizes frame-event-based data for reliable single object tracking.
The MMHT model employs a hybrid backbone consisting of an artificial neural network (ANN) and a spiking neural network (SNN) to extract dominant features from different visual modalities.
Extensive experiments demonstrate that the MMHT model exhibits competitive performance in comparison with other state-of-the-art methods.
arXiv Detail & Related papers (2024-05-28T07:24:56Z)
- MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark [63.878793340338035]
Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras.
Existing datasets for this task are either synthetically generated or artificially constructed within a controlled camera network setting.
We present MTMMC, a real-world, large-scale dataset that includes long video sequences captured by 16 multi-modal cameras in two different environments.
arXiv Detail & Related papers (2024-03-29T15:08:37Z)
- Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking [61.69892497726235]
Composite Node Message Passing Network (CoNo-Link) is a framework for modeling ultra-long frame information for association.
In addition to the previous method of treating objects as nodes, the network innovatively treats object trajectories as nodes for information interaction.
Our model can learn better predictions on longer-time scales by adding composite nodes.
arXiv Detail & Related papers (2023-12-14T14:00:30Z)
- TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models [75.20168902300166]
We propose TrackDiffusion, a novel video generation framework affording fine-grained trajectory-conditioned motion control.
A pivotal component of TrackDiffusion is the instance enhancer, which explicitly ensures inter-frame consistency of multiple objects.
Video sequences generated by TrackDiffusion can be used as training data for visual perception models.
arXiv Detail & Related papers (2023-12-01T15:24:38Z)
- ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking [11.619493960418176]
Multi-Camera Multi-Object Tracking (MC-MOT) utilizes information from multiple views to better handle problems with occlusion and crowded scenes.
Current graph-based methods do not effectively utilize information regarding spatial and temporal consistency.
We propose a novel reconfigurable graph model that first associates all detected objects across cameras spatially before reconfiguring it into a temporal graph.
arXiv Detail & Related papers (2023-08-25T08:02:04Z)
- An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z)
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT).
It takes both 2D panorama images and 3D point clouds as input and then infers target trajectories using the multimodality data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
- Multi-object Monocular SLAM for Dynamic Environments [12.537311048732017]
The term multibody implies that we track the motion of the camera as well as that of other dynamic participants in the scene.
Existing approaches solve restricted variants of the problem, but the solutions suffer from relative scale ambiguity.
We propose a multi pose-graph optimization formulation to resolve the relative and absolute scale factor ambiguities involved.
arXiv Detail & Related papers (2020-02-10T03:49:16Z)