Related papers: Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking

URL: http://arxiv.org/abs/2312.08951v1
Date: Thu, 14 Dec 2023 14:00:30 GMT
Title: Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking
Authors: Yan Gao, Haojun Xu, Nannan Wang, Jie Li, Xinbo Gao
Abstract summary: Composite Node Message Passing Network (CoNo-Link) is a framework for modeling information across ultra-long frame sequences for association. In addition to the previous method of treating objects as nodes, the network treats object trajectories as nodes for information interaction. By adding composite nodes, the model can learn better predictions on longer time scales.
Score: 61.69892497726235
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The global multi-object tracking (MOT) system can consider interaction, occlusion, and other "visual blur" scenarios to ensure effective object tracking in long videos. Among them, graph-based tracking-by-detection paradigms achieve surprising performance. However, their fully-connected nature imposes storage requirements that make long videos challenging to handle. Currently, commonly used methods still generate trajectories by building one-forward associations across frames. Such matches, produced under the guidance of first-order similarity information, may not be optimal from a longer-time perspective. Moreover, they often lack an end-to-end scheme for correcting mismatches. This paper proposes the Composite Node Message Passing Network (CoNo-Link), a multi-scene generalized framework for modeling ultra-long frame information for association. CoNo-Link's solution is a low-storage-overhead method for building constrained connected graphs. In addition to the previous method of treating objects as nodes, the network innovatively treats object trajectories as nodes for information interaction, improving the graph neural network's feature representation capability. Specifically, we formulate the graph-building problem as a top-k selection task for some reliable objects or trajectories. Our model can learn better predictions on longer time scales by adding composite nodes. As a result, our method outperforms the state of the art on several commonly used datasets.
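The top-k formulation mentioned in the abstract can be illustrated with a minimal sketch. This is not CoNo-Link's actual procedure, which the abstract does not specify. It only shows the general idea of keeping, for each node, its k highest-scoring candidate links instead of a fully-connected graph. The function name `build_topk_graph` and the assumption that pairwise similarity scores are already available as a matrix are hypothetical.

```python
import numpy as np


def build_topk_graph(similarity: np.ndarray, k: int) -> list[tuple[int, int, float]]:
    """Keep only the k highest-scoring candidate edges per source node.

    `similarity[i, j]` is an assumed, precomputed score between source node i
    (e.g. a detection or a trajectory) and target node j. This is an
    illustrative stand-in, not the selection rule used in the paper.
    """
    num_src, num_dst = similarity.shape
    k = min(k, num_dst)
    if k <= 0:
        return []

    # argpartition places the k largest entries of each row in the last
    # k positions (in arbitrary order) without fully sorting the row.
    top_idx = np.argpartition(similarity, num_dst - k, axis=1)[:, num_dst - k:]

    edges = []
    for i in range(num_src):
        for j in top_idx[i]:
            edges.append((i, int(j), float(similarity[i, j])))
    return edges


if __name__ == "__main__":
    # Toy scores for 3 source nodes and 4 target nodes (made-up values).
    sim = np.array([
        [0.90, 0.10, 0.80, 0.20],
        [0.30, 0.70, 0.40, 0.60],
        [0.50, 0.40, 0.10, 0.95],
    ])
    sparse_edges = build_topk_graph(sim, k=2)
    print(f"{len(sparse_edges)} edges kept out of {sim.size} possible")
```

With n source nodes the sparse graph stores at most n * k edges, compared with n * m for a fully-connected graph over m targets, which is the kind of storage saving the abstract alludes to.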

Related papers

Spatio-temporal Graph Learning on Adaptive Mined Key Frames for High-performance Multi-Object Tracking [5.746443489229576]
Key Frame Extraction (KFE) module leverages reinforcement learning to adaptively segment videos. Intra-Frame Feature Fusion (IFF) module uses a Graph Convolutional Network (GCN) to facilitate information exchange between the target and surrounding objects. Our proposed tracker achieves impressive results on the MOT17 dataset.
arXiv Detail & Related papers (2025-01-17T11:36:38Z)
Learning Long Range Dependencies on Graphs via Random Walks [6.7864586321550595]
Message-passing graph neural networks (GNNs) excel at capturing local relationships but struggle with long-range dependencies in graphs. Graph transformers (GTs) enable global information exchange but often oversimplify the graph structure by representing graphs as sets of fixed-length vectors. This work introduces a novel architecture that overcomes the shortcomings of both approaches by combining the long-range information of random walks with local message passing.
arXiv Detail & Related papers (2024-06-05T15:36:57Z)
ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking [11.619493960418176]
Multi-Camera Multi-Object Tracking (MC-MOT) utilizes information from multiple views to better handle problems with occlusion and crowded scenes. Current graph-based methods do not effectively utilize information regarding spatial and temporal consistency. We propose a novel reconfigurable graph model that first associates all detected objects across cameras spatially before reconfiguring it into a temporal graph.
arXiv Detail & Related papers (2023-08-25T08:02:04Z)
Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation. The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects. Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
Learnable Graph Matching: A Practical Paradigm for Data Association [74.28753343714858]
We propose a general learnable graph matching method to address these issues. Our method achieves state-of-the-art performance on several MOT datasets. For image matching, our method outperforms state-of-the-art methods on a popular indoor dataset, ScanNet.
arXiv Detail & Related papers (2023-03-27T17:39:00Z)
Unifying Short and Long-Term Tracking with Graph Hierarchies [0.0]
We introduce SUSHI, a unified and scalable multi-object tracker. Our approach processes long clips by splitting them into a hierarchy of subclips, which enables high scalability. We leverage graph neural networks to process all levels of the hierarchy, which makes our model unified across temporal scales and highly general.
arXiv Detail & Related papers (2022-12-06T15:12:53Z)
Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision. A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive. We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of the Multiple Object Tracking (MOT) task. Existing methods mostly ignore the context information among tracklets and intra-frame detections. We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking. Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
TrackMPNN: A Message Passing Graph Neural Architecture for Multi-Object Tracking [8.791710193028903]
This study follows many previous approaches to multi-object tracking (MOT) that model the problem using graph-based data structures. We create a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps. We also provide solutions and propositions for the computational problems that need to be addressed to create a memory-efficient, real-time, online algorithm.
arXiv Detail & Related papers (2021-01-11T21:52:25Z)
GCNNMatch: Graph Convolutional Neural Networks for Multi-Object Tracking via Sinkhorn Normalization [5.705895203925818]
This paper proposes a novel method for online Multi-Object Tracking (MOT) using Graph Convolutional Neural Network (GCNN) based feature extraction and end-to-end feature matching for object association. The graph-based approach incorporates both the appearance and geometry of objects at past frames as well as the current frame into the task of feature learning.
arXiv Detail & Related papers (2020-09-30T19:18:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.