Spatial-Temporal Multi-Cuts for Online Multiple-Camera Vehicle Tracking
- URL: http://arxiv.org/abs/2410.02638v1
- Date: Thu, 3 Oct 2024 16:23:33 GMT
- Title: Spatial-Temporal Multi-Cuts for Online Multiple-Camera Vehicle Tracking
- Authors: Fabian Herzog, Johannes Gilg, Philipp Wolters, Torben Teepe, Gerhard Rigoll,
- Abstract summary: We introduce a graph representation that allows spatial-temporal clustering in a single, combined step.
By keeping sparse appearance and positional cues of all detections in a cluster, our method can compare clusters based on the strongest available evidence.
Our method does not require any training on the target scene, pre-extraction of single-camera tracks, or additional annotations.
- Score: 5.679775668038154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate online multiple-camera vehicle tracking is essential for intelligent transportation systems, autonomous driving, and smart city applications. Like single-camera multiple-object tracking, it is commonly formulated as a graph problem of tracking-by-detection. Within this framework, existing online methods usually consist of two-stage procedures that cluster temporally first, then spatially, or vice versa. This is computationally expensive and prone to error accumulation. We introduce a graph representation that allows spatial-temporal clustering in a single, combined step: New detections are spatially and temporally connected with existing clusters. By keeping sparse appearance and positional cues of all detections in a cluster, our method can compare clusters based on the strongest available evidence. The final tracks are obtained online using a simple multicut assignment procedure. Our method does not require any training on the target scene, pre-extraction of single-camera tracks, or additional annotations. Notably, we outperform the online state-of-the-art on the CityFlow dataset in terms of IDF1 by more than 14%, and on the Synthehicle dataset by more than 25%, respectively. The code is publicly available.
Related papers
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z) - Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z) - Hard Regularization to Prevent Deep Online Clustering Collapse without
Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z) - QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple
Object Tracking [73.52284039530261]
We present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning.
We find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time for object association.
We show that our similarity learning scheme is not limited to video data, but can learn effective instance similarity even from static input.
arXiv Detail & Related papers (2022-10-12T15:47:36Z) - Exploring Simple 3D Multi-Object Tracking for Autonomous Driving [10.921208239968827]
3D multi-object tracking in LiDAR point clouds is a key ingredient for self-driving vehicles.
Existing methods are predominantly based on the tracking-by-detection pipeline and inevitably require a matching step for the detection association.
We present SimTrack to simplify the hand-crafted tracking paradigm by proposing an end-to-end trainable model for joint detection and tracking from raw point clouds.
arXiv Detail & Related papers (2021-08-23T17:59:22Z) - Online Clustering-based Multi-Camera Vehicle Tracking in Scenarios with
overlapping FOVs [2.6365690297272617]
Multi-Target Multi-Camera (MTMC) vehicle tracking is an essential task of visual traffic monitoring.
We present a new low-latency online approach for MTMC tracking in scenarios with partially overlapping fields of view.
arXiv Detail & Related papers (2021-02-08T09:55:55Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - MVLidarNet: Real-Time Multi-Class Scene Understanding for Autonomous
Driving Using Multiple Views [60.538802124885414]
We present Multi-View LidarNet (MVLidarNet), a two-stage deep neural network for multi-class object detection and drivable space segmentation.
MVLidarNet is able to detect and classify objects while simultaneously determining the drivable space using a single LiDAR scan as input.
We show results on both KITTI and a much larger internal dataset, thus demonstrating the method's ability to scale by an order of magnitude.
arXiv Detail & Related papers (2020-06-09T21:28:17Z) - RetinaTrack: Online Single Stage Joint Detection and Tracking [22.351109024452462]
We focus on the tracking-by-detection paradigm for autonomous driving where both tasks are mission critical.
We propose a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach.
arXiv Detail & Related papers (2020-03-30T23:46:29Z) - Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted
Multicuts [11.72025865314187]
We present an unsupervised multiple object tracking approach based on minimum visual features and lifted multicuts.
We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.
arXiv Detail & Related papers (2020-02-04T09:42:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.