MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark
- URL: http://arxiv.org/abs/2403.20225v1
- Date: Fri, 29 Mar 2024 15:08:37 GMT
- Title: MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark
- Authors: Sanghyun Woo, Kwanyong Park, Inkyu Shin, Myungchul Kim, In So Kweon
- Abstract summary: Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras.
Existing datasets for this task are either synthetically generated or artificially constructed within a controlled camera network setting.
We present MTMMC, a real-world, large-scale dataset that includes long video sequences captured by 16 multi-modal cameras in two different environments.
- Score: 63.878793340338035
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras. This task has practical applications in various fields, such as visual surveillance, crowd behavior analysis, and anomaly detection. However, due to the difficulty and cost of collecting and labeling data, existing datasets for this task are either synthetically generated or artificially constructed within a controlled camera network setting, which limits their ability to model real-world dynamics and generalize to diverse camera configurations. To address this issue, we present MTMMC, a real-world, large-scale dataset that includes long video sequences captured by 16 multi-modal cameras in two different environments - campus and factory - across various time, weather, and season conditions. This dataset provides a challenging test-bed for studying multi-camera tracking under diverse real-world complexities and includes an additional input modality of spatially aligned and temporally synchronized RGB and thermal cameras, which enhances the accuracy of multi-camera tracking. MTMMC is a super-set of existing datasets, benefiting independent fields such as person detection, re-identification, and multiple object tracking. We provide baselines and new learning setups on this dataset and set the reference scores for future studies. The datasets, models, and test server will be made publicly available.
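Since the abstract highlights spatially aligned, temporally synchronized RGB and thermal streams as the extra modality, the following minimal sketch shows one generic way such paired crops could be fused into a single appearance descriptor for cross-camera matching. The encoder layout, feature sizes, and late-fusion-by-concatenation choice are illustrative assumptions, not the MTMMC baseline.

```python
# Hypothetical late-fusion appearance model for paired RGB + thermal person crops.
# Illustrative sketch only: architecture and dimensions are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Small CNN mapping an image crop to a fixed-length, L2-normalized embedding."""
    def __init__(self, in_channels: int, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

rgb_enc = TinyEncoder(in_channels=3)      # RGB crop encoder
thermal_enc = TinyEncoder(in_channels=1)  # thermal crop encoder

def appearance_embedding(rgb_crop, thermal_crop):
    """Concatenate per-modality embeddings into one appearance descriptor."""
    feat = torch.cat([rgb_enc(rgb_crop), thermal_enc(thermal_crop)], dim=-1)
    return F.normalize(feat, dim=-1)

# Cross-camera matching: cosine similarity between descriptors from two views.
a = appearance_embedding(torch.rand(1, 3, 128, 64), torch.rand(1, 1, 128, 64))
b = appearance_embedding(torch.rand(1, 3, 128, 64), torch.rand(1, 1, 128, 64))
similarity = F.cosine_similarity(a, b)    # higher -> more likely the same person
```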
Related papers
- MCTR: Multi Camera Tracking Transformer [45.66952089591361]
Multi-Camera Tracking tRansformer (MCTR) is a novel end-to-end approach tailored for multi-object detection and tracking across multiple cameras.
MCTR leverages end-to-end detectors like DEtector TRansformer (DETR) to produce detections and detection embeddings independently for each camera view.
The framework maintains a set of track embeddings that encapsulate global information about the tracked objects, and updates them at every frame by integrating local information from the view-specific detection embeddings.
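As a rough illustration of this kind of per-frame update (a sketch under assumed shapes, using a generic cross-attention layer rather than MCTR's actual module), global track embeddings can attend to detection embeddings pooled across camera views:

```python
# Hypothetical update of track embeddings from per-camera detection embeddings
# via cross-attention. Shapes, dimensions, and the residual update are
# assumptions for illustration; MCTR's real module may differ.
import torch
import torch.nn as nn

dim, num_tracks = 256, 8
cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)

track_emb = torch.rand(1, num_tracks, dim)        # global track queries
detections_per_cam = [torch.rand(1, 5, dim),      # camera 0: 5 detection embeddings
                      torch.rand(1, 3, dim)]      # camera 1: 3 detection embeddings
det_emb = torch.cat(detections_per_cam, dim=1)    # pool detections from all views

# Tracks query the pooled detections; the attention output refines each track.
updated, attn_weights = cross_attn(query=track_emb, key=det_emb, value=det_emb)
track_emb = track_emb + updated                   # residual per-frame update
```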
arXiv Detail & Related papers (2024-08-23T17:37:03Z)
- GMT: A Robust Global Association Model for Multi-Target Multi-Camera Tracking [13.305411087116635]
We propose a global online MTMC tracking model that addresses the dependency on the first tracking stage in two-step methods and enhances cross-camera matching.
Specifically, we propose a transformer-based global MTMC association module to explore target associations across different cameras and frames.
To accommodate high scene diversity and complex lighting condition variations, we have established the VisionTrack dataset.
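For context, the sketch below shows only the generic association step such a tracker ultimately has to solve: matching track and detection embeddings across cameras via an affinity matrix and an assignment solver. It does not reproduce GMT's transformer-based association module, and the similarity threshold is an assumed constant.

```python
# Generic cross-camera association on embedding affinities. Simplified sketch
# (cosine affinity + Hungarian assignment), not GMT's transformer module.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks: np.ndarray, detections: np.ndarray, min_sim: float = 0.4):
    """Match track and detection embeddings (rows) by maximizing total affinity."""
    tracks = tracks / np.linalg.norm(tracks, axis=1, keepdims=True)
    detections = detections / np.linalg.norm(detections, axis=1, keepdims=True)
    affinity = tracks @ detections.T               # cosine similarity matrix
    rows, cols = linear_sum_assignment(-affinity)  # maximize summed affinity
    # Keep only confident pairs; the rest remain unmatched or spawn new tracks.
    return [(r, c) for r, c in zip(rows, cols) if affinity[r, c] >= min_sim]

matches = associate(np.random.rand(6, 128), np.random.rand(9, 128))
```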
arXiv Detail & Related papers (2024-07-01T06:39:14Z)
- Learning Online Policies for Person Tracking in Multi-View Environments [4.62316736194615]
We introduce MVSparse, a novel framework for cooperative multi-person tracking across multiple synchronized cameras.
The MVSparse system consists of a carefully orchestrated pipeline that combines edge server-based models with distributed lightweight Reinforcement Learning (RL) agents.
Notably, our contributions include an empirical analysis of multi-camera pedestrian tracking datasets, the development of a multi-camera, multi-person detection pipeline, and the implementation of MVSparse.
arXiv Detail & Related papers (2023-12-26T02:57:11Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatio-temporal features from the event streams associated with objects.
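To give a feel for why spiking activations are sparse, here is a minimal leaky integrate-and-fire (LIF) layer applied to a time-binned event stream. This is a textbook LIF sketch, not SpikeMOT's architecture; the decay, threshold, and input binning are all assumptions.

```python
# Generic leaky integrate-and-fire (LIF) dynamics over a time-binned event stream.
# Not SpikeMOT's network; constants and input format are assumptions.
import numpy as np

def lif_features(event_bins: np.ndarray, decay: float = 0.9, threshold: float = 1.0):
    """event_bins: (T, H, W) event counts per time bin -> (T, H, W) binary spikes."""
    potential = np.zeros(event_bins.shape[1:], dtype=np.float32)
    spikes = np.zeros_like(event_bins, dtype=np.float32)
    for t, frame in enumerate(event_bins):
        potential = decay * potential + frame   # leaky integration of events
        fired = potential >= threshold          # neurons crossing the threshold
        spikes[t][fired] = 1.0
        potential[fired] = 0.0                  # reset membrane after spiking
    return spikes                               # sparse spatio-temporal feature map

spike_map = lif_features(np.random.poisson(0.2, size=(16, 64, 64)).astype(np.float32))
```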
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, toward class-agnostic tracking that also performs well for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Synthehicle: Multi-Vehicle Multi-Camera Tracking in Virtual Cities [4.4855664250147465]
We present a massive synthetic dataset for multiple vehicle tracking and segmentation in multiple overlapping and non-overlapping camera views.
The dataset consists of 17 hours of labeled video material, recorded from 340 cameras across 64 diverse scenes spanning day, rain, dawn, and night conditions.
arXiv Detail & Related papers (2022-08-30T11:36:07Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT).
It takes both 2D panorama images and 3D point clouds as input and infers target trajectories from the multimodal data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
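One prerequisite for fusing panorama images with point clouds is relating the two geometrically. The sketch below shows a standard equirectangular projection of 3D points into panorama pixel coordinates; it is not MMPAT's fusion scheme and assumes the points are already expressed in the panoramic camera frame.

```python
# Project 3D points into an equirectangular panorama so point-cloud and image
# detections can be associated. Standard-geometry sketch, not MMPAT's module;
# assumes points are in the panorama camera frame and a y-down convention.
import numpy as np

def project_to_panorama(points: np.ndarray, width: int, height: int) -> np.ndarray:
    """points: (N, 3) xyz in the camera frame -> (N, 2) pixel coordinates (u, v)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    azimuth = np.arctan2(x, z)                    # [-pi, pi), 0 = straight ahead
    elevation = np.arcsin(np.clip(y / np.maximum(r, 1e-9), -1.0, 1.0))
    u = (azimuth / (2 * np.pi) + 0.5) * width     # horizontal pixel coordinate
    v = (0.5 - elevation / np.pi) * height        # vertical pixel coordinate
    return np.stack([u, v], axis=1)

pixels = project_to_panorama(np.random.randn(100, 3), width=1920, height=960)
```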