Real-time 3D Deep Multi-Camera Tracking
- URL: http://arxiv.org/abs/2003.11753v1
- Date: Thu, 26 Mar 2020 06:08:19 GMT
- Title: Real-time 3D Deep Multi-Camera Tracking
- Authors: Quanzeng You, Hao Jiang
- Abstract summary: We propose a novel end-to-end tracking pipeline, Deep Multi-Camera Tracking (DMCT), which achieves reliable real-time multi-camera people tracking.
Our system achieves state-of-the-art tracking results while maintaining real-time performance.
- Score: 13.494550690138775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tracking a crowd in 3D using multiple RGB cameras is a challenging task. Most
previous multi-camera tracking algorithms are designed for offline settings and
have high computational complexity. Robust real-time multi-camera 3D tracking
is still an unsolved problem. In this work, we propose a novel end-to-end
tracking pipeline, Deep Multi-Camera Tracking (DMCT), which achieves reliable
real-time multi-camera people tracking. Our DMCT consists of 1) a fast and
novel perspective-aware Deep GroundPoint Network, 2) a fusion procedure for
ground-plane occupancy heatmap estimation, 3) a novel Deep Glimpse Network for
person detection and 4) a fast and accurate online tracker. Our design fully
unleashes the power of deep neural networks to estimate the "ground point" of
each person in each color image, which can be optimized to run efficiently and
robustly. Our fusion procedure, glimpse network and tracker merge the results
from different views, find people candidates using multiple video frames and
then track people on the fused heatmap. Our system achieves state-of-the-art
tracking results while maintaining real-time performance.
Apart from evaluation on the challenging WILDTRACK dataset, we also collect two
more tracking datasets with high-quality labels from two different environments
and camera settings. Our experimental results confirm that our proposed
real-time pipeline gives superior results to previous approaches.
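The fusion-then-track idea in the abstract — merge per-view ground-plane occupancy heatmaps, then find people as peaks on the fused map — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the geometric-mean fusion, thresholds, and function names are assumptions, and the per-camera heatmaps are assumed to be already warped into a common ground-plane grid.

```python
import numpy as np

def fuse_ground_heatmaps(view_heatmaps, eps=1e-6):
    """Fuse per-camera ground-plane occupancy heatmaps.

    Each heatmap is an (H, W) grid with values in [0, 1], already
    warped into a shared ground-plane coordinate frame. A geometric
    mean rewards cells that several views agree on.
    """
    stack = np.clip(np.stack(view_heatmaps), eps, 1.0)   # (V, H, W)
    return np.exp(np.log(stack).mean(axis=0))            # (H, W)

def detect_peaks(heatmap, threshold=0.5, radius=2):
    """Greedy non-maximum suppression: take the strongest cell,
    zero out its neighborhood, repeat until below threshold."""
    h = heatmap.copy()
    peaks = []
    while True:
        idx = np.unravel_index(np.argmax(h), h.shape)
        if h[idx] < threshold:
            break
        peaks.append(idx)
        h[max(idx[0] - radius, 0):idx[0] + radius + 1,
          max(idx[1] - radius, 0):idx[1] + radius + 1] = 0.0
    return peaks
```

Product-style fusion suppresses single-view false positives, since a ghost detection seen by only one camera scores near zero once the other views' low responses are multiplied in.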
Related papers
- DELTA: Dense Efficient Long-range 3D Tracking for any video [82.26753323263009]
We introduce DELTA, a novel method that efficiently tracks every pixel in 3D space, enabling accurate motion estimation across entire videos.
Our approach leverages a joint global-local attention mechanism for reduced-resolution tracking, followed by a transformer-based upsampler to achieve high-resolution predictions.
Our method provides a robust solution for applications requiring fine-grained, long-term motion tracking in 3D space.
arXiv Detail & Related papers (2024-10-31T17:59:01Z)
- RockTrack: A 3D Robust Multi-Camera Multi-Object Tracking Framework [28.359633046753228]
We propose RockTrack, a 3D MOT method for multi-camera detectors.
RockTrack incorporates a confidence-guided preprocessing module to extract reliable motion and image observations.
RockTrack achieves state-of-the-art performance on the nuScenes vision-only tracking leaderboard with 59.1% AMOTA.
arXiv Detail & Related papers (2024-09-18T07:08:08Z)
- ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes.
In 3D scenarios, it is much easier for the tracker to predict object velocities in world coordinates.
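The hierarchical association strategy summarized above — match tracks to high-score boxes first, then give low-score boxes a second chance to recover occluded objects — can be sketched roughly as follows. Greedy IoU matching stands in for the actual matching step; all names and thresholds are illustrative, not ByteTrackV2's.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(x2 - x1, 0) * max(y2 - y1, 0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def greedy_match(tracks, boxes, thresh):
    """Greedily pair each track with its best-IoU unused box."""
    matches, used = [], set()
    for ti, t in enumerate(tracks):
        cand = [(iou(t, b), bi) for bi, b in enumerate(boxes) if bi not in used]
        if cand:
            s, bi = max(cand)
            if s >= thresh:
                matches.append((ti, bi))
                used.add(bi)
    return matches

def hierarchical_associate(tracks, detections, score_hi=0.6, iou_thresh=0.5):
    """Two-stage association in the spirit of ByteTrack: match tracks
    against high-score boxes first, then match the leftover tracks
    against low-score boxes instead of discarding them."""
    high = [d for d in detections if d[4] >= score_hi]
    low = [d for d in detections if d[4] < score_hi]
    first = greedy_match(tracks, [d[:4] for d in high], iou_thresh)
    rest = [ti for ti in range(len(tracks))
            if ti not in {t for t, _ in first}]
    second = greedy_match([tracks[ti] for ti in rest],
                          [d[:4] for d in low], iou_thresh)
    return ([(ti, high[bi]) for ti, bi in first] +
            [(rest[ti], low[bi]) for ti, bi in second])
```

The key design point is that low-score boxes are not thrown away: an occluded person often yields a weak detection that still overlaps the predicted track box well.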
arXiv Detail & Related papers (2023-03-27T15:35:21Z)
- Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and Tracking [53.64390261936975]
We present Minkowski Tracker, a sparse spatio-temporal R-CNN that jointly solves object detection and tracking problems.
Inspired by region-based CNN (R-CNN), we propose to track motion as a second stage of the object detector R-CNN.
We show in large-scale experiments that the overall performance gain of our method is due to four factors.
arXiv Detail & Related papers (2022-08-22T04:47:40Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark [40.363608495563305]
We provide a large-scale densely-labeled multi-camera tracking dataset in five different environments with the help of an auto-annotation system.
The 3D tracking results are projected to each RGB camera view using camera parameters to create 2D tracking results.
This dataset provides a more reliable benchmark of multi-camera, multi-object tracking systems in cluttered and crowded environments.
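Projecting 3D tracking results into an RGB camera view with known camera parameters, as described above, is a standard pinhole-model operation. A minimal sketch, assuming world-frame points and given intrinsics K and extrinsics R, t (the function name is illustrative):

```python
import numpy as np

def project_to_view(points_3d, K, R, t):
    """Project world-frame 3D points (N, 3) into pixel coordinates
    with a pinhole camera model: x_cam = R @ X + t, then apply the
    intrinsic matrix K and divide by depth."""
    cam = (R @ points_3d.T).T + t      # (N, 3) in the camera frame
    uv = (K @ cam.T).T                 # homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]      # perspective divide
```

In practice one would also drop points with non-positive depth and clip to the image bounds before writing out 2D tracking boxes.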
arXiv Detail & Related papers (2021-11-30T06:29:14Z)
- LMGP: Lifted Multicut Meets Geometry Projections for Multi-Camera Multi-Object Tracking [42.87953709286856]
Multi-Camera Multi-Object Tracking is currently drawing attention in the computer vision field due to its superior performance in real-world applications.
We propose a mathematically elegant multi-camera multiple object tracking approach based on a spatial-temporal lifted multicut formulation.
arXiv Detail & Related papers (2021-11-23T14:09:47Z)
- CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking [9.62721286522053]
We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion.
Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association.
We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
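A greedy association step of the kind mentioned above can be sketched as follows: sort all track–detection pairs by center distance and commit each track and each detection at most once. The gating radius and names are illustrative, not CFTrack's actual parameters.

```python
def greedy_center_match(track_centers, det_centers, max_dist):
    """Associate detections to tracks by ascending center distance,
    a cheap alternative to Hungarian matching used by center-based
    trackers. Returns (track_index, detection_index) pairs."""
    pairs = sorted(
        (((tx - dx) ** 2 + (ty - dy) ** 2) ** 0.5, ti, di)
        for ti, (tx, ty) in enumerate(track_centers)
        for di, (dx, dy) in enumerate(det_centers)
    )
    used_t, used_d, matches = set(), set(), []
    for dist, ti, di in pairs:
        if dist > max_dist:
            break                      # remaining pairs are farther still
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches
```

Greedy matching is O(n² log n) in the number of pairs and trivially real-time for typical object counts, which is why it is a popular substitute for optimal assignment in online trackers.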
arXiv Detail & Related papers (2021-07-11T23:56:53Z)
- Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net [93.51773847125014]
We propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor.
Our approach performs 3D convolutions across space and time over a bird's eye view representation of the 3D world.
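The bird's-eye-view representation such spatio-temporal networks operate on can be illustrated by rasterizing each frame's point cloud into a 2D occupancy grid and stacking frames along time. The grid extents and cell size below are arbitrary choices for illustration, not the paper's configuration.

```python
import numpy as np

def bev_occupancy(points, x_range=(-40.0, 40.0), y_range=(-40.0, 40.0),
                  cell=0.5):
    """Rasterize a 3D point cloud (N, 3) into a 2D bird's-eye-view
    occupancy grid, dropping height. Points outside the range are
    ignored."""
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    grid = np.zeros((h, w), dtype=np.float32)
    xi = ((points[:, 0] - x_range[0]) / cell).astype(int)
    yi = ((points[:, 1] - y_range[0]) / cell).astype(int)
    ok = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    grid[yi[ok], xi[ok]] = 1.0
    return grid

def stack_frames(frames):
    """Stack per-frame BEV grids into a (T, H, W) tensor so one
    network can convolve jointly across space and time."""
    return np.stack(frames)
```

Convolving over the stacked (T, H, W) tensor is what lets a single network see motion directly, so detection, tracking, and forecasting share the same features.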
arXiv Detail & Related papers (2020-12-22T22:43:35Z)
- Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm.
We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)
- JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset [34.609125601292]
We present JRMOT, a novel 3D MOT system that integrates information from RGB images and 3D point clouds to achieve real-time tracking performance.
As part of our work, we release the JRDB dataset, a novel large scale 2D+3D dataset and benchmark.
The presented 3D MOT system demonstrates state-of-the-art performance against competing methods on the popular KITTI 2D tracking benchmark.
arXiv Detail & Related papers (2020-02-19T19:21:33Z)
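Real-time 3D trackers like those surveyed above typically maintain a per-track motion model between detections; a constant-velocity Kalman filter over 3D position is the most common building block. This is a generic sketch, not JRMOT's actual filter, and all noise parameters are illustrative.

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter over 3D position.
    State vector: [x, y, z, vx, vy, vz]."""

    def __init__(self, pos, dt=0.1, q=1e-2, r=1e-1):
        self.x = np.concatenate([np.asarray(pos, float), np.zeros(3)])
        self.P = np.eye(6)                       # state covariance
        self.F = np.eye(6)                       # transition model
        self.F[:3, 3:] = dt * np.eye(3)          # pos += vel * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe pos
        self.Q = q * np.eye(6)                   # process noise
        self.R = r * np.eye(3)                   # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, float) - self.H @ self.x)
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]
```

The predicted position is what gets fed into the association step each frame; the update step then folds in whichever detection (image-based, point-cloud-based, or fused) the tracker assigned to the track.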
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.