DynamicTrack: Advancing Gigapixel Tracking in Crowded Scenes
- URL: http://arxiv.org/abs/2407.18637v1
- Date: Fri, 26 Jul 2024 10:08:01 GMT
- Title: DynamicTrack: Advancing Gigapixel Tracking in Crowded Scenes
- Authors: Yunqi Zhao, Yuchen Guo, Zheng Cao, Kai Ni, Ruqi Huang, Lu Fang
- Abstract summary: We introduce DynamicTrack, a dynamic tracking framework designed to address gigapixel tracking challenges in crowded scenes.
In particular, we propose a dynamic detector that utilizes contrastive learning to jointly detect the head and body of pedestrians.
- Score: 29.98165509387273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tracking in gigapixel scenarios holds numerous potential applications in video surveillance and pedestrian analysis. Existing algorithms attempt to perform tracking in crowded scenes by utilizing multiple cameras or group relationships. However, their performance significantly degrades when confronted with complex interaction and occlusion inherent in gigapixel images. In this paper, we introduce DynamicTrack, a dynamic tracking framework designed to address gigapixel tracking challenges in crowded scenes. In particular, we propose a dynamic detector that utilizes contrastive learning to jointly detect the head and body of pedestrians. Building upon this, we design a dynamic association algorithm that effectively utilizes head and body information for matching purposes. Extensive experiments show that our tracker achieves state-of-the-art performance on widely used tracking benchmarks specifically designed for gigapixel crowded scenes.
Related papers
- Analysis of Unstructured High-Density Crowded Scenes for Crowd Monitoring [55.2480439325792]
We are interested in developing an automated system for detection of organized movements in human crowds.
Computer vision algorithms can extract information from videos of crowded scenes.
We can estimate the number of participants in an organized cohort.
arXiv Detail & Related papers (2024-08-06T22:09:50Z)
- DenseTrack: Drone-based Crowd Tracking via Density-aware Motion-appearance Synergy [33.57923199717605]
Drone-based crowd tracking faces difficulties in accurately identifying and monitoring objects from an aerial perspective.
To address these challenges, we present the Density-aware Tracking (DenseTrack) framework.
DenseTrack capitalizes on crowd counting to precisely determine object locations, blending visual and motion cues to improve the tracking of small-scale objects.
arXiv Detail & Related papers (2024-07-24T13:39:07Z)
- Motion Segmentation for Neuromorphic Aerial Surveillance [42.04157319642197]
Event cameras offer superior temporal resolution, superior dynamic range, and minimal power requirements.
Unlike traditional frame-based sensors that capture redundant information at fixed intervals, event cameras asynchronously record pixel-level brightness changes.
We introduce a novel motion segmentation method that leverages self-supervised vision transformers on both event data and optical flow information.
arXiv Detail & Related papers (2024-05-24T04:36:13Z)
- EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving [64.58258341591929]
Auditory Referring Multi-Object Tracking (AR-MOT) is a challenging problem in autonomous driving.
We put forward EchoTrack, an end-to-end AR-MOT framework with dual-stream vision transformers.
We establish the first set of large-scale AR-MOT benchmarks.
arXiv Detail & Related papers (2024-02-28T12:50:16Z)
- Distractor-aware Event-based Tracking [45.07711356111249]
We propose a distractor-aware event-based tracker, named DANet, that introduces transformer modules into a Siamese network architecture.
Our model is mainly composed of a motion-aware network and a target-aware network, which simultaneously exploits both motion cues and object contours from event data.
Our DANet can be trained in an end-to-end manner without any post-processing and can run at over 80 FPS on a single V100.
arXiv Detail & Related papers (2023-10-22T05:50:20Z)
- Graph-Based Multi-Camera Soccer Player Tracker [1.6244541005112743]
The paper presents a multi-camera tracking method intended for tracking soccer players in long shot video recordings from multiple calibrated cameras installed around the playing field.
The large distance to the camera makes it difficult to visually distinguish individual players, which adversely affects the performance of traditional solutions.
Our method focuses on individual player dynamics and interactions between neighborhood players to improve tracking performance.
arXiv Detail & Related papers (2022-11-03T20:01:48Z)
- ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow.
A novel neural network architecture is proposed for processing irregular point trajectory data.
Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- Indoor Navigation Assistance for Visually Impaired People via Dynamic SLAM and Panoptic Segmentation with an RGB-D Sensor [25.36354262588248]
We propose an assistive system with an RGB-D sensor to detect dynamic information of a scene.
With sparse feature points extracted from images, poses of the user can be estimated.
Poses and speeds of tracked dynamic objects can be estimated and passed to the user through acoustic feedback.
arXiv Detail & Related papers (2022-04-03T20:19:15Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm.
We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.