Related papers: DynamicTrack: Advancing Gigapixel Tracking in Crowded Scenes

URL: http://arxiv.org/abs/2407.18637v1
Date: Fri, 26 Jul 2024 10:08:01 GMT
Title: DynamicTrack: Advancing Gigapixel Tracking in Crowded Scenes
Authors: Yunqi Zhao, Yuchen Guo, Zheng Cao, Kai Ni, Ruqi Huang, Lu Fang
Abstract summary: We introduce DynamicTrack, a dynamic tracking framework designed to address gigapixel tracking challenges in crowded scenes. In particular, we propose a dynamic detector that utilizes contrastive learning to jointly detect the head and body of pedestrians.
Score: 29.98165509387273
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tracking in gigapixel scenarios holds numerous potential applications in video surveillance and pedestrian analysis. Existing algorithms attempt to perform tracking in crowded scenes by utilizing multiple cameras or group relationships. However, their performance significantly degrades when confronted with complex interaction and occlusion inherent in gigapixel images. In this paper, we introduce DynamicTrack, a dynamic tracking framework designed to address gigapixel tracking challenges in crowded scenes. In particular, we propose a dynamic detector that utilizes contrastive learning to jointly detect the head and body of pedestrians. Building upon this, we design a dynamic association algorithm that effectively utilizes head and body information for matching purposes. Extensive experiments show that our tracker achieves state-of-the-art performance on widely used tracking benchmarks specifically designed for gigapixel crowded scenes.

Related papers

MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos [104.1338295060383]
We present a system that allows for accurate, fast, and robust estimation of camera parameters and depth maps from casual monocular videos of dynamic scenes. Our system is significantly more accurate and robust at camera pose and depth estimation when compared with prior and concurrent work.
arXiv Detail & Related papers (2024-12-05T18:59:42Z)
DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild [85.03973683867797]
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild. We show that the proposed method achieves state-of-the-art performance in terms of camera pose estimation even in complex dynamic challenge scenes.
arXiv Detail & Related papers (2024-11-20T13:01:16Z)
DenseTrack: Drone-based Crowd Tracking via Density-aware Motion-appearance Synergy [33.57923199717605]
Drone-based crowd tracking faces difficulties in accurately identifying and monitoring objects from an aerial perspective. To address these challenges, we present the Density-aware Tracking (DenseTrack) framework. DenseTrack capitalizes on crowd counting to precisely determine object locations, blending visual and motion cues to improve the tracking of small-scale objects.
arXiv Detail & Related papers (2024-07-24T13:39:07Z)
EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving [64.58258341591929]
Auditory Referring Multi-Object Tracking (AR-MOT) is a challenging problem in autonomous driving. We put forward EchoTrack, an end-to-end AR-MOT framework with dual-stream vision transformers. We establish the first set of large-scale AR-MOT benchmarks.
arXiv Detail & Related papers (2024-02-28T12:50:16Z)
Distractor-aware Event-based Tracking [45.07711356111249]
We propose a distractor-aware event-based tracker that introduces transformer modules into a Siamese network architecture (named DANet). Our model is mainly composed of a motion-aware network and a target-aware network, which simultaneously exploits both motion cues and object contours from event data. Our DANet can be trained in an end-to-end manner without any post-processing and can run at over 80 FPS on a single V100.
arXiv Detail & Related papers (2023-10-22T05:50:20Z)
Graph-Based Multi-Camera Soccer Player Tracker [1.6244541005112743]
The paper presents a multi-camera tracking method intended for tracking soccer players in long shot video recordings from multiple calibrated cameras installed around the playing field. The large distance to the camera makes it difficult to visually distinguish individual players, which adversely affects the performance of traditional solutions. Our method focuses on individual player dynamics and interactions between neighborhood players to improve tracking performance.
arXiv Detail & Related papers (2022-11-03T20:01:48Z)
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild [57.37891682117178]
We present a robust dense indirect structure-from-motion method for videos that is based on dense correspondence from pairwise optical flow. A novel neural network architecture is proposed for processing irregular point trajectory data. Experiments on MPI Sintel dataset show that our system produces significantly more accurate camera trajectories.
arXiv Detail & Related papers (2022-07-19T09:19:45Z)
Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams. Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
Indoor Navigation Assistance for Visually Impaired People via Dynamic SLAM and Panoptic Segmentation with an RGB-D Sensor [25.36354262588248]
We propose an assistive system with an RGB-D sensor to detect dynamic information of a scene. With sparse feature points extracted from images, poses of the user can be estimated. Poses and speed of tracked dynamic objects can also be estimated, which are passed to the users through acoustic feedback.
arXiv Detail & Related papers (2022-04-03T20:19:15Z)
Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training. We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm. We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.