Intelligent Querying for Target Tracking in Camera Networks using Deep
Q-Learning with n-Step Bootstrapping
- URL: http://arxiv.org/abs/2004.09632v1
- Date: Mon, 20 Apr 2020 20:49:52 GMT
- Title: Intelligent Querying for Target Tracking in Camera Networks using Deep
Q-Learning with n-Step Bootstrapping
- Authors: Anil Sharma, Saket Anand, and Sanjit K. Kaul
- Abstract summary: We formulate the target tracking problem in a camera network as an MDP and learn a reinforcement learning based policy that selects a camera for making a re-identification query.
The proposed approach to camera selection does not assume the knowledge of the camera network topology but the resulting policy implicitly learns it.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Surveillance camera networks are a useful infrastructure for various visual
analytics applications, where high-level inferences and predictions could be
made based on target tracking across the network. Most multi-camera tracking
works focus on target re-identification and trajectory association problems to
track the target. However, since camera networks can generate enormous amount
of video data, inefficient schemes for making re-identification or trajectory
association queries can incur prohibitively large computational requirements.
In this paper, we address the problem of intelligent scheduling of
re-identification queries in a multi-camera tracking setting. To this end, we
formulate the target tracking problem in a camera network as an MDP and learn a
reinforcement learning based policy that selects a camera for making a
re-identification query. The proposed approach to camera selection does not
assume the knowledge of the camera network topology but the resulting policy
implicitly learns it. We have also shown that such a policy can be learnt
directly from data. Using the NLPR MCT and the Duke MTMC multi-camera
multi-target tracking benchmarks, we empirically show that the proposed
approach substantially reduces the number of frames queried.
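The querying scheme in the abstract can be sketched in code. The snippet below pairs epsilon-greedy camera selection with an n-step Q-learning update, whose target is the discounted sum of n observed rewards plus the bootstrapped greedy value of the state n steps ahead. Everything concrete here is an assumption for exposition, not the paper's formulation: the state encoding (last camera to see the target, discretized time since then), the reward of +1 for a successful re-identification, all constants, and the tabular Q-function standing in for the deep Q-network.

```python
import random

import numpy as np

# Illustrative constants; the paper's state space, network, and rewards differ.
N_CAMERAS = 4
MAX_ELAPSED = 10   # discretized time-since-last-seen buckets
GAMMA = 0.9        # discount factor
ALPHA = 0.1        # learning rate
N_STEP = 3         # bootstrapping horizon

# Q[last_cam, elapsed, query_cam]: tabular stand-in for the deep Q-network.
Q = np.zeros((N_CAMERAS, MAX_ELAPSED, N_CAMERAS))

def select_camera(state, epsilon=0.1):
    """Epsilon-greedy choice of the camera to issue the next re-id query to."""
    if random.random() < epsilon:
        return random.randrange(N_CAMERAS)
    return int(np.argmax(Q[state]))

def n_step_update(window, bootstrap_state):
    """n-step Q-learning update for the oldest transition in the window.

    window: list of (state, action, reward) tuples, oldest first, at most
    N_STEP long; bootstrap_state: the state reached after the newest one.
    """
    s0, a0, _ = window[0]
    # Discounted sum of the rewards actually observed over the window ...
    g = sum((GAMMA ** k) * r for k, (_, _, r) in enumerate(window))
    # ... plus the bootstrapped greedy value of the state n steps ahead.
    g += (GAMMA ** len(window)) * np.max(Q[bootstrap_state])
    Q[s0][a0] += ALPHA * (g - Q[s0][a0])

# One illustrative update over an N_STEP window: the target was last seen by
# camera 0, querying camera 1 re-identified it (reward 1), and the next two
# queries found nothing (reward 0).
window = [((0, 0), 1, 1.0), ((1, 0), 2, 0.0), ((2, 0), 3, 0.0)]
n_step_update(window, (3, 0))
```

Bootstrapping over n steps rather than one lets the sparse reward of a successful re-identification propagate back through the preceding query decisions more quickly, which matches the motivation for n-step returns in the title.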
Related papers
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z)
- CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation [31.167405688707575]
We propose a framework for instance-level visual analysis on video frames.
It can simultaneously conduct object detection, instance segmentation, and multi-object tracking.
We evaluate the proposed method extensively on KITTI MOTS and MOTS Challenge datasets.
arXiv Detail & Related papers (2023-11-02T04:32:24Z)
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatiotemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- Learning to Select Camera Views: Efficient Multiview Understanding at Few Glances [59.34619548026885]
We propose a view selection approach that analyzes the target object or scenario from given views and selects the next best view for processing.
Our approach features a reinforcement learning based camera selection module, MVSelect, that not only selects views but also facilitates joint training with the task network.
arXiv Detail & Related papers (2023-03-10T18:59:10Z)
- Tracking Passengers and Baggage Items using Multiple Overhead Cameras at Security Checkpoints [2.021502591596062]
We introduce a novel framework to track multiple objects in overhead camera videos for airport checkpoint security scenarios.
We propose a Self-Supervised Learning (SSL) technique to provide the model information about instance segmentation uncertainty from overhead images.
Our results show that self-supervision improves object detection accuracy by up to 42% without increasing the inference time of the model.
arXiv Detail & Related papers (2022-12-31T12:57:09Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- Graph Neural Networks for Cross-Camera Data Association [3.490148531239259]
Cross-camera image data association is essential for many multi-camera computer vision tasks.
This paper proposes an efficient approach to cross-camera data association focused on a global solution.
arXiv Detail & Related papers (2022-01-17T09:52:39Z)
- Multi-target tracking for video surveillance using deep affinity network: a brief review [0.0]
Multi-target tracking (MTT) for video surveillance is one of the important and challenging tasks.
Deep learning models, loosely inspired by the workings of the human brain, are widely applied to this task.
arXiv Detail & Related papers (2021-10-29T10:44:26Z)
- Tracking by Joint Local and Global Search: A Target-aware Attention based Approach [63.50045332644818]
We propose a novel target-aware attention mechanism (termed TANet) to conduct joint local and global search for robust tracking.
Specifically, we extract the features of target object patch and continuous video frames, then we track and feed them into a decoder network to generate target-aware global attention maps.
In the tracking procedure, we integrate the target-aware attention with multiple trackers by exploring candidate search regions for robust tracking.
arXiv Detail & Related papers (2021-06-09T06:54:15Z)
- TDIOT: Target-driven Inference for Deep Video Object Tracking [0.2457872341625575]
In this work, we adopt the pre-trained Mask R-CNN deep object detector as the baseline.
We introduce a novel inference architecture placed on top of FPN-ResNet101 backbone of Mask R-CNN to jointly perform detection and tracking.
The proposed single object tracker, TDIOT, applies an appearance similarity-based temporal matching for data association.
arXiv Detail & Related papers (2021-03-19T20:45:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.