DR.VIC: Decomposition and Reasoning for Video Individual Counting
- URL: http://arxiv.org/abs/2203.12335v1
- Date: Wed, 23 Mar 2022 11:24:44 GMT
- Title: DR.VIC: Decomposition and Reasoning for Video Individual Counting
- Authors: Tao Han, Lei Bai, Junyu Gao, Qi Wang, Wanli Ouyang
- Abstract summary: We propose to conduct pedestrian counting from a new perspective - Video Individual Counting (VIC).
Instead of relying on the Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who existed in the first frame and the new pedestrians with separate identities in each following frame.
An end-to-end Decomposition and Reasoning Network (DRNet) is designed to predict the initial pedestrian count with the density estimation method and to infer the count of new pedestrians in each frame with differentiable optimal transport.
- Score: 93.12166351940242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pedestrian counting is a fundamental tool for understanding pedestrian
patterns and crowd flow analysis. Existing works (e.g., image-level pedestrian
counting, crossline crowd counting, etc.) either focus only on image-level
counting or are constrained by the manual annotation of lines. In this work, we
propose to conduct pedestrian counting from a new perspective - Video
Individual Counting (VIC), which counts the total number of individual
pedestrians in the given video (a person is only counted once). Instead of
relying on the Multiple Object Tracking (MOT) techniques, we propose to solve
the problem by decomposing all pedestrians into the initial pedestrians who
existed in the first frame and the new pedestrians with separate identities in
each following frame. Then, an end-to-end Decomposition and Reasoning Network
(DRNet) is designed to predict the initial pedestrian count with the density
estimation method and to infer the count of new pedestrians in each frame with
differentiable optimal transport. Extensive experiments are conducted on two
datasets with congested pedestrians and diverse scenes, showing that our method
counts individual pedestrians more accurately than the baselines.
Code: https://github.com/taohan10200/DRNet.
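The decomposition described in the abstract can be illustrated with a small sketch. This is not the DRNet implementation: it assumes each pedestrian is already represented by a feature vector (2D positions stand in for learned descriptors here), solves an entropic optimal transport problem with Sinkhorn iterations, and uses a "dustbin" row and column for unmatched people. The names `sinkhorn`, `count_inflow`, `video_individual_count`, and the parameters `dustbin_cost`, `eps`, and `iters` are all hypothetical choices made for this illustration. The initial count, which the paper obtains by density estimation, is passed in as a plain number.

```python
import math


def sinkhorn(cost, row_mass, col_mass, eps=0.05, iters=200):
    """Entropic-regularized optimal transport via Sinkhorn scaling (plain Python)."""
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    rows, cols = len(row_mass), len(col_mass)
    u, v = [1.0] * rows, [1.0] * cols
    for _ in range(iters):
        # Alternately rescale rows and columns to match the target masses.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(cols))
             for i in range(rows)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(rows))
             for j in range(cols)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(cols)] for i in range(rows)]


def count_inflow(prev_feats, cur_feats, dustbin_cost=0.5):
    """Count people in the current frame that match nobody in the previous frame."""
    m, n = len(prev_feats), len(cur_feats)
    if n == 0:
        return 0
    # Squared Euclidean distance as matching cost (an assumption of this sketch),
    # plus one dustbin column for people who left the scene.
    cost = [[sum((a - b) ** 2 for a, b in zip(p, c)) for c in cur_feats]
            + [dustbin_cost] for p in prev_feats]
    # Dustbin row: absorbs current-frame people with no counterpart (new arrivals).
    cost.append([dustbin_cost] * (n + 1))
    plan = sinkhorn(cost, [1.0] * m + [float(n)], [1.0] * n + [float(m)])
    # A current-frame person is "new" if most of their mass comes from the dustbin.
    return sum(1 for j in range(n) if plan[m][j] > 0.5)


def video_individual_count(initial_count, frames):
    """Total = people in the first frame + new people in each following frame."""
    total = initial_count
    for prev, cur in zip(frames, frames[1:]):
        total += count_inflow(prev, cur)
    return total


# Toy data: two people in frame 0, a third appears in frame 1, one leaves in frame 2.
frames = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.05, 0.0), (1.02, 0.0), (5.0, 5.0)],
    [(1.05, 0.0), (5.0, 5.1)],
]
print(video_individual_count(initial_count=2, frames=frames))
```

The toy sequence contains three distinct individuals, so counting every frame independently (2 + 3 + 2) would overcount, while the decomposition adds only the single new arrival to the initial count. People who leave are absorbed by the dustbin column and never subtracted, which matches the "a person is only counted once" definition of VIC.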
Related papers
- Multiview Detection with Cardboard Human Modeling [23.072791405965415]
We propose a new pedestrian representation scheme based on human point cloud modeling.
Specifically, using ray tracing for holistic human depth estimation, we model pedestrians as upright, thin cardboard point clouds on the ground.
arXiv Detail & Related papers (2022-07-05T12:47:26Z)
- STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes [78.95447086305381]
Accurately detecting and tracking pedestrians in 3D space is challenging due to large variations in rotations, poses and scales.
Existing benchmarks either only provide 2D annotations, or have limited 3D annotations with low-density pedestrian distribution.
We introduce a large-scale multimodal dataset, STCrowd, to better evaluate pedestrian perception algorithms in crowded scenarios.
arXiv Detail & Related papers (2022-04-03T08:26:07Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- Pedestrian Intention Prediction: A Multi-task Perspective [83.7135926821794]
In order to be globally deployed, autonomous cars must guarantee the safety of pedestrians.
This work tries to solve this problem by jointly predicting the intention and visual states of pedestrians.
The method is a recurrent neural network in a multi-task learning approach.
arXiv Detail & Related papers (2020-10-20T13:42:31Z)
- Completely Self-Supervised Crowd Counting via Distribution Matching [92.09218454377395]
We propose a complete self-supervision approach to training models for dense crowd counting.
The only input required to train, apart from a large set of unlabeled crowd images, is the approximate upper limit of the crowd count.
Our method builds on the idea that natural crowds follow a power law distribution, which can be leveraged to yield error signals for backpropagation.
arXiv Detail & Related papers (2020-09-14T13:20:12Z)
- Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians [97.45805377769354]
This paper presents a novel deep network termed Pedestrian-Interference Suppression Network (PISNet).
PISNet leverages a Query-Guided Attention Block (QGAB) to enhance the feature of the target in the gallery, under the guidance of the query.
Our method is evaluated on two new pedestrian-interference datasets and the results show that the proposed method performs favorably against existing Re-ID methods.
arXiv Detail & Related papers (2020-08-16T17:45:14Z)
- The Pedestrian Patterns Dataset [11.193504036335503]
The dataset was collected by repeatedly traversing the same three routes for one week starting at different specific timeslots.
The purpose of the dataset is to capture the patterns of social and pedestrian behavior along the traversed routes at different times.
arXiv Detail & Related papers (2020-01-06T23:58:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.