Rethinking Counting and Localization in Crowds: A Purely Point-Based
Framework
- URL: http://arxiv.org/abs/2107.12746v1
- Date: Tue, 27 Jul 2021 11:41:50 GMT
- Title: Rethinking Counting and Localization in Crowds: A Purely Point-Based
Framework
- Authors: Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai,
Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu
- Abstract summary: We propose a purely point-based framework for joint crowd counting and individual localization.
We design an intuitive solution under this framework, called the Point to Point Network (P2PNet).
- Score: 59.578339075658995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Localizing individuals in crowds is more in accordance with the practical
demands of subsequent high-level crowd analysis tasks than simply counting.
However, existing localization-based methods relying on intermediate
representations (i.e., density maps or pseudo boxes) serving as
learning targets are counter-intuitive and error-prone. In this paper, we
propose a purely point-based framework for joint crowd counting and individual
localization. For this framework, instead of merely reporting the absolute
counting error at image level, we propose a new metric, called density
Normalized Average Precision (nAP), to provide more comprehensive and more
precise performance evaluation. Moreover, we design an intuitive solution under
this framework, which is called Point to Point Network (P2PNet). P2PNet
discards superfluous steps and directly predicts a set of point proposals to
represent heads in an image, being consistent with the human annotation
results. By thorough analysis, we reveal the key step towards implementing such
a novel idea is to assign optimal learning targets for these proposals.
Therefore, we propose to conduct this crucial association in a one-to-one
matching manner using the Hungarian algorithm. The P2PNet not only
significantly surpasses state-of-the-art methods on popular counting
benchmarks, but also achieves promising localization accuracy. The code will
be available at: https://github.com/TencentYoutuResearch/CrowdCounting-P2PNet.
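The one-to-one association described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cost used here is plain Euclidean distance between points (an assumption; the paper's actual matching cost may include other terms), and the names `pairwise_cost`, `hungarian`, and `match_points` are hypothetical helpers written for this example. Ground-truth points are the rows, point proposals are the columns, and each ground-truth point receives exactly one proposal.

```python
import math


def pairwise_cost(gt_points, proposals):
    """Cost matrix: rows are ground-truth points, columns are proposals.

    Assumption for this sketch: cost is the Euclidean distance only.
    """
    return [[math.dist(g, p) for p in proposals] for g in gt_points]


def hungarian(cost):
    """Minimum-cost one-to-one assignment for an n x m matrix with n <= m.

    Returns a dict mapping each row index to its assigned column index.
    """
    if not cost:
        return {}
    n, m = len(cost), len(cost[0])
    assert n <= m, "need at least as many columns (proposals) as rows"
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path just found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return {p[j] - 1: j - 1 for j in range(1, m + 1) if p[j] != 0}


def match_points(gt_points, proposals):
    """Assign one proposal to every ground-truth point at minimum total cost."""
    return hungarian(pairwise_cost(gt_points, proposals))


# Illustrative data (made up for this sketch): two annotated heads, four proposals.
example_gt = [(10.0, 10.0), (50.0, 50.0)]
example_proposals = [(12.0, 11.0), (48.0, 52.0), (30.0, 30.0), (9.0, 40.0)]
matches = match_points(example_gt, example_proposals)
print(matches)
```

In this toy example each annotated point pairs with the proposal closest to it, and the two remaining proposals stay unmatched, so under this reading they would serve as negatives during training.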
Related papers
- Dense Center-Direction Regression for Object Counting and Localization with Point Supervision [1.9526430269580954]
We propose a novel approach termed CeDiRNet for point-supervised learning.
It uses a dense regression of directions pointing towards the nearest object centers.
We show that it outperforms the existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-26T17:49:27Z)
- CPR++: Object Localization via Single Coarse Point Supervision [55.8671776333499]
Coarse point refinement (CPR) is the first attempt to alleviate semantic variance from an algorithmic perspective.
CPR reduces semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.
CPR++ can obtain scale information and further reduce the semantic variance in a global region.
arXiv Detail & Related papers (2024-01-30T17:38:48Z)
- P2Net: A Post-Processing Network for Refining Semantic Segmentation of
LiDAR Point Cloud based on Consistency of Consecutive Frames [25.63934234109252]
We present a lightweight post-processing method to refine semantic segmentation results of point cloud sequences.
The network, which we call the P2Net, learns the consistency constraints between coincident points from consecutive frames after registration.
The effectiveness of the proposed method is validated by comparing the results predicted by two representative networks with and without the refinement by the post-processing network.
arXiv Detail & Related papers (2022-12-01T15:13:38Z)
- Dense Point Prediction: A Simple Baseline for Crowd Counting and
Localization [17.92958745980573]
We propose a simple yet effective crowd counting and localization network named SCALNet.
We consider those tasks as a pixel-wise dense prediction problem and integrate them into an end-to-end framework.
Experiments on the recent large-scale benchmark NWPU-Crowd show that our approach outperforms state-of-the-art methods by more than 5% in crowd localization and more than 10% in counting.
arXiv Detail & Related papers (2021-04-26T12:08:08Z)
- A Self-Training Approach for Point-Supervised Object Detection and
Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained with only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods under both detection and counting tasks.
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, for each guided point, different visual feature is extracted by the localization.
The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- LFD-ProtoNet: Prototypical Network Based on Local Fisher Discriminant
Analysis for Few-shot Learning [98.64231310584614]
The prototypical network (ProtoNet) is a few-shot learning framework that performs metric learning and classification using the distance to prototype representations of each class.
We show the usefulness of the proposed method by theoretically providing an expected risk bound and empirically demonstrating its superior classification accuracy on miniImageNet and tieredImageNet.
arXiv Detail & Related papers (2020-06-15T11:56:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.