Detection in Crowded Scenes: One Proposal, Multiple Predictions
- URL: http://arxiv.org/abs/2003.09163v2
- Date: Wed, 24 Jun 2020 14:59:34 GMT
- Title: Detection in Crowded Scenes: One Proposal, Multiple Predictions
- Authors: Xuangeng Chu, Anlin Zheng, Xiangyu Zhang, Jian Sun
- Abstract summary: We propose a proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
The key to our approach is to let each proposal predict a set of correlated instances, rather than a single one as in previous proposal-based frameworks.
Our detector obtains a 4.9% AP gain on the challenging CrowdHuman dataset and a 1.0% $\text{MR}^{-2}$ improvement on the CityPersons dataset.
- Score: 79.28850977968833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a simple yet effective proposal-based object detector, aiming at
detecting highly-overlapped instances in crowded scenes. The key of our
approach is to let each proposal predict a set of correlated instances rather
than a single one in previous proposal-based frameworks. Equipped with new
techniques such as EMD Loss and Set NMS, our detector can effectively handle
the difficulty of detecting highly overlapped objects. On a FPN-Res50 baseline,
our detector can obtain 4.9\% AP gains on challenging CrowdHuman dataset and
1.0\% $\text{MR}^{-2}$ improvements on CityPersons dataset, without bells and
whistles. Moreover, on less crowded datasets like COCO, our approach can still
achieve moderate improvement, suggesting the proposed method is robust to
crowdedness. Code and pre-trained models will be released at
https://github.com/megvii-model/CrowdDetection.
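
The abstract names Set NMS without describing it. The following is a minimal sketch, assuming Set NMS behaves like greedy NMS except that a box is never suppressed by another box predicted from the same proposal. That reading, the function names `iou` and `set_nms`, the `proposal_ids` argument, and the 0.5 threshold are all illustrative assumptions rather than the authors' released implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def set_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    proposal_ids: Sequence[int],
    iou_thresh: float = 0.5,  # assumed threshold, not taken from the source
) -> List[int]:
    """Greedy NMS that skips suppression between boxes of the same proposal.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        suppressed = False
        for j in keep:
            # Assumption: predictions from one proposal are meant to be
            # distinct instances, so they never suppress each other.
            if proposal_ids[i] == proposal_ids[j]:
                continue
            if iou(boxes[i], boxes[j]) > iou_thresh:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7, 0.6]
    # Boxes 0 and 1 come from the same (hypothetical) proposal.
    print(set_nms(boxes, scores, proposal_ids=[0, 0, 1, 2]))
    # Giving every box its own proposal id reduces this to plain greedy NMS.
    print(set_nms(boxes, scores, proposal_ids=[0, 1, 2, 3]))
```

In this toy input, the two heavily overlapping boxes that share a proposal both survive, while the overlapping box from a different proposal is suppressed as in ordinary NMS.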
Related papers
- Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection [48.707233614642796]
Single-frame infrared small target (SIRST) detection aims at separating small targets from clutter backgrounds on infrared images.
Deep learning based methods have achieved promising performance on SIRST detection, but at the cost of a large amount of training data.
We propose the first method to achieve SIRST detection with single-point supervision.
arXiv Detail & Related papers (2023-04-10T08:04:05Z)
- Progressive End-to-End Object Detection in Crowded Scenes [96.92416613336096]
Previous query-based detectors suffer from two drawbacks: first, multiple predictions will be inferred for a single object, typically in crowded scenes; second, the performance saturates as the depth of the decoding stage increases.
We propose a progressive prediction method to address these issues. Specifically, we first select accepted queries to generate true positive predictions, then refine the remaining noisy queries according to the previously accepted predictions.
Experiments show that our method can significantly boost the performance of query-based detectors in crowded scenes.
arXiv Detail & Related papers (2022-03-15T06:12:00Z)
- Video-based Person Re-identification without Bells and Whistles [49.51670583977911]
Video-based person re-identification (Re-ID) aims at matching the video tracklets with cropped video frames for identifying the pedestrians under different cameras.
There exists severe spatial and temporal misalignment for those cropped tracklets due to the imperfect detection and tracking results generated with obsolete methods.
We present a simple re-Detect and Link (DL) module which can effectively reduce this unexpected noise by applying deep learning-based detection and tracking to the cropped tracklets.
arXiv Detail & Related papers (2021-05-22T10:17:38Z)
- Learning a Proposal Classifier for Multiple Object Tracking [36.67900094433032]
We propose a novel proposal-based learnable framework, which models MOT as a proposal generation, proposal scoring and trajectory inference paradigm on an affinity graph.
We experimentally demonstrate that the proposed method achieves a clear performance improvement in both MOTA and IDF1 with respect to previous state-of-the-art on two public benchmarks.
arXiv Detail & Related papers (2021-03-14T10:46:54Z)
- A Systematic Evaluation of Object Detection Networks for Scientific Plots [17.882932963813985]
We train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset.
At the standard IOU setting of 0.5, most networks perform well with mAP scores greater than 80% in detecting the relatively simple objects in plots.
However, the performance drops drastically when evaluated at a stricter IOU of 0.9 with the best model giving a mAP of 35.70%.
arXiv Detail & Related papers (2020-07-05T05:30:53Z)
- Frustratingly Simple Few-Shot Object Detection [98.42824677627581]
We find that fine-tuning only the last layer of existing detectors on rare classes is crucial to the few-shot object detection task.
Such a simple approach outperforms the meta-learning methods by roughly 2 to 20 points on current benchmarks.
arXiv Detail & Related papers (2020-03-16T00:29:14Z)