DETR for Crowd Pedestrian Detection
- URL: http://arxiv.org/abs/2012.06785v3
- Date: Thu, 18 Feb 2021 09:46:22 GMT
- Title: DETR for Crowd Pedestrian Detection
- Authors: Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, and Zhidong Deng
- Abstract summary: The proposed detector, PED (Pedestrian End-to-end Detector), outperforms both previous EDs and the baseline Faster-RCNN on CityPersons and CrowdHuman.
It also achieves comparable performance with state-of-the-art pedestrian detection methods.
- Score: 114.00860636622949
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pedestrian detection in crowd scenes poses a challenging problem due to the
heuristically defined mapping from anchors to pedestrians and the conflict between
NMS and highly overlapped pedestrians. The recently proposed end-to-end
detectors (ED), DETR and Deformable DETR, replace hand-designed components such
as NMS and anchors using the transformer architecture, which gets rid of
duplicate predictions by computing all pairwise interactions between queries.
Inspired by these works, we explore their performance on crowd pedestrian
detection. Surprisingly, compared to Faster-RCNN with FPN, the results are
opposite to those obtained on COCO. Furthermore, the bipartite matching of ED
harms training efficiency because of the large number of ground truths in crowd
scenes. In this work, we identify the underlying causes of ED's poor
performance and propose a new decoder to address them. Moreover, we design a
mechanism to leverage the less occluded visible parts of pedestrians
specifically for ED, and achieve further improvements. A faster bipartite matching
algorithm is also introduced to make ED training on crowd datasets more
practical. The proposed detector PED (Pedestrian End-to-end Detector)
outperforms both previous EDs and the baseline Faster-RCNN on CityPersons and
CrowdHuman. It also achieves comparable performance with state-of-the-art
pedestrian detection methods. Code will be released soon.
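The one-to-one bipartite matching mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' code: it assumes a cost made only of the L1 distance between boxes and searches all assignments exhaustively, whereas DETR combines class, L1, and generalized IoU terms and solves the assignment with the Hungarian algorithm. The helper names `l1_cost` and `bipartite_match` and the box values are hypothetical.

```python
from itertools import permutations


def l1_cost(pred, gt):
    """L1 distance between two boxes given as (x1, y1, x2, y2)."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def bipartite_match(preds, gts):
    """Find the one-to-one assignment of predictions to ground truths
    with the lowest total cost.

    Returns (total_cost, assign), where assign[j] is the index of the
    prediction matched to ground truth j. Exhaustive search, so only
    usable for tiny inputs (assumption: illustration only).
    """
    best_cost, best_assign = float("inf"), None
    for assign in permutations(range(len(preds)), len(gts)):
        cost = sum(l1_cost(preds[i], gts[j]) for j, i in enumerate(assign))
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Two heavily overlapping pedestrians (made-up coordinates).
gts = [(10, 10, 50, 110), (30, 12, 70, 112)]
# Three predictions: one per pedestrian plus a near-duplicate of the first.
preds = [(31, 12, 71, 111), (11, 10, 50, 109), (12, 11, 51, 110)]

cost, assign = bipartite_match(preds, gts)
# Predictions left unmatched are trained as "no object", which is how
# duplicates are discouraged without NMS.
unmatched = [i for i in range(len(preds)) if i not in assign]
print(cost, assign, unmatched)
```

Each ground truth receives exactly one prediction, and the near-duplicate stays unmatched. The matching cost grows with the number of ground truths per image, which is the training-efficiency concern the abstract raises for crowd scenes.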
Related papers
- A PST Algorithm for FPs Suppression in Two-stage CNN Detection Methods [2.288618928064061]
This paper proposes a pedestrian-sensitive training algorithm to help two-stage CNN detection methods learn to distinguish the pedestrian and non-pedestrian samples.
With the help of the proposed algorithm, the detection accuracy of MetroNext, a smaller and accurate metro passenger detector, is further improved.
arXiv Detail & Related papers (2024-05-24T08:26:14Z)
- PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement [59.6260680005195]
We present a novel Person Search framework based on the Diffusion model, PSDiff.
PSDiff formulates the person search as a dual denoising process from noisy boxes and ReID embeddings to ground truths.
Following the new paradigm, we further design a new Collaborative Denoising Layer (CDL) to optimize detection and ReID sub-tasks in an iterative and collaborative way.
arXiv Detail & Related papers (2023-09-20T08:16:39Z)
- DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion [137.8749239614528]
We propose a new formulation of temporal action detection (TAD) with denoising diffusion, DiffTAD.
Taking as input random temporal proposals, it can yield action proposals accurately given an untrimmed long video.
arXiv Detail & Related papers (2023-03-27T00:40:52Z)
- Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning [64.78972193105443]
This paper presents a novel adversarial example (AE) detection framework for trustworthy predictions.
It performs the detection by distinguishing the AE's abnormal relation with its augmented versions.
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label.
arXiv Detail & Related papers (2022-08-31T08:18:44Z)
- Variational Pedestrian Detection [33.52588723666144]
We develop a unique perspective of pedestrian detection as a variational inference problem.
We formulate a novel and efficient algorithm for pedestrian detection by modeling the dense proposals as a latent variable.
Our method can also be flexibly applied to two-stage detectors, achieving notable performance enhancement.
arXiv Detail & Related papers (2021-04-26T08:06:41Z)
- Visible Feature Guidance for Crowd Pedestrian Detection [12.8128512764041]
We propose Visible Feature Guidance (VFG) for both training and inference.
During training, we adopt visible features to regress the visible bounding box and the full bounding box simultaneously.
Then we perform NMS only on visible bounding boxes to achieve the best fitting full box in inference.
arXiv Detail & Related papers (2020-08-23T08:52:52Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing [25.050500817717108]
The heavy occlusion between pedestrians poses great challenges to standard Non-Maximum Suppression (NMS).
This paper proposes a novel Representative Region NMS approach leveraging the less occluded visible parts, effectively removing the redundant boxes without bringing in many false positives.
Experiments on the challenging CrowdHuman and CityPersons benchmarks sufficiently validate the effectiveness of the proposed approach on pedestrian detection in the crowded situation.
arXiv Detail & Related papers (2020-03-28T06:33:54Z)
- Detection in Crowded Scenes: One Proposal, Multiple Predictions [79.28850977968833]
We propose a proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
The key to our approach is to let each proposal predict a set of correlated instances rather than a single one, as in previous proposal-based frameworks.
Our detector can obtain 4.9% AP gains on the challenging CrowdHuman dataset and 1.0% $\text{MR}^{-2}$ improvements on the CityPersons dataset.
arXiv Detail & Related papers (2020-03-20T09:48:53Z)
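Several entries above (Visible Feature Guidance, NMS by Representative Region) run suppression on the less occluded visible boxes rather than on the full-body boxes. The sketch below shows why that can help, using plain greedy NMS on made-up boxes. It is not any of the listed papers' implementations; the helper names `iou` and `nms`, the 0.5 threshold, and all coordinates are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its IoU with every already kept box is at most `thresh`.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


# Index 0: pedestrian A, index 1: pedestrian B (partly hidden behind A),
# index 2: a duplicate detection of A.
full_boxes = [(0, 0, 40, 100), (10, 0, 50, 100), (2, 0, 42, 100)]
visible_boxes = [(0, 0, 40, 100), (30, 0, 50, 60), (2, 0, 42, 100)]
scores = [0.9, 0.8, 0.7]

keep_full = nms(full_boxes, scores)        # B is suppressed together with the duplicate
keep_visible = nms(visible_boxes, scores)  # only the duplicate is suppressed
print(keep_full, keep_visible)
```

On full boxes the two pedestrians overlap with IoU 0.6, so pedestrian B is removed along with the true duplicate. On visible boxes the overlap drops well below the threshold, and the full boxes at the kept indices can then be reported as the final detections.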
This list is automatically generated from the titles and abstracts of the papers in this site.