Generalizable Pedestrian Detection: The Elephant In The Room
- URL: http://arxiv.org/abs/2003.08799v7
- Date: Wed, 9 Dec 2020 08:56:09 GMT
- Title: Generalizable Pedestrian Detection: The Elephant In The Room
- Authors: Irtiza Hasan, Shengcai Liao, Jinpeng Li, Saad Ullah Akram, and Ling Shao
- Abstract summary: We find that existing state-of-the-art pedestrian detectors, though they perform quite well when trained and tested on the same dataset, generalize poorly in cross-dataset evaluation.
We illustrate that diverse and dense datasets, collected by crawling the web, serve as an efficient source of pre-training for pedestrian detection.
- Score: 82.37430109152383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pedestrian detection is used in many vision based applications ranging from
video surveillance to autonomous driving. Despite achieving high performance,
it is still largely unknown how well existing detectors generalize to unseen
data. This is important because a practical detector should be ready to use in
a variety of application scenarios. To this end, we conduct a comprehensive
study in this paper, using a general principle of direct cross-dataset
evaluation. Through this study, we find that existing state-of-the-art
pedestrian detectors, though they perform quite well when trained and tested on
the same dataset, generalize poorly in cross-dataset evaluation. We demonstrate
that there are two reasons for this trend. Firstly, their designs (e.g. anchor
settings) may be biased towards popular benchmarks in the traditional
single-dataset training and test pipeline, which in turn largely limits their
generalization capability. Secondly, the training source is generally neither
dense in pedestrians nor diverse in scenarios. Under direct cross-dataset evaluation,
surprisingly, we find that a general purpose object detector, without
pedestrian-tailored adaptation in design, generalizes much better compared to
existing state-of-the-art pedestrian detectors. Furthermore, we illustrate that
diverse and dense datasets, collected by crawling the web, serve as an
efficient source of pre-training for pedestrian detection. Accordingly, we
propose a progressive training pipeline and find that it works well for
autonomous-driving oriented pedestrian detection. Consequently, the study
conducted in this paper suggests that more emphasis should be put on
cross-dataset evaluation for the future design of generalizable pedestrian
detectors. Code and models can be accessed at
https://github.com/hasanirtiza/Pedestron.
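The direct cross-dataset evaluation principle described in the abstract can be sketched as a small harness: train on one dataset, then score the unchanged model on every dataset, including the ones it never saw. This is an illustrative Python sketch only, not the Pedestron code. The names `cross_dataset_matrix`, `generalization_gap`, `train_fn`, and `eval_fn` are hypothetical, and the toy "detector" below (a mean predictor scored by absolute error) merely stands in for a real detector and a lower-is-better metric such as miss rate.

```python
from typing import Any, Callable, Dict, Tuple


def cross_dataset_matrix(
    datasets: Dict[str, Dict[str, Any]],
    train_fn: Callable[[Any], Any],
    eval_fn: Callable[[Any, Any], float],
) -> Dict[Tuple[str, str], float]:
    """Train once per source dataset, then evaluate on every test split.

    No fine-tuning happens on the target dataset, so the off-diagonal
    entries (source != target) measure generalization to unseen data.
    """
    results = {}
    for source, source_data in datasets.items():
        model = train_fn(source_data["train"])
        for target, target_data in datasets.items():
            results[(source, target)] = eval_fn(model, target_data["test"])
    return results


def generalization_gap(results: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Mean cross-dataset score minus within-dataset score, per source.

    Assumes at least two datasets and a lower-is-better metric, so a
    positive gap means the model degrades on unseen datasets.
    """
    gaps = {}
    for source in {s for s, _ in results}:
        cross = [v for (s, t), v in results.items() if s == source and t != source]
        gaps[source] = sum(cross) / len(cross) - results[(source, source)]
    return gaps


# Toy stand-ins (assumptions for illustration, not real detectors or data).
def toy_train(train_values):
    return sum(train_values) / len(train_values)  # "model" = mean of the data


def toy_eval(model, test_values):
    return sum(abs(x - model) for x in test_values) / len(test_values)


toy_datasets = {
    "A": {"train": [1, 1, 1], "test": [1, 1]},
    "B": {"train": [5, 5], "test": [5, 5]},
}

results = cross_dataset_matrix(toy_datasets, toy_train, toy_eval)
gaps = generalization_gap(results)
print(results)
print(gaps)
```

In this toy setting each model is perfect on its own dataset and poor on the other, which is the pattern the abstract reports for pedestrian-specific detectors, shown here in exaggerated form.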
Related papers
- Mean Height Aided Post-Processing for Pedestrian Detection [9.654938705603312]
We take the perspective effect in pedestrian datasets as an example and propose the mean height aided suppression for post-processing.
The proposed method is easy to implement and is plug-and-play.
The combination of mean height aided suppression with particular detectors outperforms state-of-the-art pedestrian detectors on the Caltech and CityPersons datasets.
arXiv Detail & Related papers (2024-08-24T18:20:47Z)
- Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank [51.66174565170112]
We propose a novel approach to construct a versatile pedestrian knowledge bank.
We extract pedestrian knowledge from a large-scale pretrained model.
We then curate this knowledge by quantizing the most representative features and guiding them to be distinguishable from background scenes.
arXiv Detail & Related papers (2024-04-30T07:01:05Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Pedestrian Detection: Domain Generalization, CNNs, Transformers and Beyond [82.37430109152383]
We show that current pedestrian detectors handle even small domain shifts poorly in cross-dataset evaluation.
We attribute the limited generalization to two main factors: the method and the current sources of data.
We propose a progressive fine-tuning strategy which improves generalization.
arXiv Detail & Related papers (2022-01-10T06:00:26Z)
- Self-supervised Audiovisual Representation Learning for Remote Sensing Data [96.23611272637943]
We propose a self-supervised approach for pre-training deep neural networks in remote sensing.
By exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery, pre-training is done in a completely label-free manner.
We show that our approach outperforms existing pre-training strategies for remote sensing imagery.
arXiv Detail & Related papers (2021-08-02T07:50:50Z)
- Benchmarking Intent Detection for Task-Oriented Dialog Systems [6.54201796167054]
Intent detection is a key component of modern goal-oriented dialog systems that accomplish a user task by predicting the intent of users' text input.
There are three primary challenges in designing robust and accurate intent detection models.
Our results show that Watson Assistant's intent detection model outperforms other commercial solutions.
arXiv Detail & Related papers (2020-12-07T18:58:57Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
- RetinaTrack: Online Single Stage Joint Detection and Tracking [22.351109024452462]
We focus on the tracking-by-detection paradigm for autonomous driving where both tasks are mission critical.
We propose a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach.
arXiv Detail & Related papers (2020-03-30T23:46:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.