SADet: Learning An Efficient and Accurate Pedestrian Detector
- URL: http://arxiv.org/abs/2007.13119v1
- Date: Sun, 26 Jul 2020 12:32:38 GMT
- Authors: Chubin Zhuang and Zhen Lei and Stan Z. Li
- Abstract summary: This paper proposes a series of systematic optimization strategies for the detection pipeline of one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it presents state-of-the-art result and real-time speed of $20$ FPS for VGA-resolution images.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although anchor-based detectors have taken a big step forward in
pedestrian detection, the overall performance of the algorithms still needs further
improvement for practical applications, \emph{e.g.}, a good trade-off between
accuracy and efficiency. To this end, this paper proposes a series of
systematic optimization strategies for the detection pipeline of a one-stage
detector, forming a single-shot anchor-based detector (SADet) for efficient and
accurate pedestrian detection, which includes three main improvements. Firstly,
we optimize the sample generation process by assigning soft tags to outlier
samples, generating semi-positive samples with continuous tag values between $0$
and $1$; this not only produces more valid samples but also strengthens the
robustness of the model. Secondly, a novel Center-$IoU$ loss is applied as a
new regression loss for bounding-box regression, which retains the
good characteristics of the IoU loss while remedying some of its defects. Thirdly,
we design Cosine-NMS for the post-processing of predicted bounding boxes, and
further propose adaptive anchor matching to let the model adaptively
match anchor boxes to full or visible bounding boxes according to the
degree of occlusion, making the NMS and anchor-matching algorithms more
suitable for occluded pedestrian detection. Though structurally simple, SADet
presents state-of-the-art results and a real-time speed of $20$ FPS for
VGA-resolution images ($640 \times 480$) on the challenging pedestrian detection
benchmarks CityPersons and Caltech and the human detection benchmark
CrowdHuman, leading to a new attractive pedestrian detector.
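The abstract describes the soft-tag idea only at a high level; as a rough illustration, it can be sketched as IoU-based label softening, where borderline anchors become semi-positive instead of being discarded. The thresholds `lo`/`hi` and the linear mapping below are assumptions for illustration, not the paper's actual scheme:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def soft_tag(anchor, gt, lo=0.4, hi=0.5):
    """Continuous tag in [0, 1]: clear positives get 1, clear negatives 0,
    and outlier samples in between become semi-positive with a linearly
    interpolated tag (hypothetical thresholds and mapping)."""
    v = iou(anchor, gt)
    if v >= hi:
        return 1.0
    if v <= lo:
        return 0.0
    return (v - lo) / (hi - lo)
```

Such semi-positive tags would then weight the classification loss, so borderline (e.g. partially occluded) anchors contribute proportionally rather than being thrown away.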
Related papers
- Match and Locate: low-frequency monocular odometry based on deep feature
matching [0.65268245109828]
We introduce a novel approach to robotic odometry that requires only a single camera.
The approach matches image features between consecutive frames of the video stream using deep feature matching models.
We evaluate the approach in the AISG-SLA Visual Localisation Challenge and find that, while being computationally efficient and easy to implement, our method shows competitive results.
arXiv Detail & Related papers (2023-11-16T17:32:58Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection [48.66703222700795]
We resort to a novel kernel strategy to identify the most informative point clouds to acquire labels.
To accommodate both one-stage (i.e., SECOND) and two-stage detectors, we incorporate the classification entropy tangent and strike a good trade-off between detection performance and the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
arXiv Detail & Related papers (2023-07-16T04:27:03Z)
- Efficient Decoder-free Object Detection with Transformers [75.00499377197475]
Vision transformers (ViTs) are changing the landscape of object detection approaches.
We propose a decoder-free fully transformer-based (DFFT) object detector.
DFFT_SMALL achieves high efficiency in both training and inference stages.
arXiv Detail & Related papers (2022-06-14T13:22:19Z)
- The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, called KFIoU, is easier to implement and works better than the exact SkewIoU.
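Under the Gaussian modeling stated in this summary, a minimal NumPy sketch of a KFIoU-style overlap might look as follows. The $w^2/4$, $h^2/4$ covariance scaling and the $\sqrt{\det}$ "volume" are common conventions assumed here for illustration, not a faithful reimplementation of the paper:

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h, theta):
    # Mean = box centre; covariance = R diag(w^2/4, h^2/4) R^T,
    # a common Gaussian-modeling convention for rotated boxes.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    S = np.diag([w * w / 4.0, h * h / 4.0])
    return np.array([cx, cy]), R @ S @ R.T

def kfiou(box1, box2):
    # The product of the two Gaussians is again Gaussian, with covariance
    # (S1^-1 + S2^-1)^-1; "volumes" are compared via sqrt(det), giving a
    # trend-level, differentiable analogue of IoU (centres assumed aligned
    # by a separate centre-distance term).
    _, s1 = box_to_gaussian(*box1)
    _, s2 = box_to_gaussian(*box2)
    s_o = np.linalg.inv(np.linalg.inv(s1) + np.linalg.inv(s2))
    v1 = np.sqrt(np.linalg.det(s1))
    v2 = np.sqrt(np.linalg.det(s2))
    vo = np.sqrt(np.linalg.det(s_o))
    return vo / (v1 + v2 - vo)
```

Note that for two identical boxes this quantity saturates at 1/3 rather than 1, which is why it is described as achieving trend-level, rather than exact, alignment with SkewIoU.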
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
- Sample and Computation Redistribution for Efficient Face Detection [137.19388513633484]
Training data sampling and computation distribution strategies are the keys to efficient and accurate face detection.
SCRFD-34GF outperforms the best competitor, TinaFace, by $3.86\%$ (AP at hard set) while being more than $3\times$ faster on GPUs with VGA-resolution images.
arXiv Detail & Related papers (2021-05-10T23:51:14Z)
- Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved great speed-accuracy trade-off.
We propose a sort of lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion.
Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.