Tech Report: One-stage Lightweight Object Detectors
- URL: http://arxiv.org/abs/2210.17151v1
- Date: Mon, 31 Oct 2022 09:02:37 GMT
- Title: Tech Report: One-stage Lightweight Object Detectors
- Authors: Deokki Hong
- Abstract summary: This work designs one-stage lightweight detectors that perform well in terms of mAP and latency.
Starting from two baseline models, one targeting GPUs and the other CPUs, various operations are substituted for the main operations in the baselines' backbone networks.
- Score: 0.38073142980733
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work designs one-stage lightweight detectors that perform well
in terms of mAP and latency. Starting from two baseline models, one targeting
GPUs and the other CPUs, various operations are substituted for the main
operations in the baselines' backbone networks. In addition to experiments on
backbone networks and operations, several feature pyramid network (FPN)
architectures are investigated. Benchmarks and the proposed detectors are
analyzed in terms of the number of parameters, GFLOPs, GPU latency, CPU latency,
and mAP on the MS COCO dataset, a standard benchmark in object detection. This
work proposes similar or better network architectures considering the trade-off
between accuracy and latency. For example, the proposed GPU-target backbone
network outperforms that of YOLOX-tiny, selected as the benchmark, by 1.43x in
speed and 0.5 mAP in accuracy on an NVIDIA GeForce RTX 2080 Ti GPU.
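
To make the latency and parameter analysis concrete, here is a minimal PyTorch sketch that counts a backbone's parameters and times its forward pass on CPU and GPU. The torchvision ResNet-18 is only a stand-in; the paper's actual backbones and measurement protocol are not reproduced here.

```python
import time
import torch
import torchvision

def measure_latency(model, device, runs=50, warmup=10):
    """Average forward-pass latency (ms) for a single 640x640 input."""
    model = model.to(device).eval()
    x = torch.randn(1, 3, 640, 640, device=device)
    with torch.no_grad():
        for _ in range(warmup):            # warm up kernels and caches
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1e3

backbone = torchvision.models.resnet18()   # placeholder backbone
params_m = sum(p.numel() for p in backbone.parameters()) / 1e6
print(f"params: {params_m:.2f}M")
print(f"CPU latency: {measure_latency(backbone, 'cpu'):.1f} ms")
if torch.cuda.is_available():
    print(f"GPU latency: {measure_latency(backbone, 'cuda'):.1f} ms")
```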
Related papers
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that unlike MAPLE which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
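
The general recipe behind such predictors, regressing measured latency onto a small vector of device descriptors such as CPU performance counters, can be sketched as below. The counter names, data, and regressor are hypothetical placeholders, not MAPLE-Edge's actual feature set or model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature set: a few CPU performance counters collected while
# running a reference workload on the target device (placeholder values).
COUNTERS = ["instructions", "cache_misses", "branch_misses", "cpu_cycles"]

rng = np.random.default_rng(0)
X = rng.random((200, len(COUNTERS)))    # fake counter readings per sample
y = X @ np.array([3.0, 8.0, 5.0, 2.0]) + rng.normal(0, 0.1, 200)  # fake latency (ms)

predictor = RandomForestRegressor(n_estimators=100, random_state=0)
predictor.fit(X[:150], y[:150])
print("held-out MAE (ms):", np.abs(predictor.predict(X[150:]) - y[150:]).mean())
```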
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
- EAutoDet: Efficient Architecture Search for Object Detection [110.99532343155073]
The EAutoDet framework can discover practical backbone and FPN architectures for object detection in 1.4 GPU-days.
We propose a kernel reusing technique by sharing the weights of candidate operations on one edge and consolidating them into one convolution.
In particular, the discovered architectures surpass state-of-the-art object detection NAS methods and achieve 40.1 mAP with 120 FPS and 49.2 mAP with 41.3 FPS on COCO test-dev set.
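
A minimal sketch of the weight-sharing idea described above, in which a 3x3 and a 5x5 candidate on one edge reuse a single 5x5 weight tensor and are consolidated into one convolution at forward time. This is an illustrative reading, not EAutoDet's exact formulation.

```python
import torch
import torch.nn.functional as F

class SharedConvEdge(torch.nn.Module):
    """3x3 and 5x5 candidates share one 5x5 weight; at forward time they are
    merged into a single convolution weighted by architecture parameters."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(c_out, c_in, 5, 5) * 0.01)
        self.alpha = torch.nn.Parameter(torch.zeros(2))   # arch params for {3x3, 5x5}

    def forward(self, x):
        w5 = self.weight
        w3 = F.pad(w5[:, :, 1:4, 1:4], (1, 1, 1, 1))      # 3x3 candidate reuses the center
        a = torch.softmax(self.alpha, dim=0)
        merged = a[0] * w3 + a[1] * w5                    # consolidate into one kernel
        return F.conv2d(x, merged, padding=2)             # single conv instead of two

edge = SharedConvEdge(16, 32)
print(edge(torch.randn(1, 16, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```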
arXiv Detail & Related papers (2022-03-21T05:56:12Z)
- Revisiting Efficient Object Detection Backbones from Zero-Shot Neural Architecture Search [34.88658308647129]
In object detection models, the detection backbone consumes more than half of the overall inference cost.
We propose a novel zero-shot NAS method to address this issue.
The proposed method, named ZenDet, automatically designs efficient detection backbones without training network parameters.
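
A minimal sketch of a zero-shot proxy in the same spirit: an untrained, randomly initialized backbone is scored by how strongly its output responds to small input perturbations, so candidate backbones can be ranked without any training. This simplified proxy is a stand-in, not ZenDet's actual scoring function.

```python
import torch
import torchvision

@torch.no_grad()
def zero_shot_score(model, reps=4, eps=1e-2):
    """Log expressivity proxy: output change per unit of input perturbation."""
    model.eval()
    scores = []
    for _ in range(reps):
        x = torch.randn(2, 3, 224, 224)
        delta = eps * torch.randn_like(x)
        diff = model(x + delta) - model(x)
        scores.append(torch.log(diff.norm() / delta.norm()))
    return torch.stack(scores).mean().item()

# Rank two untrained candidate backbones without any training.
for name in ["resnet18", "resnet50"]:
    net = getattr(torchvision.models, name)(weights=None)
    print(name, zero_shot_score(net))
```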
arXiv Detail & Related papers (2021-11-26T07:18:52Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
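
A plain-Python sketch of fan-out-limited neighborhood sampling for a mini-batch of seed nodes; real samplers, including the performance-engineered one described above, are considerably more elaborate.

```python
import random

def sample_neighborhood(adj, seeds, fanouts):
    """For each GNN layer, keep at most `fanout` random neighbors per node.
    `adj` maps node -> list of neighbors; returns one frontier per layer."""
    frontiers = [list(seeds)]
    current = set(seeds)
    for fanout in fanouts:                       # one fanout per GNN layer
        nxt = set()
        for node in current:
            neigh = adj.get(node, [])
            k = min(fanout, len(neigh))
            nxt.update(random.sample(neigh, k))  # subsample instead of taking all
        frontiers.append(sorted(nxt))
        current = nxt
    return frontiers

adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 5], 3: [0], 4: [0, 5], 5: [2, 4]}
print(sample_neighborhood(adj, seeds=[0], fanouts=[2, 2]))  # 2-layer mini-batch
```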
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- Pixel Difference Networks for Efficient Edge Detection [71.03915957914532]
We propose a lightweight yet effective architecture named Pixel Difference Network (PiDiNet) for efficient edge detection.
Extensive experiments on BSDS500, NYUD, and Multicue datasets are provided to demonstrate its effectiveness.
A faster version of PiDiNet with fewer than 0.1M parameters still achieves performance comparable to the state of the art at 200 FPS.
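
One reading of the operation behind PiDiNet is a central pixel-difference convolution, where each weight multiplies the difference between a neighbor and the center pixel; this reduces to a vanilla convolution minus a 1x1 convolution whose kernel is the spatial sum of the weights, as the sketch below shows.

```python
import torch
import torch.nn.functional as F

def central_pixel_difference_conv(x, weight, padding=1):
    """y = sum_i w_i * (x_i - x_center): a vanilla conv minus a 1x1 conv whose
    kernel is the spatial sum of the weights."""
    vanilla = F.conv2d(x, weight, padding=padding)
    center = F.conv2d(x, weight.sum(dim=(2, 3), keepdim=True))
    return vanilla - center

x = torch.randn(1, 8, 32, 32)
w = torch.randn(16, 8, 3, 3) * 0.1
print(central_pixel_difference_conv(x, w).shape)  # torch.Size([1, 16, 32, 32])
```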
arXiv Detail & Related papers (2021-08-16T10:42:59Z)
- EEEA-Net: An Early Exit Evolutionary Neural Architecture Search [6.569256728493014]
The goal is to search for Convolutional Neural Network (CNN) architectures suitable for an on-device processor with limited computing resources.
A new algorithm, Early Exit Population Initialisation (EE-PI), is developed for the Evolutionary Algorithm (EA).
EEEA-Net achieves the lowest error rate among state-of-the-art NAS models, with 2.46% on CIFAR-10, 15.02% on CIFAR-100, and 23.8% on ImageNet.
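
A minimal sketch of one reading of the early-exit idea: during population initialisation the evolutionary algorithm only admits candidate architectures whose parameter count stays under a budget, so no evaluations are spent on oversized models. The architecture encoding, parameter estimate, and threshold below are illustrative placeholders.

```python
import random

def random_architecture():
    """Hypothetical encoding: number of cells and channel width."""
    return {"cells": random.randint(4, 20), "width": random.choice([16, 32, 64])}

def param_count_m(arch):
    # Crude stand-in for a real parameter counter (millions of parameters).
    return arch["cells"] * arch["width"] ** 2 / 1e4

def early_exit_init(pop_size, max_params_m=3.0):
    """Keep sampling until the population is full of under-budget candidates."""
    population = []
    while len(population) < pop_size:
        arch = random_architecture()
        if param_count_m(arch) <= max_params_m:   # early exit: reject oversized models
            population.append(arch)
    return population

print(early_exit_init(pop_size=5))
```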
arXiv Detail & Related papers (2021-08-13T10:23:19Z)
- Oriented R-CNN for Object Detection [61.78746189807462]
This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN.
In the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner.
The second stage is an oriented R-CNN head that refines oriented Regions of Interest (oriented RoIs) and recognizes them.
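
To make "oriented proposals" concrete, the sketch below converts a rotated box given as (cx, cy, w, h, angle) into its four corner points. This is generic geometry for illustration; the paper itself uses a different (midpoint-offset) parameterization of oriented boxes.

```python
import math

def oriented_box_corners(cx, cy, w, h, theta):
    """Corners of a box of size (w, h) centered at (cx, cy), rotated by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]:
        corners.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return corners

print(oriented_box_corners(100.0, 50.0, 40.0, 20.0, math.radians(30)))
```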
arXiv Detail & Related papers (2021-08-12T12:47:43Z)
- Single Object Tracking through a Fast and Effective Single-Multiple Model Convolutional Neural Network [0.0]
Recent state-of-the-art (SOTA) approaches rely on a matching network with a heavy structure to distinguish the target from other objects in the area.
In this article, a special architecture is proposed which, in contrast to previous approaches, makes it possible to identify the object location in a single shot.
The presented tracker performs comparably with the SOTA in challenging situations while running far faster (up to 120 FPS on a 1080 Ti GPU).
arXiv Detail & Related papers (2021-03-28T11:02:14Z)
- LETI: Latency Estimation Tool and Investigation of Neural Networks inference on Mobile GPU [0.0]
In this work, we consider latency approximation on mobile GPU as a data and hardware-specific problem.
We build open-source tools which provide a convenient way to conduct massive experiments on different target devices.
We experimentally demonstrate the applicability of such an approach on a subset of the popular NAS-Bench-101 dataset.
arXiv Detail & Related papers (2020-10-06T16:51:35Z)
- MobileDets: Searching for Object Detection Architectures for Mobile Accelerators [61.30355783955777]
Inverted bottleneck layers have been the predominant building blocks in state-of-the-art object detection models on mobile devices.
This work finds that regular convolutions are a potent component for boosting the latency-accuracy trade-off of object detection on accelerators.
We obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.
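
A minimal sketch contrasting the two building blocks mentioned above by parameter count: a regular 3x3 convolution versus a generic inverted bottleneck (1x1 expand, 3x3 depthwise, 1x1 project). These are textbook blocks, not the exact MobileDets search space, and parameter count is only one axis of the latency-accuracy trade-off discussed in the paper.

```python
import torch.nn as nn

def params(m):
    return sum(p.numel() for p in m.parameters())

c = 64  # channels in = channels out

regular = nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False)

expansion = 4
inverted_bottleneck = nn.Sequential(
    nn.Conv2d(c, c * expansion, 1, bias=False),                  # expand
    nn.Conv2d(c * expansion, c * expansion, 3, padding=1,
              groups=c * expansion, bias=False),                 # depthwise
    nn.Conv2d(c * expansion, c, 1, bias=False),                  # project
)

print("regular 3x3 conv params:   ", params(regular))              # 36,864
print("inverted bottleneck params:", params(inverted_bottleneck))  # 35,072
```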
arXiv Detail & Related papers (2020-04-30T00:21:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.