Improving Post-Training Quantization on Object Detection with Task
Loss-Guided Lp Metric
- URL: http://arxiv.org/abs/2304.09785v3
- Date: Sun, 7 May 2023 16:16:13 GMT
- Title: Improving Post-Training Quantization on Object Detection with Task
Loss-Guided Lp Metric
- Authors: Lin Niu, Jiawei Liu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu
Liu
- Abstract summary: Post-Training Quantization (PTQ) transforms a full-precision model into low bit-width directly.
PTQ suffers severe accuracy drop when applied to complex tasks such as object detection.
DetPTQ employs the ODOL-based adaptive Lp metric to select the optimal quantization parameters.
- Score: 43.81334288840746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient inference for object detection networks is a major challenge on
edge devices. Post-Training Quantization (PTQ), which transforms a
full-precision model into low bit-width directly, is an effective and
convenient approach to reduce model inference complexity. But it suffers severe
accuracy drop when applied to complex tasks such as object detection. PTQ
optimizes the quantization parameters by different metrics to minimize the
perturbation of quantization. The p-norm distance of feature maps before and
after quantization, Lp, is widely used as the metric to evaluate perturbation.
Owing to the particular structure of object detection networks, we observe that
the parameter p in the Lp metric significantly influences quantization
performance, and that a fixed hyper-parameter p does not achieve the optimal
result. To mitigate this problem, we propose a framework,
DetPTQ, to assign different p values for quantizing different layers using an
Object Detection Output Loss (ODOL), which represents the task loss of object
detection. DetPTQ employs the ODOL-based adaptive Lp metric to select the
optimal quantization parameters. Experiments show that our DetPTQ outperforms
the state-of-the-art PTQ methods by a significant margin on both 2D and 3D
object detectors. For example, we achieve
31.1/31.7(quantization/full-precision) mAP on RetinaNet-ResNet18 with 4-bit
weight and 4-bit activation.
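The abstract describes the core PTQ loop: quantization parameters (e.g. the scale) are chosen to minimize the Lp distance between feature maps before and after quantization, and the choice of p changes which scale wins. The sketch below illustrates that idea with a simple grid search over scales; the function names, the search range, and the use of a plain grid are illustrative assumptions, not the paper's actual DetPTQ implementation (which further selects p per layer via the task loss).

```python
import numpy as np

def quantize(x, scale, bits=4):
    # Uniform symmetric quantization: round to integers, clip to the bit range.
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def lp_distance(a, b, p):
    # Mean p-norm perturbation between the original and quantized feature maps.
    return np.mean(np.abs(a - b) ** p)

def search_scale(fmap, p, bits=4, num_candidates=100):
    # Grid-search the quantization scale that minimizes the Lp metric.
    max_abs = np.abs(fmap).max()
    best_scale, best_err = None, np.inf
    for ratio in np.linspace(0.5, 1.0, num_candidates):
        scale = ratio * max_abs / (2 ** (bits - 1) - 1)
        err = lp_distance(fmap, quantize(fmap, scale, bits), p)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale

rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 16))
# Different p values can prefer different scales, which is why a fixed p
# is suboptimal across layers.
print(search_scale(fmap, p=2.0), search_scale(fmap, p=4.0))
```

Running the search with several p values on the same feature map shows how the metric, not just the data, determines the selected quantization parameters.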
Related papers
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x on the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection [35.35457515189062]
Post-Training Quantization (PTQ) has been widely adopted in 2D vision tasks.
LiDAR-PTQ can achieve state-of-the-art quantization performance when applied to CenterPoint.
LiDAR-PTQ is cost-effective, being $30\times$ faster than the quantization-aware training method.
arXiv Detail & Related papers (2024-01-29T03:35:55Z)
- On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework, performing hardware-aware mixed-precision quantization on deployed edge devices.
For efficiency metrics, we built an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator.
For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
- ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection [55.291579862817656]
Existing oriented object detection methods commonly use metric AP$_50$ to measure the performance of the model.
We argue that AP$_50$ is inherently unsuitable for oriented object detection due to its large tolerance in angle deviation.
We propose an Aspect Ratio Sensitive Oriented Object Detector with Transformer, termed ARS-DETR, which exhibits a competitive performance.
arXiv Detail & Related papers (2023-03-09T02:20:56Z)
- PD-Quant: Post-Training Quantization based on Prediction Difference Metric [43.81334288840746]
Post-training quantization (PTQ) is a neural network compression technique that converts a full-precision model into a quantized model using lower-precision data types.
Determining the appropriate quantization parameters is the main open problem.
PD-Quant is a method that addresses this limitation by considering global information.
arXiv Detail & Related papers (2022-12-14T05:48:58Z)
- SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
- n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization [0.0]
Powers-of-two (PoT) quantization reduces the number of bit operations of deep neural networks on resource-constrained hardware.
PoT quantization triggers a severe accuracy drop because of its limited representation ability.
We propose an efficient PoT quantization scheme that balances accuracy and costs in a memory-efficient way.
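The n-hot summary above rests on powers-of-two (PoT) quantization, where each weight is constrained to a signed power of two so that multiplications reduce to bit shifts. A minimal PoT rounding sketch follows; it rounds in the log2 domain and collapses underflowing values to zero, and is an illustrative assumption rather than the paper's n-hot scheme.

```python
import numpy as np

def pot_quantize(w, bits=4):
    # Map each weight to the nearest signed power of two (nearest in log2).
    # With b bits, exponents span a small range ending at the largest
    # magnitude; values far below that range collapse to zero.
    max_exp = np.ceil(np.log2(np.abs(w).max()))
    n_levels = 2 ** (bits - 1) - 1
    min_exp = max_exp - n_levels + 1
    exps = np.round(np.log2(np.maximum(np.abs(w), 2.0 ** (min_exp - 1))))
    exps = np.clip(exps, min_exp, max_exp)
    q = np.sign(w) * 2.0 ** exps
    q[np.abs(w) < 2.0 ** (min_exp - 1)] = 0.0  # underflow -> zero code
    return q

w = np.array([0.9, -0.31, 0.07, 0.001])
print(pot_quantize(w))
```

The coarse spacing of powers of two near the top of the range is exactly the limited representation ability that causes the accuracy drop mentioned above.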
arXiv Detail & Related papers (2021-03-22T10:13:12Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Optimisation of the PointPillars network for 3D object detection in point clouds [1.1470070927586016]
In this paper we present our research on the optimisation of a deep neural network for 3D object detection in a point cloud.
We performed the experiments for the PointPillars network, which offers a reasonable compromise between detection accuracy and calculation complexity.
This will allow for real-time LiDAR data processing with low energy consumption.
arXiv Detail & Related papers (2020-07-01T13:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.