Q-YOLO: Efficient Inference for Real-time Object Detection
- URL: http://arxiv.org/abs/2307.04816v1
- Date: Sat, 1 Jul 2023 03:50:32 GMT
- Title: Q-YOLO: Efficient Inference for Real-time Object Detection
- Authors: Mingze Wang, Huixin Sun, Jun Shi, Xuhui Liu, Baochang Zhang, Xianbin Cao
- Abstract summary: Real-time object detection plays a vital role in various computer vision applications.
Deploying real-time object detectors on resource-constrained platforms poses challenges due to high computational and memory requirements.
This paper describes a low-bit quantization method to build a highly efficient one-stage detector, dubbed as Q-YOLO.
- Score: 29.51643492051404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time object detection plays a vital role in various computer vision
applications. However, deploying real-time object detectors on
resource-constrained platforms poses challenges due to high computational and
memory requirements. This paper describes a low-bit quantization method to
build a highly efficient one-stage detector, dubbed as Q-YOLO, which can
effectively address the performance degradation problem caused by activation
distribution imbalance in traditional quantized YOLO models. Q-YOLO introduces
a fully end-to-end Post-Training Quantization (PTQ) pipeline with a
well-designed Unilateral Histogram-based (UH) activation quantization scheme,
which determines the maximum truncation values through histogram analysis by
minimizing the Mean Squared Error (MSE) quantization errors. Extensive
experiments on the COCO dataset demonstrate the effectiveness of Q-YOLO,
outperforming other PTQ methods while achieving a more favorable balance
between accuracy and computational cost. This research contributes to advancing
the efficient deployment of object detection models on resource-limited edge
devices, enabling real-time detection with reduced computational and memory
overhead.
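The abstract describes choosing a maximum truncation value from an activation histogram by minimizing the MSE quantization error. A minimal sketch of that idea follows, in plain Python. It is not the authors' implementation: the function names `quantize` and `search_max_truncation`, the bin count, the candidate grid, and the choice to fix the lower bound at the observed minimum are all assumptions made for illustration.

```python
def quantize(x, lo, hi, bits=8):
    """Uniform fake-quantization of x onto 2**bits levels spanning [lo, hi]."""
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    clipped = min(max(x, lo), hi)          # values beyond hi are truncated
    return round((clipped - lo) / scale) * scale + lo


def search_max_truncation(samples, bits=8, num_bins=2048, num_candidates=100):
    """Pick the upper clipping value that minimizes histogram-weighted MSE.

    Assumption: the lower bound stays at the observed minimum and only the
    upper bound is searched (one-sided search).
    """
    lo, top = min(samples), max(samples)
    if top <= lo:
        return top                          # degenerate case: constant input
    width = (top - lo) / num_bins

    # Build the activation histogram once from calibration samples.
    counts = [0] * num_bins
    for x in samples:
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(num_bins)]

    best_hi, best_err = top, float("inf")
    for k in range(1, num_candidates + 1):
        hi = lo + (top - lo) * k / num_candidates
        # Squared error of each bin center, weighted by its count.
        err = sum(
            c * (v - quantize(v, lo, hi, bits)) ** 2
            for v, c in zip(centers, counts)
            if c
        )
        if err < best_err:
            best_hi, best_err = hi, err
    return best_hi


# Hypothetical calibration data: a dense bulk in [0, 1) plus one outlier.
activations = [i / 20000 for i in range(20000)] + [10.0]
clip = search_max_truncation(activations, bits=4)
```

With data like this, the search tends to settle on a clipping value well below the observed maximum, because covering a rare outlier costs more precision on the bulk of the activations than it saves.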
Related papers
- RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [95.32315448601241]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE).
RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers.
Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
- HyperDefect-YOLO: Enhance YOLO with HyperGraph Computation for Industrial Defect Detection [12.865603495310328]
HD-YOLO consists of Defect Aware Module (DAM) and Mixed Graph Network (MGNet) in the backbone.
HGANet combines hypergraph and attention mechanism to aggregate multi-scale features.
Cross-Scale Fusion (CSF) is proposed to adaptively fuse and handle features instead of simple concatenation and convolution.
arXiv Detail & Related papers (2024-12-05T08:38:01Z)
- P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving [0.0]
Distracted driving is a critical safety issue that leads to numerous fatalities and injuries worldwide.
This study addresses the need for efficient and real-time machine learning models to detect distracted driving behaviors.
A real-time object detection system is introduced, optimized for both speed and accuracy.
arXiv Detail & Related papers (2024-10-21T02:56:44Z)
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x on the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks [5.036532914308394]
We show that it is difficult to achieve extremely low precision (4-bit and lower) for efficient YOLO models, even with SOTA QAT methods, due to the oscillation issue.
We propose a simple QAT correction method, namely QC, that takes only a single epoch of training after standard QAT procedure to correct the error.
arXiv Detail & Related papers (2023-11-09T02:53:21Z)
- ELUQuant: Event-Level Uncertainty Quantification in Deep Inelastic Scattering [0.0]
We introduce a physics-informed Bayesian Neural Network (BNN) with flow approximated posteriors for detailed uncertainty quantification (UQ) at the physics event-level.
Applied to Deep Inelastic Scattering (DIS) events, our model effectively extracts the kinematic variables $x$, $Q^2$, and $y$.
This detailed description of the underlying uncertainty proves invaluable for decision-making, especially in tasks like event filtering.
arXiv Detail & Related papers (2023-10-04T15:50:05Z)
- Drastic Circuit Depth Reductions with Preserved Adversarial Robustness by Approximate Encoding for Quantum Machine Learning [0.5181797490530444]
We implement methods for the efficient preparation of quantum states representing encoded image data using variational, genetic and matrix product state based algorithms.
Results show that these methods can approximately prepare states to a level suitable for QML using circuits two orders of magnitude shallower than a standard state preparation implementation.
arXiv Detail & Related papers (2023-09-18T01:49:36Z)
- Potential and limitations of quantum extreme learning machines [55.41644538483948]
We present a framework to model QRCs and QELMs, showing that they can be concisely described via single effective measurements.
Our analysis paves the way to a more thorough understanding of the capabilities and limitations of both QELMs and QRCs.
arXiv Detail & Related papers (2022-10-03T09:32:28Z)
- Towards Balanced Learning for Instance Recognition [149.76724446376977]
We propose Libra R-CNN, a framework towards balanced learning for instance recognition.
It integrates IoU-balanced sampling, balanced feature pyramid, and objective re-weighting, respectively for reducing the imbalance at sample, feature, and objective level.
arXiv Detail & Related papers (2021-08-23T13:40:45Z)
- High Dimensional Level Set Estimation with Bayesian Neural Network [58.684954492439424]
This paper proposes novel methods to solve the high dimensional Level Set Estimation problems using Bayesian Neural Networks.
For each problem, we derive the corresponding theoretic information based acquisition function to sample the data points.
Numerical experiments on both synthetic and real-world datasets show that our proposed method can achieve better results compared to existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-17T23:21:53Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)