PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data
- URL: http://arxiv.org/abs/2407.08272v1
- Date: Thu, 11 Jul 2024 08:17:35 GMT
- Title: PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data
- Authors: Dominika Przewlocka-Rus, Tomasz Kryjak, Marek Gorgon
- Abstract summary: PowerYOLO is a mixed precision solution to the problem of fitting algorithms of high memory and computational complexity into small low-power devices.
First, we propose a system based on a Dynamic Vision Sensor (DVS), a novel sensor, that offers low power requirements.
Second, to ensure high accuracy and low memory and computational complexity, we propose to use 4-bit width Powers-of-Two (PoT) quantisation.
Third, we replace multiplication with bit-shifting to increase the efficiency of hardware acceleration of such a solution.
- Score: 0.5461938536945721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The performance of object detection systems in automotive solutions must be as high as possible, with minimal response time and, due to the often battery-powered operation, low energy consumption. When designing such solutions, we therefore face challenges typical for embedded vision systems: the problem of fitting algorithms of high memory and computational complexity into small low-power devices. In this paper we propose PowerYOLO - a mixed precision solution, which targets three essential elements of such an application. First, we propose a system based on a Dynamic Vision Sensor (DVS), a novel sensor that offers low power requirements and operates well in conditions with variable illumination. It is these features that may make event cameras a preferable choice over frame cameras in some applications. Second, to ensure high accuracy and low memory and computational complexity, we propose to use 4-bit width Powers-of-Two (PoT) quantisation for convolution weights of the YOLO detector, with all other parameters quantised linearly. Finally, we exploit the PoT scheme and replace multiplication with bit-shifting to increase the efficiency of hardware acceleration of such a solution, with a special convolution-batch normalisation fusion scheme. The use of a specific sensor with PoT quantisation and special batch normalisation fusion leads to a unique system with almost 8x reduction in memory complexity and vast computational simplifications, relative to a standard approach. This efficient system achieves high accuracy of mAP 0.301 on the GEN1 DVS dataset, marking the new state-of-the-art for such a compressed model.
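The Powers-of-Two idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the encoding (one sign bit plus a 3-bit shift amount), the assumption that weights lie in [-1, 1], and the names `pot_quantize`, `shift_multiply`, `pot_dot` and `MAX_SHIFT` are all hypothetical choices made for illustration. It shows why a weight of the form sign * 2^(-shift) lets an integer activation be "multiplied" with a bit shift, and why 4-bit weights take 32 / 4 = 8 times less storage than 32-bit floats (the abstract reports "almost 8x" for the whole model, where the other parameters are quantised linearly).

```python
import math

# Assumed encoding (hypothetical): 1 sign bit + 3 bits of shift amount = 4 bits.
MAX_SHIFT = 7


def pot_quantize(w):
    """Approximate w (assumed in [-1, 1]) as sign * 2**(-shift).

    Zero is returned as sign 0 for simplicity; a real 4-bit code
    would have to reserve one code word for it.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    # Round in the log domain, then clamp to the representable shifts.
    shift = round(-math.log2(abs(w)))
    shift = min(max(shift, 0), MAX_SHIFT)
    return sign, shift


def shift_multiply(x, sign, shift, frac_bits=MAX_SHIFT):
    """Multiply integer activation x by sign * 2**(-shift) using only a shift.

    The result is a fixed-point integer with frac_bits fractional bits,
    so no precision is lost by shifting right.
    """
    return sign * (x << (frac_bits - shift))


def pot_dot(activations, weights):
    """Dot product with PoT weights: shifts and additions, no multiplier."""
    acc = 0
    for x, w in zip(activations, weights):
        sign, shift = pot_quantize(w)
        acc += shift_multiply(x, sign, shift)
    return acc


# Weights 0.5, -0.125, 0.3 are quantised to 0.5, -0.125, 0.25.
acc = pot_dot([12, 8, 4], [0.5, -0.125, 0.3])
print(acc / 2**MAX_SHIFT)  # 6.0, i.e. 12*0.5 - 8*0.125 + 4*0.25
```

In hardware, the per-weight shift replaces a full multiplier, which is the computational simplification the abstract refers to; the fixed-point accumulator here is only one of several possible ways to keep fractional results exact.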
Related papers
- PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices [14.671301859745453]
The existing SOTA approach, NeurOLight, struggles with predicting high-fidelity fields for real-world complicated photonic devices.
We propose a novel cross-axis factorized PACE operator with a strong long-distance modeling capacity.
Inspired by human learning, we divide the simulation task for extremely hard cases into two progressively easier tasks.
arXiv Detail & Related papers (2024-11-05T22:03:14Z)
- SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation [74.07836010698801]
We propose an SMPL-based Transformer framework (SMPLer) to address this issue.
SMPLer incorporates two key ingredients: a decoupled attention operation and an SMPL-based target representation.
Extensive experiments demonstrate the effectiveness of SMPLer against existing 3D human shape and pose estimation methods.
arXiv Detail & Related papers (2024-04-23T17:59:59Z)
- LEMDA: A Novel Feature Engineering Method for Intrusion Detection in IoT Systems [3.5323691899538137]
Intrusion detection systems (IDS) for the Internet of Things (IoT) systems can use AI-based models to ensure secure communications.
Complex models have notorious problems such as overfitting, low interpretability, and high computational complexity.
This paper proposes a new feature engineering method called LEMDA (Light feature Engineering based on the Mean Decrease in Accuracy).
arXiv Detail & Related papers (2024-04-20T11:11:47Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Match and Locate: low-frequency monocular odometry based on deep feature matching [0.65268245109828]
We introduce a novel approach to robotic odometry which requires only a single camera.
The approach is based on matching image features between consecutive frames of the video stream using deep feature matching models.
We evaluate the performance of the approach in the AISG-SLA Visual Localisation Challenge and find that, while being computationally efficient and easy to implement, our method shows competitive results.
arXiv Detail & Related papers (2023-11-16T17:32:58Z)
- M3ICRO: Machine Learning-Enabled Compact Photonic Tensor Core based on PRogrammable Multi-Operand Multimode Interference [18.0155410476884]
Photonic tensor core (PTC) designs based on standard optical components hinder scalability and compute density due to their large spatial footprint.
We propose an ultra-compact PTC using customized programmable multi-operand multimode interference (MOMMI) devices, named M3ICRO.
M3ICRO achieves a 3.4-9.6x smaller footprint, 1.6-4.4x higher speed, 10.6-42x higher compute density, 3.7-12x higher system throughput, and superior noise robustness.
arXiv Detail & Related papers (2023-05-31T02:34:36Z)
- Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
- Improved Transformer for High-Resolution GANs [69.42469272015481]
We introduce two key ingredients to Transformer to address this challenge.
We show in the experiments that the proposed HiT achieves state-of-the-art FID scores of 31.87 and 2.95 on unconditional ImageNet $128 \times 128$ and FFHQ $256 \times 256$, respectively.
arXiv Detail & Related papers (2021-06-14T17:39:49Z)
- Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme to multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
Our FQSR using low-bit quantization can achieve performance on par with the full-precision counterparts on five benchmark datasets.
arXiv Detail & Related papers (2020-11-29T03:53:49Z)
- MIMC-VINS: A Versatile and Resilient Multi-IMU Multi-Camera Visual-Inertial Navigation System [44.76768683036822]
We propose a real-time consistent multi-IMU multi-camera (MIMC)-VINS estimator for visual-inertial navigation systems.
Within an efficient multi-state constraint filter, the proposed MIMC-VINS algorithm optimally fuses asynchronous measurements from all sensors.
The proposed MIMC-VINS is validated in both Monte-Carlo simulations and real-world experiments.
arXiv Detail & Related papers (2020-06-28T20:16:08Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.