HW-SW Optimization of DNNs for Privacy-preserving People Counting on
Low-resolution Infrared Arrays
- URL: http://arxiv.org/abs/2402.01226v1
- Date: Fri, 2 Feb 2024 08:45:38 GMT
- Title: HW-SW Optimization of DNNs for Privacy-preserving People Counting on
Low-resolution Infrared Arrays
- Authors: Matteo Risso, Chen Xie, Francesco Daghero, Alessio Burrello,
Seyedmorteza Mollaei, Marco Castellano, Enrico Macii, Massimo Poncino,
Daniele Jahier Pagliari
- Abstract summary: Low-resolution infrared (IR) array sensors enable people counting applications such as monitoring the occupancy of spaces and people flows.
Deep Neural Networks (DNNs) have been shown to be well-suited to process these sensor data in an accurate and efficient manner.
We propose a highly automated full-stack optimization flow for DNNs that spans neural architecture search, mixed-precision quantization, and post-processing.
- Score: 9.806742394395322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-resolution infrared (IR) array sensors enable people counting
applications such as monitoring the occupancy of spaces and people flows while
preserving privacy and minimizing energy consumption. Deep Neural Networks
(DNNs) have been shown to be well-suited to process these sensor data in an
accurate and efficient manner. Nevertheless, the space of DNNs' architectures
is huge and its manual exploration is burdensome and often leads to sub-optimal
solutions. To overcome this problem, in this work, we propose a highly
automated full-stack optimization flow for DNNs that goes from neural
architecture search, mixed-precision quantization, and post-processing, down to
the realization of a new smart sensor prototype, including a Microcontroller
with a customized instruction set. Integrating these cross-layer optimizations,
we obtain a large set of Pareto-optimal solutions in the 3D-space of energy,
memory, and accuracy. Deploying such solutions on our hardware platform, we
improve the state-of-the-art achieving up to 4.2x model size reduction, 23.8x
code size reduction, and 15.38x energy reduction at iso-accuracy.
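The abstract's notion of Pareto-optimal solutions in the 3D space of energy, memory, and accuracy can be illustrated with a minimal sketch. The `Candidate` type, the helper names, and all numeric values below are hypothetical and are not taken from the paper; the code only shows how a non-dominated set might be filtered when energy and memory are minimized and accuracy is maximized.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    # Hypothetical description of one DNN configuration (not from the paper).
    name: str
    energy_uj: float   # energy per inference, lower is better
    memory_kb: float   # model size, lower is better
    accuracy: float    # task accuracy, higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on every axis and better on at least one."""
    no_worse = (
        a.energy_uj <= b.energy_uj
        and a.memory_kb <= b.memory_kb
        and a.accuracy >= b.accuracy
    )
    strictly_better = (
        a.energy_uj < b.energy_uj
        or a.memory_kb < b.memory_kb
        or a.accuracy > b.accuracy
    )
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Illustrative values only.
candidates = [
    Candidate("A", 10.0, 40.0, 0.90),
    Candidate("B", 12.0, 45.0, 0.89),  # worse than A on all three axes
    Candidate("C", 5.0, 60.0, 0.85),
    Candidate("D", 20.0, 20.0, 0.92),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

This quadratic-time filter is adequate for a few hundred candidates; a larger search space would likely call for a sorted or incremental approach.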
Related papers
- LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient.
We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z)
- PLiNIO: A User-Friendly Library of Gradient-based Methods for
Complexity-aware DNN Optimization [3.460496851517031]
PLiNIO is an open-source library implementing a comprehensive set of state-of-the-art DNN design automation techniques.
We show that PLiNIO achieves up to 94.34% memory reduction for a 1% accuracy drop compared to a baseline architecture.
arXiv Detail & Related papers (2023-07-18T07:11:14Z)
- Speck: A Smart event-based Vision Sensor with a low latency 327K Neuron Convolutional Neuronal Network Processing Pipeline [5.8859061623552975]
We present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip.
By combining both sensor and processing on a single die, we can lower unit production costs significantly.
We present the asynchronous architecture, the individual blocks, and the sCNN processing principle and benchmark against other sCNN capable processors.
arXiv Detail & Related papers (2023-04-13T19:28:57Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural
Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- Energy-efficient and Privacy-aware Social Distance Monitoring with
Low-resolution Infrared Sensors and Adaptive Inference [4.158182639870093]
Low-resolution infrared (IR) sensors can be leveraged to implement privacy-preserving social distance monitoring solutions in indoor spaces.
We propose an energy-efficient adaptive inference solution consisting of a cascade of a simple wake-up trigger and an 8-bit quantized Convolutional Neural Network (CNN).
We show that, when processing the output of an 8x8 low-resolution IR sensor, we are able to reduce the energy consumption by 37-57% with respect to a static CNN-based approach.
arXiv Detail & Related papers (2022-04-22T07:07:38Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Automated Design Space Exploration for optimised Deployment of DNN on
Arm Cortex-A CPUs [13.628734116014819]
Deep learning on embedded devices has prompted the development of numerous methods to optimise the deployment of deep neural networks (DNNs).
There is a lack of research on cross-level optimisation as the space of approaches becomes too large to test and obtain a globally optimised solution.
We present a set of results for state-of-the-art DNNs on a range of Arm Cortex-A CPU platforms achieving up to 4x improvement in performance and over 2x reduction in memory.
arXiv Detail & Related papers (2020-06-09T11:00:06Z)
- A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration
Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with
Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.