Fully Quantized Always-on Face Detector Considering Mobile Image Sensors
- URL: http://arxiv.org/abs/2311.01001v1
- Date: Thu, 2 Nov 2023 05:35:49 GMT
- Title: Fully Quantized Always-on Face Detector Considering Mobile Image Sensors
- Authors: Haechang Lee, Wongi Jeong, Dongil Ryu, Hyunwoo Je, Albert No, Kijeong
Kim, Se Young Chun
- Abstract summary: Current face detectors do not fully meet the requirements for "intelligent" CMOS image sensors integrated with embedded DNNs.
In this study, we aim to bridge the gap by exploring extremely low-bit lightweight face detectors.
- Score: 12.806584794505751
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite significant research on lightweight deep neural networks (DNNs)
designed for edge devices, the current face detectors do not fully meet the
requirements for "intelligent" CMOS image sensors (iCISs) integrated with
embedded DNNs. These sensors are essential in various practical applications,
such as energy-efficient mobile phones and surveillance systems with always-on
capabilities. One noteworthy limitation is the absence of suitable face
detectors for the always-on scenario, a crucial aspect of image sensor-level
applications. These detectors must operate directly with sensor RAW data before
the image signal processor (ISP) takes over. This gap poses a significant
challenge in achieving optimal performance in such scenarios. Further research
and development are necessary to bridge this gap and fully leverage the
potential of iCIS applications. In this study, we aim to bridge the gap by
exploring extremely low-bit lightweight face detectors, focusing on the
always-on face detection scenario for mobile image sensor applications. To
achieve this, our proposed model utilizes sensor-aware synthetic RAW inputs,
simulating always-on face detection processed "before" the ISP chain. Our
approach employs ternary (-1, 0, 1) weights for potential implementations in
image sensors, resulting in a relatively simple network architecture with
shallow layers and extremely low-bitwidth. Our method demonstrates reasonable
face detection performance and excellent efficiency in simulation studies,
offering promising possibilities for practical always-on face detectors in
real-world applications.
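The ternary (-1, 0, 1) weight scheme described in the abstract can be sketched with a simple threshold-based quantizer in the style of Ternary Weight Networks. This is an illustrative assumption; the paper's exact quantization procedure may differ, and `delta_scale = 0.7` is the common TWN heuristic, not a value taken from this work.

```python
import numpy as np

def ternarize(w: np.ndarray, delta_scale: float = 0.7):
    """Quantize a float weight tensor to {-1, 0, +1} with a per-tensor
    scale, using the Ternary Weight Networks heuristic:
    threshold delta = delta_scale * mean(|w|)."""
    delta = delta_scale * np.mean(np.abs(w))
    t = np.zeros_like(w)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    # Scale alpha = mean |w| over the non-zero positions, so that
    # alpha * t approximates w in the least-squares sense.
    mask = t != 0
    alpha = float(np.mean(np.abs(w[mask]))) if mask.any() else 0.0
    return t.astype(np.int8), alpha

w = np.array([0.9, -0.05, 0.4, -0.8, 0.02])
t, alpha = ternarize(w)
# t contains only -1, 0, +1; alpha * t approximates w
```

At inference time the multiplications by t reduce to additions, subtractions, and skips, which is what makes such weights attractive for in-sensor implementation.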
Related papers
- MSSIDD: A Benchmark for Multi-Sensor Denoising [55.41612200877861]
We introduce a new benchmark, the Multi-Sensor SIDD dataset, which is the first raw-domain dataset designed to evaluate the sensor transferability of denoising models.
We propose a sensor consistency training framework that enables denoising models to learn the sensor-invariant features.
arXiv Detail & Related papers (2024-11-18T13:32:59Z)
- Energy-Efficient & Real-Time Computer Vision with Intelligent Skipping via Reconfigurable CMOS Image Sensors [5.824962833043625]
Video-based computer vision applications typically suffer from high energy consumption due to reading and processing all pixels in a frame, regardless of their significance.
Previous works have attempted to reduce this energy by skipping input patches or pixels and using feedback from the end task to guide the skipping algorithm.
This paper presents a custom-designed CMOS image sensor (CIS) system that improves energy efficiency by selectively skipping uneventful regions or rows within a frame during the sensor's readout phase.
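The row-skipping idea above can be illustrated in software with simple frame differencing: rows whose content barely changes between frames are marked skippable. This is only a sketch of the concept; the paper implements skipping inside the sensor's readout hardware, and the threshold here is arbitrary.

```python
import numpy as np

def active_rows(prev: np.ndarray, curr: np.ndarray, thresh: float = 8.0):
    """Return a boolean mask of rows whose mean absolute change between
    consecutive frames exceeds `thresh`; the remaining rows are
    'uneventful' and could be skipped during readout."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return diff.mean(axis=1) > thresh

prev = np.zeros((4, 8), dtype=np.uint8)
curr = prev.copy()
curr[2] = 100  # only row 2 changes between frames
mask = active_rows(prev, curr)
# mask -> [False, False, True, False]
```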
arXiv Detail & Related papers (2024-09-25T20:32:55Z)
- Know Thy Neighbors: A Graph Based Approach for Effective Sensor-Based Human Activity Recognition in Smart Homes [0.0]
We propose a novel graph-guided neural network approach for Human Activity Recognition (HAR) in smart homes.
We accomplish this by learning a more expressive graph structure representing the sensor network in a smart home.
Our approach maps discrete input sensor measurements to a feature space through the application of attention mechanisms.
arXiv Detail & Related papers (2023-11-16T02:43:13Z)
- Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor [58.305341034419136]
We present the first dense SLAM system with a monocular camera and a light-weight ToF sensor.
We propose a multi-modal implicit scene representation that supports rendering both the signals from the RGB camera and light-weight ToF sensor.
Experiments demonstrate that our system well exploits the signals of light-weight ToF sensors and achieves competitive results.
arXiv Detail & Related papers (2023-08-28T07:56:13Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- GenISP: Neural ISP for Low-Light Machine Cognition [19.444297600977546]
In low-light conditions, object detectors using raw image data are more robust than detectors using image data processed by an ISP pipeline.
We propose a minimal neural ISP pipeline for machine cognition, named GenISP, that explicitly incorporates Color Space Transformation to a device-independent color space.
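The color space transformation step mentioned above amounts to multiplying each pixel by a 3x3 matrix that maps sensor-specific raw color to a device-independent space. The sketch below shows that operation in isolation; it is an assumption for illustration, since GenISP obtains and applies its matrices within a learned pipeline.

```python
import numpy as np

def apply_cst(img: np.ndarray, cst: np.ndarray) -> np.ndarray:
    """Apply a 3x3 color-space-transform matrix to a demosaiced raw
    image of shape (H, W, 3): out[h, w, :] = cst @ img[h, w, :]."""
    return np.einsum('ij,hwj->hwi', cst, img)

# Sanity check: the identity CST leaves the image unchanged,
# and a scalar multiple of the identity scales every channel.
img = np.random.rand(2, 2, 3)
out = apply_cst(img, np.eye(3))
```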
arXiv Detail & Related papers (2022-05-07T17:17:24Z)
- Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [58.720142291102135]
In this paper we propose a pose estimation software exploiting neural network architectures.
We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z)
- Analyzing General-Purpose Deep-Learning Detection and Segmentation Models with Images from a Lidar as a Camera Sensor [0.06554326244334865]
This work explores the potential of general-purpose DL perception algorithms for processing image-like outputs of advanced lidar sensors.
Rather than processing the three-dimensional point cloud data, this is, to the best of our knowledge, the first work to focus on low-resolution images with a 360° field of view.
We show that with adequate preprocessing, general-purpose DL models can process these images, opening the door to their usage in environmental conditions.
arXiv Detail & Related papers (2022-03-08T13:14:43Z)
- Bayesian Imitation Learning for End-to-End Mobile Manipulation [80.47771322489422]
Augmenting policies with additional sensor inputs, such as RGB + depth cameras, is a straightforward approach to improving robot perception capabilities.
We show that using the Variational Information Bottleneck to regularize convolutional neural networks improves generalization to held-out domains.
We demonstrate that our method is able to help close the sim-to-real gap and successfully fuse RGB and depth modalities.
arXiv Detail & Related papers (2022-02-15T17:38:30Z)
- Enabling energy efficient machine learning on a Ultra-Low-Power vision sensor for IoT [3.136861161060886]
This paper presents the development, analysis, and embedded implementation of a realtime detection, classification and tracking pipeline.
The inference requires 8 ms and consumes 7.5 mW.
arXiv Detail & Related papers (2021-02-02T06:39:36Z)
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)