Data-Driven Pixel Control: Challenges and Prospects
- URL: http://arxiv.org/abs/2408.04767v1
- Date: Thu, 8 Aug 2024 21:49:19 GMT
- Title: Data-Driven Pixel Control: Challenges and Prospects
- Authors: Saurabh Farkya, Zachary Alan Daniels, Aswin Raghavan, Gooitzen van der Wal, Michael Isnardi, Michael Piacentino, David Zhang,
- Abstract summary: We study a data-driven system that combines dynamic sensing at the pixel level with computer vision analytics at the video level.
Our system achieves a 10X reduction in bandwidth and a 15-30X improvement in Energy-Delay Product (EDP) when activating only 30% of pixels.
- Score: 13.158333009169418
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advancements in sensors have led to high resolution and high data throughput at the pixel level. Simultaneously, the adoption of increasingly large (deep) neural networks (NNs) has lead to significant progress in computer vision. Currently, visual intelligence comes at increasingly high computational complexity, energy, and latency. We study a data-driven system that combines dynamic sensing at the pixel level with computer vision analytics at the video level and propose a feedback control loop to minimize data movement between the sensor front-end and computational back-end without compromising detection and tracking precision. Our contributions are threefold: (1) We introduce anticipatory attention and show that it leads to high precision prediction with sparse activation of pixels; (2) Leveraging the feedback control, we show that the dimensionality of learned feature vectors can be significantly reduced with increased sparsity; and (3) We emulate analog design choices (such as varying RGB or Bayer pixel format and analog noise) and study their impact on the key metrics of the data-driven system. Comparative analysis with traditional pixel and deep learning models shows significant performance enhancements. Our system achieves a 10X reduction in bandwidth and a 15-30X improvement in Energy-Delay Product (EDP) when activating only 30% of pixels, with a minor reduction in object detection and tracking precision. Based on analog emulation, our system can achieve a throughput of 205 megapixels/sec (MP/s) with a power consumption of only 110 mW per MP, i.e., a theoretical improvement of ~30X in EDP.
Related papers
- Energy-Efficient & Real-Time Computer Vision with Intelligent Skipping via Reconfigurable CMOS Image Sensors [5.824962833043625]
Video-based computer vision applications typically suffer from high energy consumption due to reading and processing all pixels in a frame, regardless of their significance.
Previous works have attempted to reduce this energy by skipping input patches or pixels and using feedback from the end task to guide the skipping algorithm.
This paper presents a custom-designed CMOS image sensor (CIS) system that improves energy efficiency by selectively skipping uneventful regions or rows within a frame during the sensor's readout phase.
arXiv Detail & Related papers (2024-09-25T20:32:55Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Random resistive memory-based deep extreme point learning machine for
unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM)
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Agile gesture recognition for capacitive sensing devices: adapting
on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer five fingers.
We use a machine learning technique to analyse the time series signals and identify three features that can represent 5 fingers within 500 ms.
arXiv Detail & Related papers (2023-05-12T17:24:02Z) - Vicinity Vision Transformer [53.43198716947792]
We present a Vicinity Attention that introduces a locality bias to vision transformers with linear complexity.
Our approach achieves state-of-the-art image classification accuracy with 50% fewer parameters than previous methods.
arXiv Detail & Related papers (2022-06-21T17:33:53Z) - Single-Shot Optical Neural Network [55.41644538483948]
'Weight-stationary' analog optical and electronic hardware has been proposed to reduce the compute resources required by deep neural networks.
We present a scalable, single-shot-per-layer weight-stationary optical processor.
arXiv Detail & Related papers (2022-05-18T17:49:49Z) - SALISA: Saliency-based Input Sampling for Efficient Video Object
Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z) - Toward Efficient Hyperspectral Image Processing inside Camera Pixels [1.6449390849183356]
Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands.
To mitigate this problem, we propose a form of processing-in-pixel (PIP)
Our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit.
arXiv Detail & Related papers (2022-03-11T01:06:02Z) - Analyzing General-Purpose Deep-Learning Detection and Segmentation
Models with Images from a Lidar as a Camera Sensor [0.06554326244334865]
This work explores the potential of general-purpose DL perception algorithms for processing image-like outputs of advanced lidar sensors.
Rather than processing the three-dimensional point cloud data, this is, to the best of our knowledge, the first work to focus on low-resolution images with 360text field of view.
We show that with adequate preprocessing, general-purpose DL models can process these images, opening the door to their usage in environmental conditions.
arXiv Detail & Related papers (2022-03-08T13:14:43Z) - An Ultra Fast Low Power Convolutional Neural Network Image Sensor with
Pixel-level Computing [3.41234610095684]
This paper proposes a Processing-In-Pixel (PIP) CMOS sensor architecture, which allows convolution operation before the column readout circuit to significantly improve the image reading speed.
In other words, the computational efficiency is 4.75 TOPS/w, which is about 3.6 times as higher as the state-of-the-art.
arXiv Detail & Related papers (2021-01-09T07:10:03Z) - Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays [16.531637803429277]
We present a novel method of CNN inference for pixel processor array ( PPA) vision sensors.
Our approach can perform convolutional layers, max pooling, ReLu, and a final fully connected layer entirely upon the PPA sensor.
This is the first work demonstrating CNN inference conducted entirely upon the processor array of a PPA vision sensor device, requiring no external processing.
arXiv Detail & Related papers (2020-04-27T01:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.