PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized
Perception with Neural Sensors
- URL: http://arxiv.org/abs/2304.05440v1
- Date: Tue, 11 Apr 2023 18:16:47 GMT
- Title: PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized
Perception with Neural Sensors
- Authors: Haley M. So, Laurie Bose, Piotr Dudek, and Gordon Wetzstein
- Abstract summary: Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing.
We develop an efficient recurrent neural network architecture, PixelRNN, that encodes spatio-temporal features on the sensor using purely binary operations.
PixelRNN reduces the amount of data to be transmitted off the sensor by a factor of 64x compared to conventional systems while offering competitive accuracy for hand gesture recognition and lip reading tasks.
- Score: 42.18718773182277
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conventional image sensors digitize high-resolution images at fast frame
rates, producing a large amount of data that needs to be transmitted off the
sensor for further processing. This is challenging for perception systems
operating on edge devices, because communication is power inefficient and
induces latency. Fueled by innovations in stacked image sensor fabrication,
emerging sensor-processors offer programmability and minimal processing
capabilities directly on the sensor. We exploit these capabilities by
developing an efficient recurrent neural network architecture, PixelRNN, that
encodes spatio-temporal features on the sensor using purely binary operations.
PixelRNN reduces the amount of data to be transmitted off the sensor by a
factor of 64x compared to conventional systems while offering competitive
accuracy for hand gesture recognition and lip reading tasks. We experimentally
validate PixelRNN using a prototype implementation on the SCAMP-5
sensor-processor platform.
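The abstract describes encoding spatio-temporal features with purely binary operations so that only a compact state, rather than full frames, leaves the sensor. The sketch below is not the authors' implementation; it is a minimal, hypothetical illustration of a binarized recurrent update (sign-binarized weights and states, the kind of matrix product that reduces to XNOR-and-popcount in hardware). All names, shapes, and the update rule are assumptions for illustration only.

```python
import numpy as np

def binarize(x):
    # Map real values to {-1, +1}; ties at zero go to +1.
    return np.where(x >= 0, 1, -1).astype(np.int8)

def binary_rnn_step(x_bin, h_bin, Wx_bin, Wh_bin):
    """One hypothetical binary recurrent update.

    All tensors hold values in {-1, +1}, so the matrix products
    could be realized as XNOR-and-popcount operations on sensor.
    """
    pre = Wx_bin @ x_bin + Wh_bin @ h_bin  # small integer accumulation
    return binarize(pre)                   # next binary hidden state

rng = np.random.default_rng(0)
n_in, n_hidden, n_steps = 16, 8, 4  # hypothetical sizes

Wx = binarize(rng.standard_normal((n_hidden, n_in)))
Wh = binarize(rng.standard_normal((n_hidden, n_hidden)))

h = np.ones(n_hidden, dtype=np.int8)  # initial binary state
for _ in range(n_steps):
    x = binarize(rng.standard_normal(n_in))  # stand-in for binarized pixels
    h = binary_rnn_step(x, h, Wx, Wh)

# Only the small binary state h would be read off the sensor,
# instead of a full-resolution frame.
print(h)
```

Under these toy sizes, transmitting an 8-element binary state per step instead of a 16-element frame already halves the readout; the paper's 64x figure comes from its actual architecture and resolutions, not from this sketch.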
Related papers
- Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation [20.713880984921385]
Evetac is an event-based optical tactile sensor.
We develop touch processing algorithms to process its measurements online at 1000 Hz.
Evetac's output and the marker tracking provide meaningful features for learning data-driven slip detection and prediction models.
arXiv Detail & Related papers (2023-12-02T22:01:49Z)
- Speck: A Smart Event-based Vision Sensor with a Low Latency 327K Neuron Convolutional Neural Network Processing Pipeline [5.8859061623552975]
We present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip.
By combining both sensor and processing on a single die, we can lower unit production costs significantly.
We present the asynchronous architecture, the individual blocks, and the sCNN processing principle and benchmark against other sCNN capable processors.
arXiv Detail & Related papers (2023-04-13T19:28:57Z)
- Object Motion Sensitivity: A Bio-inspired Solution to the Ego-motion Problem for Event-based Cameras [0.0]
We highlight the capability of the second generation of neuromorphic image sensors, Integrated Retinal Functionality in CMOS Image Sensors (IRIS).
IRIS aims to mimic full retinal computations from photoreceptors to output of the retina for targeted feature-extraction.
Our results show that OMS can accomplish standard computer vision tasks with similar efficiency to conventional RGB and DVS solutions but offers drastic bandwidth reduction.
arXiv Detail & Related papers (2023-03-24T16:22:06Z)
- Image sensing with multilayer, nonlinear optical neural networks [4.252754174399026]
An emerging image-sensing paradigm breaks this delineation between data collection and analysis.
By optically encoding images into a compressed, low-dimensional latent space suitable for efficient post-analysis, these image sensors can operate with fewer pixels and fewer photons.
We demonstrate a multilayer ONN pre-processor for image sensing, using a commercial image intensifier as a parallel optoelectronic, optical-to-optical nonlinear activation function.
arXiv Detail & Related papers (2022-07-27T21:00:31Z)
- FPGA-optimized Hardware Acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- CNN-based Omnidirectional Object Detection for HermesBot Autonomous Delivery Robot with Preliminary Frame Classification [53.56290185900837]
We propose an algorithm for optimizing a neural network for object detection using preliminary binary frame classification.
An autonomous mobile robot with 6 rolling-shutter cameras on the perimeter providing a 360-degree field of view was used as the experimental setup.
arXiv Detail & Related papers (2021-10-22T15:05:37Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- AnalogNet: Convolutional Neural Network Inference on Analog Focal Plane Sensor Processors [0.0]
We present a high-speed, energy-efficient Convolutional Neural Network (CNN) architecture utilising the capabilities of a unique class of devices known as Analog Focal Plane Sensor Processors (FPSPs).
Unlike traditional vision systems, where the sensor array sends collected data to a separate processor for processing, FPSPs allow data to be processed on the imaging device itself.
Our proposed architecture, coined AnalogNet, reaches a testing accuracy of 96.9% on the MNIST handwritten digits recognition task, at a speed of 2260 FPS, for a cost of 0.7 mJ per frame.
arXiv Detail & Related papers (2020-06-02T16:44:43Z)
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the listed information and is not responsible for any consequences of its use.