PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized
Perception with Neural Sensors
- URL: http://arxiv.org/abs/2304.05440v1
- Date: Tue, 11 Apr 2023 18:16:47 GMT
- Title: PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized
Perception with Neural Sensors
- Authors: Haley M. So, Laurie Bose, Piotr Dudek, and Gordon Wetzstein
- Abstract summary: Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing.
We develop an efficient recurrent neural network architecture, PixelRNN, that encodes spatio-temporal features on the sensor using purely binary operations.
PixelRNN reduces the amount of data to be transmitted off the sensor by a factor of 64x compared to conventional systems while offering competitive accuracy for hand gesture recognition and lip reading tasks.
- Score: 42.18718773182277
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conventional image sensors digitize high-resolution images at fast frame
rates, producing a large amount of data that needs to be transmitted off the
sensor for further processing. This is challenging for perception systems
operating on edge devices, because communication is power inefficient and
induces latency. Fueled by innovations in stacked image sensor fabrication,
emerging sensor-processors offer programmability and minimal processing
capabilities directly on the sensor. We exploit these capabilities by
developing an efficient recurrent neural network architecture, PixelRNN, that
encodes spatio-temporal features on the sensor using purely binary operations.
PixelRNN reduces the amount of data to be transmitted off the sensor by a
factor of 64x compared to conventional systems while offering competitive
accuracy for hand gesture recognition and lip reading tasks. We experimentally
validate PixelRNN using a prototype implementation on the SCAMP-5
sensor-processor platform.
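The abstract describes encoding spatio-temporal features with purely binary operations so that only a compact state, rather than full frames, leaves the sensor. The sketch below is not the authors' implementation; it is a minimal, hypothetical illustration of a binarized recurrent update (sign-binarized weights and states, the kind of matrix product that reduces to XNOR-and-popcount in hardware). All names, shapes, and the update rule are assumptions for illustration only.

```python
import numpy as np

def binarize(x):
    # Map real values to {-1, +1}; ties at zero go to +1.
    return np.where(x >= 0, 1, -1).astype(np.int8)

def binary_rnn_step(x_bin, h_bin, Wx_bin, Wh_bin):
    """One hypothetical binary recurrent update.

    All tensors hold values in {-1, +1}, so the matrix products
    could be realized as XNOR-and-popcount operations on sensor.
    """
    pre = Wx_bin @ x_bin + Wh_bin @ h_bin  # small integer accumulation
    return binarize(pre)                   # next binary hidden state

rng = np.random.default_rng(0)
n_in, n_hidden, n_steps = 16, 8, 4  # hypothetical sizes

Wx = binarize(rng.standard_normal((n_hidden, n_in)))
Wh = binarize(rng.standard_normal((n_hidden, n_hidden)))

h = np.ones(n_hidden, dtype=np.int8)  # initial binary state
for _ in range(n_steps):
    x = binarize(rng.standard_normal(n_in))  # stand-in for binarized pixels
    h = binary_rnn_step(x, h, Wx, Wh)

# Only the small binary state h would be read off the sensor,
# instead of a full-resolution frame.
print(h)
```

Under these toy sizes, transmitting an 8-element binary state per step instead of a 16-element frame already halves the readout; the paper's 64x figure comes from its actual architecture and resolutions, not from this sketch.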
Related papers
- Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation [20.713880984921385]
Evetac is an event-based optical tactile sensor.
We develop touch processing algorithms to process its measurements online at 1000 Hz.
Evetac's output and the marker tracking provide meaningful features for learning data-driven slip detection and prediction models.
arXiv Detail & Related papers (2023-12-02T22:01:49Z)
- Speck: A Smart Event-based Vision Sensor with a Low Latency 327K Neuron Convolutional Neural Network Processing Pipeline [5.8859061623552975]
We present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip.
By combining both sensor and processing on a single die, we can lower unit production costs significantly.
We present the asynchronous architecture, the individual blocks, and the sCNN processing principle and benchmark against other sCNN capable processors.
arXiv Detail & Related papers (2023-04-13T19:28:57Z)
- Object Motion Sensitivity: A Bio-inspired Solution to the Ego-motion Problem for Event-based Cameras [0.0]
We highlight the capability of the second generation of neuromorphic image sensors, Integrated Retinal Functionality in CMOS Image Sensors (IRIS).
IRIS aims to mimic full retinal computations from photoreceptors to output of the retina for targeted feature-extraction.
Our results show that OMS can accomplish standard computer vision tasks with similar efficiency to conventional RGB and DVS solutions but offers drastic bandwidth reduction.
arXiv Detail & Related papers (2023-03-24T16:22:06Z)
- Image sensing with multilayer, nonlinear optical neural networks [4.252754174399026]
An emerging image-sensing paradigm breaks this delineation between data collection and analysis.
By optically encoding images into a compressed, low-dimensional latent space suitable for efficient post-analysis, these image sensors can operate with fewer pixels and fewer photons.
We demonstrate a multilayer ONN pre-processor for image sensing, using a commercial image intensifier as a parallel optoelectronic, optical-to-optical nonlinear activation function.
arXiv Detail & Related papers (2022-07-27T21:00:31Z)
- FPGA-optimized Hardware Acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- CNN-based Omnidirectional Object Detection for HermesBot Autonomous Delivery Robot with Preliminary Frame Classification [53.56290185900837]
We propose an algorithm for optimizing a neural network for object detection using preliminary binary frame classification.
An autonomous mobile robot with 6 rolling-shutter cameras on the perimeter providing a 360-degree field of view was used as the experimental setup.
arXiv Detail & Related papers (2021-10-22T15:05:37Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- AnalogNet: Convolutional Neural Network Inference on Analog Focal Plane Sensor Processors [0.0]
We present a high-speed, energy-efficient Convolutional Neural Network (CNN) architecture utilising the capabilities of a unique class of devices known as Analog Focal Plane Sensor Processors (FPSPs).
Unlike traditional vision systems, where the sensor array sends collected data to a separate processor for processing, FPSPs allow data to be processed on the imaging device itself.
Our proposed architecture, coined AnalogNet, reaches a testing accuracy of 96.9% on the MNIST handwritten digits recognition task, at a speed of 2260 FPS, for a cost of 0.7 mJ per frame.
arXiv Detail & Related papers (2020-06-02T16:44:43Z)
- Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the listed information and is not responsible for any consequences of its use.