Streaming quanta sensors for online, high-performance imaging and vision
- URL: http://arxiv.org/abs/2406.00859v1
- Date: Sun, 2 Jun 2024 20:30:49 GMT
- Title: Streaming quanta sensors for online, high-performance imaging and vision
- Authors: Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan
- Abstract summary: Quanta image sensors (QIS) have demonstrated remarkable imaging capabilities in many challenging scenarios.
Despite their potential, the adoption of these sensors is severely hampered by (a) high data rates and (b) the need for new computational pipelines to handle the unconventional raw data.
We introduce a simple, low-bandwidth computational pipeline to address these challenges.
Our approach yields significant (~100X) data bandwidth reductions and enables real-time image reconstruction and computer vision.
- Score: 34.098174669870126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, quanta image sensors (QIS) -- ultra-fast, zero-read-noise binary image sensors -- have demonstrated remarkable imaging capabilities in many challenging scenarios. Despite their potential, the adoption of these sensors is severely hampered by (a) high data rates and (b) the need for new computational pipelines to handle the unconventional raw data. We introduce a simple, low-bandwidth computational pipeline to address these challenges. Our approach is based on a novel streaming representation with a small memory footprint that efficiently captures intensity information at multiple temporal scales. Updating the representation requires only 16 floating-point operations per pixel, which can be computed online at the native frame rate of the binary frames. We use a neural network operating on this representation to reconstruct videos in real time (10-30 fps). We illustrate why such a representation is well-suited for these emerging sensors, and how it offers low latency and high frame rate while retaining flexibility for downstream computer vision. Our approach yields significant (~100X) data bandwidth reductions and real-time image reconstruction and computer vision -- a $10^4$-$10^5$ reduction in computation compared to existing state-of-the-art approaches while maintaining comparable quality. To the best of our knowledge, our approach is the first to achieve online, real-time image reconstruction on QIS.
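The abstract pins down the budget (16 floating-point operations per pixel per binary frame, multiple temporal scales, small memory footprint) but does not spell out the exact form of the streaming representation here. One plausible realization consistent with that description is a bank of exponentially decaying averages of the incoming binary frames, sketched below; the class name, decay constants, and flux values are hypothetical, not taken from the paper.

```python
import numpy as np

class MultiScaleStream:
    """Hypothetical multi-scale streaming state for binary quanta frames.

    Keeps one exponentially decaying average per time scale, so each new
    1-bit frame updates the state in O(1) memory and a few FLOPs per pixel
    per scale (in the same spirit as the paper's 16 FLOPs/pixel budget).
    """

    def __init__(self, height, width, decays=(0.5, 0.9, 0.99, 0.999)):
        self.decays = np.asarray(decays, dtype=np.float32)
        # One running average per temporal scale, shape (scales, H, W).
        self.state = np.zeros((len(decays), height, width), dtype=np.float32)

    def update(self, binary_frame):
        """Fold one binary frame (H, W) of 0/1 photon detections into the state."""
        b = binary_frame.astype(np.float32)
        # Recursive (EMA) update: 2 multiplies + 1 add per pixel per scale.
        d = self.decays[:, None, None]
        self.state = d * self.state + (1.0 - d) * b
        return self.state

# Usage: stream 1000 simulated binary frames of a static 0.2-flux scene.
rng = np.random.default_rng(0)
stream = MultiScaleStream(64, 64)
for _ in range(1000):
    frame = rng.random((64, 64)) < 0.2   # Bernoulli photon detections
    rep = stream.update(frame)
print(rep.mean(axis=(1, 2)))  # each scale's spatial mean ~0.2; faster scales are noisier per pixel
```

Stacking the scales gives a downstream reconstruction network both short-exposure (motion-preserving) and long-exposure (low-noise) views of the scene at a fixed, tiny memory cost.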
Related papers
- bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction [57.199618102578576]
We propose bit2bit, a new method for reconstructing high-quality image stacks at the original spatiotemporal resolution from sparse binary quanta image data (a toy photon-model sketch follows this entry).
Inspired by recent work on Poisson denoising, we developed an algorithm that creates a dense image sequence from sparse binary photon data.
We present a novel dataset containing a wide range of real SPAD high-speed videos under various challenging imaging conditions.
arXiv Detail & Related papers (2024-10-30T17:30:35Z)
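For context on the bit2bit entry above: methods of this kind build on the standard quanta/SPAD image-formation model, in which each 1-bit frame is a Bernoulli readout of Poisson photon arrivals, so the per-pixel hit rate admits a closed-form maximum-likelihood flux estimate. The sketch below shows only this classical baseline, not bit2bit's self-supervised network; the flux map and frame count are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Forward model: a single-photon pixel with mean flux `lam` per frame fires
# with probability 1 - exp(-lam) (Poisson arrivals, 1-bit readout).
lam = np.full((32, 32), 0.5, dtype=np.float32)       # hypothetical flux map
T = 2000
frames = rng.random((T, 32, 32)) < (1.0 - np.exp(-lam))

# Closed-form maximum-likelihood inversion from the per-pixel hit rate k/T.
rate = frames.mean(axis=0)
rate = np.clip(rate, 0.0, 1.0 - 1e-6)                # avoid log(0) at saturation
lam_hat = -np.log(1.0 - rate)
print(float(np.abs(lam_hat - lam).mean()))           # small reconstruction error
```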
- Characterization of point-source transient events with a rolling-shutter compressed sensing system [0.0]
Point-source transient events (PSTEs) pose several challenges to an imaging system.
Traditional imaging systems that meet these requirements are costly in terms of price, size, weight, power consumption, and data bandwidth.
We develop a novel compressed sensing algorithm adapted to the rolling-shutter readout of an imaging system (a toy timing model follows this entry).
arXiv Detail & Related papers (2024-08-29T19:22:37Z)
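To make the rolling-shutter idea above concrete: each sensor row integrates light over its own shifted time window, so a brief point flash leaves a row-dependent signature that encodes its timing at sub-readout resolution. The toy model below assumes the optics smear the source across rows and uses made-up timing constants; it illustrates the measurement principle only, not the paper's compressed sensing reconstruction.

```python
import numpy as np

# Toy rolling-shutter temporal coding: assume the optics smear a point source
# across all sensor rows (e.g., an engineered PSF), while row r integrates
# light only over frames t in [r, r + E). A one-frame flash at time t0 then
# appears exactly in rows r with t0 - E < r <= t0, so its timing can be read
# off spatially, finer than the full-frame readout rate.
T, H, E = 64, 64, 8                  # frames, rows, per-row exposure (frames)
t0 = 30                              # hypothetical flash time

flux = np.zeros(T); flux[t0] = 1.0   # temporal profile of the point source
readout = np.array([flux[r : min(r + E, T)].sum() for r in range(H)])

rows = np.nonzero(readout)[0]
print(rows)                          # rows 23..30 respond
print(rows.max())                    # recovered flash time t0 = 30
```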
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first deblur and then binarize the images in real time.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image into a high-frame-rate binary video (a toy event-propagation sketch follows this entry).
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
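A toy rendition of the propagation idea in the entry above: binarize a bimodal (black/white) target once from an intensity frame, then keep the binary image current at event rate by letting each event push its pixel toward the state implied by the event polarity. This is an assumed simplification for intuition; the paper's joint inference in event space and image space is more involved.

```python
import numpy as np

def binarize(frame):
    """Global threshold at the midpoint of a bimodal histogram (toy stand-in for Otsu)."""
    return (frame > frame.mean()).astype(np.uint8)

def propagate(binary, events):
    """events: array of (t, x, y, polarity) rows; ON -> white, OFF -> black."""
    out = binary.copy()
    for _, x, y, p in events:
        out[int(y), int(x)] = 1 if p > 0 else 0
    return out

rng = np.random.default_rng(2)
frame = rng.choice([0.1, 0.9], size=(16, 16))       # bimodal target
B = binarize(frame)
events = np.array([[0, 3, 5, +1], [1, 3, 5, -1], [2, 8, 2, +1]])
B = propagate(B, events)
print(B[5, 3], B[2, 8])   # 0 (ON then OFF at pixel (3,5)), 1 (ON at pixel (8,2))
```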
- EDeNN: Event Decay Neural Networks for low latency vision [26.784944204163363]
We develop a new type of neural network which operates closer to the original event data stream.
We demonstrate state-of-the-art performance in angular velocity regression and competitive optical flow estimation.
arXiv Detail & Related papers (2022-09-09T15:51:39Z)
- Dual-view Snapshot Compressive Imaging via Optical Flow Aided Recurrent Neural Network [14.796204921975733]
Dual-view snapshot compressive imaging (SCI) aims to capture videos from two field-of-views (FoVs) in a single snapshot.
It is challenging for existing model-based decoding algorithms to reconstruct each individual scene.
We propose an optical-flow-aided recurrent neural network for dual-view video SCI systems, which provides high-quality decoding in seconds (the snapshot measurement model is sketched after this entry).
arXiv Detail & Related papers (2021-09-11T14:24:44Z)
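For intuition on the dual-view SCI entry above: the standard snapshot-compressive-imaging forward model modulates each video frame with a known mask and sums everything into a single measurement; with two views and complementary masks, both are multiplexed into one snapshot. The sketch below covers this measurement model only (scenes and masks are random placeholders); the paper's contribution is the optical-flow-aided recurrent decoder.

```python
import numpy as np

rng = np.random.default_rng(3)
T, H, W = 8, 32, 32

view_a = rng.random((T, H, W)).astype(np.float32)    # hypothetical scene A
view_b = rng.random((T, H, W)).astype(np.float32)    # hypothetical scene B
masks_a = rng.integers(0, 2, (T, H, W)).astype(np.float32)
masks_b = 1.0 - masks_a                              # complementary coding

# Single snapshot y = sum_t (C_a[t] * x_a[t] + C_b[t] * x_b[t]):
# 2*T mask-modulated frames collapse into one measurement image.
snapshot = (masks_a * view_a + masks_b * view_b).sum(axis=0)
print(snapshot.shape)   # (32, 32)
```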
- Learning optical flow from still images [53.295332513139925]
We introduce a framework to generate accurate ground-truth optical flow annotations quickly and in large amounts from any readily available single real picture.
We virtually move the camera in the reconstructed environment with known motion vectors and rotation angles (a minimal re-projection sketch follows this entry).
When trained with our data, state-of-the-art optical flow networks achieve superior generalization to unseen real data.
arXiv Detail & Related papers (2021-04-08T17:59:58Z)
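The entry above rests on a classical geometric fact: given a depth map and a known rigid camera motion, re-projecting every pixel yields exact ground-truth optical flow with no manual labeling. A minimal sketch under simplifying assumptions (constant depth, made-up intrinsics, a small pure translation) follows.

```python
import numpy as np

# Back-project pixels with an assumed depth map, apply a known rigid motion
# (R, t) between camera and scene, re-project, and read off the flow.
H, W, f = 48, 64, 50.0
K = np.array([[f, 0, W / 2], [0, f, H / 2], [0, 0, 1.0]])

ys, xs = np.mgrid[0:H, 0:W]
pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # 3 x N
Z = np.full(H * W, 4.0)                     # hypothetical constant depth

X = np.linalg.inv(K) @ pix * Z              # back-project to 3D (3 x N)
R = np.eye(3)                               # no rotation in this sketch
t = np.array([[0.1], [0.0], [0.0]])         # known relative translation
proj = K @ (R @ X + t)
pix2 = proj[:2] / proj[2]                   # re-projected pixel coordinates

flow = (pix2 - pix[:2]).T.reshape(H, W, 2)
print(flow[0, 0])                           # ~[1.25, 0]: f * tx / Z = 50 * 0.1 / 4
```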
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have large numbers of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
- CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years (a DCT-domain input sketch follows this entry).
arXiv Detail & Related papers (2020-12-26T15:00:10Z)
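For the compressed-domain idea above: JPEG stores 8x8 block DCT coefficients, so a network can consume those directly as a 64-channel, 8x-downsampled input instead of decoded RGB pixels. The sketch below merely builds that input tensor from a raw image; a real pipeline would pull the coefficients straight from the bitstream with a JPEG library.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct(gray):
    """gray: (H, W) with H, W multiples of 8 -> (H//8, W//8, 64) DCT features."""
    H, W = gray.shape
    # Tile into 8x8 blocks: (H//8, W//8, 8, 8).
    blocks = gray.reshape(H // 8, 8, W // 8, 8).transpose(0, 2, 1, 3)
    coeffs = dctn(blocks, axes=(2, 3), norm="ortho")   # per-block 2D DCT
    return coeffs.reshape(H // 8, W // 8, 64)

rng = np.random.default_rng(4)
img = rng.random((64, 64)).astype(np.float32)
feats = blockwise_dct(img)
print(feats.shape)          # (8, 8, 64): 8x lower spatial resolution, 64 channels
```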
- 11 TeraFLOPs per second photonic convolutional accelerator for deep learning optical neural networks [0.0]
We demonstrate a universal optical vector convolutional accelerator operating beyond 10 TeraFLOPS (floating-point operations per second).
We then use the same hardware to sequentially form a deep optical CNN with ten output neurons, achieving successful recognition of the full set of 10 digits on 900-pixel handwritten digit images with 88% accuracy.
This approach is scalable and trainable to much more complex networks for demanding applications such as unmanned vehicles and real-time video recognition.
arXiv Detail & Related papers (2020-11-14T21:24:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.