Neuromorphic spatiotemporal optical flow: Enabling ultrafast visual perception beyond human capabilities
- URL: http://arxiv.org/abs/2409.15345v2
- Date: Thu, 30 Jan 2025 12:20:12 GMT
- Title: Neuromorphic spatiotemporal optical flow: Enabling ultrafast visual perception beyond human capabilities
- Authors: Shengbo Wang, Jingwen Zhao, Tongming Pu, Liangbing Zhao, Xiaoyu Guo, Yue Cheng, Cong Li, Weihao Ma, Chenyu Tang, Zhenyu Xu, Ningli Wang, Luigi Occhipinti, Arokia Nathan, Ravinder Dahiya, Huaqiang Wu, Li Tao, Shuo Gao
- Abstract summary: We introduce a neuromorphic optical flow approach that addresses delay bottlenecks by encoding temporal information directly in a synaptic transistor array.
Compared to conventional spatial-only optical flow methods, our system preserves the spatial-temporal consistency of motion information.
In software benchmarks, our system outperforms state-of-the-art algorithms with a 400% speedup.
- Score: 12.409087198219693
- License:
- Abstract: Optical flow, inspired by the mechanisms of biological visual systems, calculates spatial motion vectors within visual scenes that are necessary for enabling robotics to excel in complex and dynamic working environments. However, current optical flow algorithms, despite human-competitive task performance on benchmark datasets, remain constrained by unacceptable time delays in practical deployment (~0.6 seconds per inference, roughly 4x slower than human visual processing). Here, we introduce a neuromorphic optical flow approach that addresses the delay bottleneck by encoding temporal information directly in a synaptic transistor array to assist spatial motion analysis. Compared to conventional spatial-only optical flow methods, our spatiotemporal neuromorphic optical flow preserves the spatial-temporal consistency of motion information, rapidly identifying regions of interest in as little as 1-2 ms using the temporal motion cues derived from the temporal information embedded in the two-dimensional floating-gate synaptic transistors. The visual input can thus be selectively filtered to achieve faster velocity calculation and execution of various tasks. At the hardware level, owing to the atomically sharp interfaces between distinct functional layers in two-dimensional van der Waals heterostructures, the synaptic transistor offers high-frequency response (~100 μs), robust non-volatility (>10,000 s), and excellent endurance (>8,000 cycles), enabling robust visual processing. In software benchmarks, our system outperforms state-of-the-art algorithms with a 400% speedup, frequently surpassing human-level performance while maintaining or enhancing accuracy by exploiting the temporal priors provided by the embedded temporal information.
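To make the pipeline concrete, below is a minimal software sketch of the core idea, assuming a conventional camera feed: an exponentially decaying motion trace stands in for the temporal information retained by the synaptic transistor array, a threshold on that trace cues a region of interest, and dense optical flow (OpenCV's Farneback method here, not the paper's algorithm) is computed only inside that region. All function names, the decay constant, and the thresholds are illustrative assumptions.

```python
import numpy as np
import cv2

def update_trace(trace, prev_gray, gray, decay=0.9):
    """Exponentially decaying motion trace: a software stand-in for the
    temporal information retained by the synaptic transistor array."""
    diff = cv2.absdiff(gray, prev_gray).astype(np.float32)
    return decay * trace + (1.0 - decay) * diff

def roi_from_trace(trace, thresh=10.0, pad=16):
    """Threshold the trace to locate the moving region of interest."""
    ys, xs = np.nonzero(trace > thresh)
    if xs.size == 0:
        return None
    h, w = trace.shape
    return (max(xs.min() - pad, 0), max(ys.min() - pad, 0),
            min(xs.max() + pad, w), min(ys.max() + pad, h))

def roi_gated_flow(prev_gray, gray, trace):
    """Compute dense optical flow only inside the temporally cued ROI."""
    roi = roi_from_trace(trace)
    if roi is None:
        return None, None
    x0, y0, x1, y1 = roi
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray[y0:y1, x0:x1], gray[y0:y1, x0:x1], None,
        0.5, 3, 15, 3, 5, 1.2, 0)
    return flow, roi
```

The latency saving comes from the gating: the expensive flow computation touches only the pixels that the cheap, continuously updated temporal cue flags as moving.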
Related papers
- Neuromorphic Optical Flow and Real-time Implementation with Event Cameras [47.11134388304464]
We build on the latest developments in event-based vision and spiking neural networks.
We propose a new network architecture that improves the state-of-the-art self-supervised optical flow accuracy.
We demonstrate high-speed optical flow prediction with almost two orders of magnitude reduced complexity.
arXiv Detail & Related papers (2023-04-14T14:03:35Z)
- Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian-enhanced low-rank tensor (LETC) framework featuring both low-rankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
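The LETC solver itself is tensor-based with several dedicated numeric techniques; as a rough illustration of the underlying idea only, low-rank structure plus a graph-Laplacian penalty that couples each unobserved sensor to its road-network neighbours, here is a simplified matrix-form sketch in NumPy. The function name, rank, and step sizes are assumptions for illustration.

```python
import numpy as np

def graph_regularized_kriging(X_obs, mask, L, rank=8, lam=1.0, mu=0.1,
                              lr=1e-3, iters=2000, seed=0):
    """Estimate unobserved traffic speeds via X ~ U @ V.T, with a
    Laplacian penalty tr(U.T @ L @ U) enforcing spatial smoothness.

    X_obs : (n_sensors, n_times), zeros at unobserved entries
    mask  : same shape, 1 where observed
    L     : (n_sensors, n_sensors) Laplacian of the road-network graph
    """
    rng = np.random.default_rng(seed)
    n, t = X_obs.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((t, rank))
    for _ in range(iters):
        R = mask * (U @ V.T - X_obs)              # residual on observed entries
        gU = R @ V + lam * (L @ U) + mu * U       # data fit + Laplacian smoothness
        gV = R.T @ U + mu * V
        U -= lr * gU
        V -= lr * gV
    return U @ V.T                                # kriged speed estimates
```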
arXiv Detail & Related papers (2022-10-21T07:25:57Z)
- Time-lapse image classification using a diffractive neural network [0.0]
We show for the first time a time-lapse image classification scheme using a diffractive network.
We show a blind testing accuracy of 62.03% on the optical classification of objects from the CIFAR-10 dataset.
This constitutes the highest inference accuracy achieved so far using a single diffractive network.
arXiv Detail & Related papers (2022-08-23T08:16:30Z)
- Motion-aware Memory Network for Fast Video Salient Object Detection [15.967509480432266]
We design a space-time memory (STM)-based network, which extracts useful temporal information of the current frame from adjacent frames as the temporal branch of VSOD.
In the encoding stage, we generate high-level temporal features by using high-level features from the current frame and its adjacent frames.
In the decoding stage, we propose an effective fusion strategy for spatial and temporal branches.
The proposed model does not require optical flow or other preprocessing, and can reach a speed of nearly 100 FPS during inference.
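The memory read at the heart of an STM-style model is cross-attention from current-frame queries to keys and values pooled over adjacent frames. A minimal NumPy sketch of just that read step, with shapes and names assumed (the full model adds encoders, the fusion strategy, and a decoder):

```python
import numpy as np

def stm_read(q_feat, mem_keys, mem_vals):
    """Space-time memory read: softmax attention over all memory positions.

    q_feat   : (C_k, H, W)     query features of the current frame
    mem_keys : (T, C_k, H, W)  key features of adjacent frames
    mem_vals : (T, C_v, H, W)  value features of adjacent frames
    Returns  : (C_v, H, W)     temporal features for the current frame
    """
    c_k, h, w = q_feat.shape
    t = mem_keys.shape[0]
    c_v = mem_vals.shape[1]
    q = q_feat.reshape(c_k, h * w)                        # (C_k, HW)
    k = mem_keys.transpose(1, 0, 2, 3).reshape(c_k, t * h * w)
    v = mem_vals.transpose(1, 0, 2, 3).reshape(c_v, t * h * w)
    att = q.T @ k / np.sqrt(c_k)                          # (HW, THW) similarity
    att = np.exp(att - att.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)                 # softmax over memory
    return (v @ att.T).reshape(c_v, h, w)                 # weighted read
```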
arXiv Detail & Related papers (2022-08-01T15:56:19Z)
- Ultra-low Latency Spiking Neural Networks with Spatio-Temporal Compression and Synaptic Convolutional Block [4.081968050250324]
Spiking neural networks (SNNs) offer spatio-temporal information processing capability, low power consumption, and high biological plausibility.
Event-stream classification on neuromorphic datasets such as N-MNIST, CIFAR10-DVS, and DVS128 Gesture requires aggregating individual events into frames with a higher temporal resolution.
We propose a spatio-temporal compression method to aggregate individual events into a few time steps of synaptic current to reduce the training and inference latency.
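As a rough sketch of such event compression (the paper's exact scheme may differ), the following assumes events arrive as (x, y, t, p) rows and bins them into a few time steps of two-polarity input current for an SNN:

```python
import numpy as np

def events_to_current(events, sensor_hw, n_steps):
    """Aggregate an event stream into n_steps time steps of input current.

    events    : (N, 4) array of (x, y, t, p) rows, polarity p in {-1, +1}
    sensor_hw : (H, W) sensor resolution
    n_steps   : number of compressed time steps fed to the SNN
    """
    h, w = sensor_hw
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = events[:, 3]
    # Map each event's timestamp onto one of n_steps bins.
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    bins = np.minimum((t_norm * n_steps).astype(int), n_steps - 1)
    current = np.zeros((n_steps, 2, h, w), dtype=np.float32)
    pol = (p > 0).astype(int)           # channel 0 = OFF, channel 1 = ON
    np.add.at(current, (bins, pol, y, x), 1.0)
    return current                      # (T, 2, H, W) input current
```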
arXiv Detail & Related papers (2022-03-18T15:14:13Z)
- Adaptive Latent Space Tuning for Non-Stationary Distributions [62.997667081978825]
We present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs.
We demonstrate our approach for predicting the properties of a time-varying charged particle beam in a particle accelerator.
arXiv Detail & Related papers (2021-05-08T03:50:45Z)
- Reinforcement Learning with Latent Flow [78.74671595139613]
Flow of Latents for Reinforcement Learning (Flare) is a network architecture for RL that explicitly encodes temporal information through latent vector differences.
We show that Flare recovers optimal performance in state-based RL without explicit access to the state velocity.
We also show that Flare achieves state-of-the-art performance on challenging pixel-based continuous control tasks within the DeepMind Control benchmark suite.
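The temporal encoding in Flare is simple enough to sketch: take the latent vectors of the last few observations, compute differences between consecutive latents, and feed that "latent flow" to the agent alongside the latest latent. A minimal NumPy illustration (the paper's exact fusion may differ):

```python
import numpy as np

def flare_features(latents):
    """Concatenate the newest latent with consecutive latent differences.

    latents : (K, D) latent vectors of the last K observations
    Returns : 1-D feature vector for the policy and value networks
    """
    diffs = latents[1:] - latents[:-1]   # (K-1, D) "flow" in latent space
    return np.concatenate([latents[-1], diffs.ravel()])
```

Because differences approximate velocities in latent space, the agent recovers motion information without ever being given explicit state velocities.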
arXiv Detail & Related papers (2021-01-06T03:50:50Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance than state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- PAN: Towards Fast Action Recognition via Learning Persistence of Appearance [60.75488333935592]
Most state-of-the-art methods heavily rely on dense optical flow as motion representation.
In this paper, we shed light on fast action recognition by lifting the reliance on optical flow.
We design a novel motion cue called Persistence of Appearance (PA).
In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
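Conceptually, PA accumulates pixel-wise squared feature differences between two adjacent frames into a single-channel motion map, which is far cheaper than dense optical flow. A hedged NumPy sketch, assuming low-level feature maps are already extracted:

```python
import numpy as np

def persistence_of_appearance(feat_t, feat_t1, eps=1e-8):
    """PA-style motion cue from two adjacent frames' feature maps.

    feat_t, feat_t1 : (C, H, W) low-level feature maps
    Returns         : (H, W) motion map emphasizing boundaries where
                      appearance changes between the frames
    """
    diff = feat_t1 - feat_t
    pa = (diff ** 2).sum(axis=0)        # accumulate across channels
    return pa / (pa.max() + eps)        # normalize to [0, 1]
```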
arXiv Detail & Related papers (2020-08-08T07:09:54Z)
- A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation [1.2387676601792899]
We present a neuromorphic, bottom-up, dynamic visual saliency model based on the notion of proto-objects.
This model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations on a commonly used video dataset.
We introduce a Field-Programmable Gate Array implementation of the model on an Opal Kelly 7350 Kintex-7 board.
arXiv Detail & Related papers (2020-02-27T03:31:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.