Related papers: Ultrafast vision perception by neuromorphic optical flow

URL: http://arxiv.org/abs/2409.15345v1
Date: Tue, 10 Sep 2024 10:59:32 GMT
Title: Ultrafast vision perception by neuromorphic optical flow
Authors: Shengbo Wang, Shuo Gao, Tongming Pu, Liangbing Zhao, Arokia Nathan
Abstract summary: A 3D neuromorphic optical flow method embeds external motion features directly into hardware. In the authors' demonstration, this approach reduces visual data processing time by an average of 0.3 seconds. The flexibility of the neuromorphic optical flow algorithm allows seamless integration with existing algorithms.
Score: 1.1980928503177917
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Optical flow is crucial for robotic visual perception, yet current methods primarily operate in a 2D format, capturing movement velocities only in horizontal and vertical dimensions. This limitation results in incomplete motion cues, such as missing regions of interest or detailed motion analysis of different regions, leading to delays in processing high-volume visual data in real-world settings. Here, we report a 3D neuromorphic optical flow method that leverages the time-domain processing capability of memristors to embed external motion features directly into hardware, thereby completing motion cues and dramatically accelerating the computation of movement velocities and subsequent task-specific algorithms. In our demonstration, this approach reduces visual data processing time by an average of 0.3 seconds while maintaining or improving the accuracy of motion prediction, object tracking, and object segmentation. Interframe visual processing is achieved for the first time in UAV scenarios. Furthermore, the neuromorphic optical flow algorithm's flexibility allows seamless integration with existing algorithms, ensuring broad applicability. These advancements open unprecedented avenues for robotic perception, without the trade-off between accuracy and efficiency.

Related papers

Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics [0.7083699704958353]
We propose a lightweight, squeeze-excitation, deep learning-based approach for multi-stream extraction of time-varying spatial-temporal features. The proposed model was tested on the Ninapro DB2, DB4, and DB5 datasets, achieving accuracy rates of 96.41%, 92.40%, and 93.34%, respectively.
arXiv Detail & Related papers (2025-04-04T07:11:12Z)
Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames. It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
Neuromorphic Optical Flow and Real-time Implementation with Event Cameras [47.11134388304464]
We build on the latest developments in event-based vision and spiking neural networks. We propose a new network architecture that improves the state-of-the-art self-supervised optical flow accuracy. We demonstrate high speed optical flow prediction with almost two orders of magnitude reduced complexity.
arXiv Detail & Related papers (2023-04-14T14:03:35Z)
GotFlow3D: Recurrent Graph Optimal Transport for Learning 3D Flow Motion in Particle Tracking [11.579751282152841]
Flow visualization technologies such as particle tracking velocimetry (PTV) are broadly used to understand the all-pervasive three-dimensional (3D) turbulent flows found in nature and industrial processes. Despite advances in 3D acquisition techniques, motion estimation algorithms for particle tracking still face major challenges from large particle displacements, dense particle distributions, and high computational cost. By introducing a novel deep neural network based on recurrent Graph Optimal Transport, we present an end-to-end solution to learn the 3D fluid flow motion from double-frame particle sets.
arXiv Detail & Related papers (2022-10-31T02:05:58Z)
Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian-enhanced low-rank tensor completion (LETC) framework featuring both low-rankness and multi-temporal correlations for large-scale traffic speed kriging. We then design an efficient solution algorithm that uses several numerical techniques to scale the proposed model up to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z)
Motion-inductive Self-supervised Object Discovery in Videos [99.35664705038728]
We propose a model for processing consecutive RGB frames, and infer the optical flow between any pair of frames using a layered representation. We demonstrate superior performance over previous state-of-the-art methods on three public video segmentation datasets.
arXiv Detail & Related papers (2022-10-01T08:38:28Z)
Time-lapse image classification using a diffractive neural network [0.0]
We show for the first time a time-lapse image classification scheme using a diffractive network. We show a blind testing accuracy of 62.03% on the optical classification of objects from the CIFAR-10 dataset. This constitutes the highest inference accuracy achieved so far using a single diffractive network.
arXiv Detail & Related papers (2022-08-23T08:16:30Z)
Motion-aware Memory Network for Fast Video Salient Object Detection [15.967509480432266]
We design a space-time memory (STM)-based network, which extracts useful temporal information of the current frame from adjacent frames as the temporal branch of VSOD. In the encoding stage, we generate high-level temporal features by using high-level features from the current frame and its adjacent frames. In the decoding stage, we propose an effective fusion strategy for spatial and temporal branches. The proposed model does not require optical flow or other preprocessing, and can reach a speed of nearly 100 FPS during inference.
arXiv Detail & Related papers (2022-08-01T15:56:19Z)
Ultra-low Latency Spiking Neural Networks with Spatio-Temporal Compression and Synaptic Convolutional Block [4.081968050250324]
Spiking neural networks (SNNs) have spatio-temporal information processing capability, low power consumption, and high biological plausibility. Event-stream datasets such as N-MNIST, CIFAR10-DVS, and DVS128 Gesture need individual events aggregated into frames with a higher temporal resolution for event stream classification. We propose a spatio-temporal compression method that aggregates individual events into a few time steps of synaptic current to reduce training and inference latency.
arXiv Detail & Related papers (2022-03-18T15:14:13Z)
EM-driven unsupervised learning for efficient motion segmentation [3.5232234532568376]
This paper presents a CNN-based fully unsupervised method for motion segmentation from optical flow. We use the Expectation-Maximization (EM) framework to leverage the loss function and the training procedure of our motion segmentation neural network. Our method outperforms comparable unsupervised methods and is very efficient.
arXiv Detail & Related papers (2022-01-06T14:35:45Z)
Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video. Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z)
Adaptive Latent Space Tuning for Non-Stationary Distributions [62.997667081978825]
We present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs. We demonstrate our approach for predicting the properties of a time-varying charged particle beam in a particle accelerator.
arXiv Detail & Related papers (2021-05-08T03:50:45Z)
Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field. It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations. Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z)
Reinforcement Learning with Latent Flow [78.74671595139613]
Flow of Latents for Reinforcement Learning (Flare) is a network architecture for RL that explicitly encodes temporal information through latent vector differences. We show that Flare recovers optimal performance in state-based RL without explicit access to the state velocity. We also show that Flare achieves state-of-the-art performance on pixel-based challenging continuous control tasks within the DeepMind control benchmark suite.
arXiv Detail & Related papers (2021-01-06T03:50:50Z)
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation. We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z)
DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information. We show that the proposed method achieves performance superior to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance [60.75488333935592]
Most state-of-the-art methods rely heavily on dense optical flow as the motion representation. In this paper, we shed light on fast action recognition by lifting the reliance on optical flow. We design a novel motion cue called Persistence of Appearance (PA). In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
arXiv Detail & Related papers (2020-08-08T07:09:54Z)
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition [10.185425416255294]
We propose to use residual frames as an alternative "lightweight" motion representation. We also develop a new pseudo-3D convolution module which decouples 3D convolution into 2D and 1D convolution.
arXiv Detail & Related papers (2020-08-03T17:40:17Z)
End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera [81.66569124029313]
We propose a camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network. The key novelty of our method is the integration of multiple visual clues provided by any two time-consecutive monocular frames. We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field.
arXiv Detail & Related papers (2020-06-07T08:18:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.