Finding Visual Saliency in Continuous Spike Stream
- URL: http://arxiv.org/abs/2403.06233v1
- Date: Sun, 10 Mar 2024 15:15:35 GMT
- Title: Finding Visual Saliency in Continuous Spike Stream
- Authors: Lin Zhu, Xianzhang Chen, Xiao Wang, Hua Huang
- Abstract summary: In this paper, we investigate visual saliency in the continuous spike stream for the first time.
We propose a Recurrent Spiking Transformer framework, which is based on a full spiking neural network.
Our framework exhibits a substantial margin of improvement in highlighting and capturing visual saliency in the spike stream.
- Score: 23.591309376586835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a bio-inspired vision sensor, the spike camera emulates the operational
principles of the fovea, a compact retinal region, by employing spike
discharges to encode the accumulation of per-pixel luminance intensity.
Leveraging its high temporal resolution and bio-inspired neuromorphic design,
the spike camera holds significant promise for advancing computer vision
applications. Saliency detection mimics human visual attention by capturing
the most salient regions of a scene. In this paper, we investigate visual
saliency in the continuous spike stream for the first time. To
effectively process the binary spike stream, we propose a Recurrent Spiking
Transformer (RST) framework, which is based on a full spiking neural network.
Our framework enables the extraction of spatio-temporal features from the
continuous spike stream while maintaining low power
consumption. To facilitate the training and validation of our proposed model,
we build a comprehensive real-world spike-based visual saliency dataset,
covering diverse lighting conditions. Extensive experiments demonstrate the
superior performance of our Recurrent Spiking Transformer framework in
comparison to other spiking neural network-based methods. Our framework exhibits
a substantial margin of improvement in capturing and highlighting visual
saliency in the spike stream, which not only provides a new perspective for
spike-based saliency segmentation but also shows a new paradigm for full
SNN-based transformer models. The code and dataset are available at
https://github.com/BIT-Vision/SVS.
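To make the encoding principle concrete: the fovea-like behavior described above amounts to a per-pixel integrate-and-fire process, in which each pixel accumulates incoming luminance and emits a binary spike whenever the accumulated intensity crosses a firing threshold, which is then subtracted. Below is a minimal NumPy sketch of this standard spike-camera model; the threshold value and input shapes are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def encode_spike_stream(luminance, phi=255.0):
    """Integrate-and-fire encoding of a luminance video.

    luminance: float array of shape (T, H, W), per-step light intensity.
    phi:       firing threshold of the pixel integrator (illustrative value).
    Returns a binary spike stream of shape (T, H, W).
    """
    T, H, W = luminance.shape
    accumulator = np.zeros((H, W), dtype=np.float64)
    spikes = np.zeros((T, H, W), dtype=np.uint8)
    for t in range(T):
        accumulator += luminance[t]      # integrate incoming intensity
        fired = accumulator >= phi       # pixels whose integral crossed phi
        spikes[t][fired] = 1             # emit a binary spike
        accumulator[fired] -= phi        # subtractive reset keeps the residual
    return spikes
```

Under this model, brighter pixels fire more often, so the local firing rate (equivalently, the inter-spike interval) carries the intensity signal that a downstream network such as the RST consumes.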
Related papers
- Rethinking High-speed Image Reconstruction Framework with Spike Camera [48.627095354244204]
Spike cameras generate continuous spike streams to capture high-speed scenes with lower bandwidth and higher dynamic range than traditional RGB cameras.
We introduce a novel spike-to-image reconstruction framework SpikeCLIP that goes beyond traditional training paradigms.
Our experiments on real-world low-light datasets demonstrate that SpikeCLIP significantly enhances texture details and the luminance balance of recovered images.
arXiv Detail & Related papers (2025-01-08T13:00:17Z)
- High-speed and High-quality Vision Reconstruction of Spike Camera with Spike Stability Theorem [26.827138186323698]
We propose a new spike stability theorem that reveals the relationship between spike stream characteristics and stable light intensity.
Based on the spike stability theorem, two parameter-free algorithms are designed for the real-time vision reconstruction of the spike camera.
Our work provides new theoretical and algorithmic foundations for real-time vision processing with the spike camera on edge devices.
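The paper's parameter-free algorithms are not reproduced here, but a classic baseline in the same spirit estimates stable light intensity from inter-spike intervals (ISI): under the integrate-and-fire model sketched above with threshold phi, a constant intensity I fires roughly every phi / I steps, so I ≈ phi / ISI. A minimal sketch of this standard ISI-based reconstruction, under the same illustrative assumptions:

```python
import numpy as np

def isi_reconstruction(spikes, phi=255.0):
    """Estimate per-pixel intensity from the latest inter-spike interval.

    spikes: binary array (T, H, W) from an integrate-and-fire spike camera.
    Under stable illumination the integrator needs about phi / I steps to
    fire, so intensity is estimated as phi / ISI. Pixels that never fired
    twice keep ISI = inf and map to intensity 0.
    """
    T, H, W = spikes.shape
    last_spike = np.full((H, W), -1, dtype=np.int64)  # most recent spike time
    isi = np.full((H, W), np.inf)                     # latest inter-spike interval
    for t in range(T):
        fired = spikes[t] == 1
        has_prev = fired & (last_spike >= 0)
        isi[has_prev] = t - last_spike[has_prev]
        last_spike[fired] = t
    return phi / isi
```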
arXiv Detail & Related papers (2024-12-16T10:33:10Z)
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper contributes to the field by introducing a new method for synthesizing dynamic novel views from monocular video, such as smartphone captures.
Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
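As a rough illustration of what a hash-encoded neural feature grid looks like, here is a minimal single-level sketch in the style of Instant-NGP; the table size, resolution, and feature width are illustrative assumptions, and the paper's actual grids (including their time conditioning) will differ.

```python
import torch

class HashFeatureGrid(torch.nn.Module):
    """Minimal single-level hash-encoded feature grid (Instant-NGP style).

    A query point is snapped to a virtual voxel grid; its 8 corner indices
    are spatially hashed into a fixed-size learnable table, and the corner
    features are trilinearly blended.
    """

    def __init__(self, table_size=2**16, feat_dim=8, resolution=128):
        super().__init__()
        self.table = torch.nn.Parameter(0.01 * torch.randn(table_size, feat_dim))
        self.res = resolution
        # large primes for the spatial hash, as in Instant-NGP
        self.register_buffer("primes", torch.tensor([1, 2654435761, 805459861]))

    def forward(self, xyz):                      # xyz in [0, 1], shape (N, 3)
        scaled = xyz * (self.res - 1)
        lo = scaled.floor().long()               # lower corner of each voxel
        frac = scaled - lo                       # trilinear weights in [0, 1)
        feats = torch.zeros(xyz.shape[0], self.table.shape[1])
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    corner = lo + torch.tensor([dx, dy, dz])
                    h = ((corner[:, 0] * self.primes[0])
                         ^ (corner[:, 1] * self.primes[1])
                         ^ (corner[:, 2] * self.primes[2]))
                    idx = h % self.table.shape[0]
                    w = ((frac[:, 0] if dx else 1 - frac[:, 0])
                         * (frac[:, 1] if dy else 1 - frac[:, 1])
                         * (frac[:, 2] if dz else 1 - frac[:, 2]))
                    feats = feats + w.unsqueeze(-1) * self.table[idx]
        return feats                             # (N, feat_dim) blended features
```

A time-conditioned variant could, for instance, fold the frame index into the hashed coordinates or keep one such grid per time slice.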
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream [26.165424006344267]
Spike cameras offer distinct advantages over standard cameras.
Existing approaches reliant on spike cameras often assume optimal illumination.
We introduce SpikeNeRF, the first work that derives a NeRF-based volumetric scene representation from spike camera data.
arXiv Detail & Related papers (2024-03-17T13:51:25Z)
- SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams [44.02794438687478]
Spike cameras have proven effective at capturing motion features and beneficial for solving the ill-posed problem of motion deblurring.
Existing methods fall into the supervised learning paradigm, which suffers from notable performance degradation when applied to real-world scenarios.
We propose the first self-supervised framework for the task of spike-guided motion deblurring.
arXiv Detail & Related papers (2024-03-14T15:29:09Z)
- Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data.
Our approach learns neural modules by optimizing over the corrupted test sequence, leveraging spatio-temporal coherence and the internal statistics of videos.
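The internal-learning idea can be sketched as fitting a small network to the corrupted sequence alone, in the spirit of Deep Image Prior; the tiny network, noise input, and loss below are illustrative assumptions, not the paper's modules.

```python
import torch

def internal_learning_restore(corrupted, steps=2000, lr=1e-3):
    """Fit a small conv net to one corrupted video; no external training data.

    corrupted: tensor (T, C, H, W). The net maps fixed random noise codes to
    the frames; because natural-video statistics are fit before the
    corruption is, stopping early yields a restored sequence. Illustrative.
    """
    T, C, H, W = corrupted.shape
    net = torch.nn.Sequential(
        torch.nn.Conv2d(8, 32, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(32, 32, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(32, C, 3, padding=1),
    )
    z = torch.randn(T, 8, H, W)                  # fixed per-frame noise codes
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):                       # stop early in practice
        loss = torch.nn.functional.mse_loss(net(z), corrupted)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net(z).detach()
```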
arXiv Detail & Related papers (2023-12-13T01:57:11Z)
- Recurrent Spike-based Image Restoration under General Illumination [21.630646894529065]
Spike camera is a new type of bio-inspired vision sensor that records light intensity in the form of a spike array with high temporal resolution (20,000 Hz).
Existing spike-based approaches typically assume that scenes have sufficient light intensity, which is often not the case in real-world scenarios such as rainy days or dusk scenes.
We propose a Recurrent Spike-based Image Restoration (RSIR) network, which is the first work towards restoring clear images from spike arrays under general illumination.
arXiv Detail & Related papers (2023-08-06T04:24:28Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
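One way to picture the multi-threshold idea: instead of a single binary threshold, the neuron compares its membrane potential against several thresholds and emits a graded spike, preserving more information per timestep. A minimal sketch follows; the threshold set and the soft-reset rule are assumptions, not the paper's exact formulation.

```python
import torch

def multi_threshold_if(x_seq, thresholds=(1.0, 2.0, 4.0)):
    """Integrate-and-fire neuron with multiple firing thresholds.

    x_seq: input currents of shape (T, N). At each step the membrane
    potential integrates the input; the highest crossed threshold sets a
    graded spike value, which is subtracted from the potential (soft reset).
    Returns graded spike outputs of shape (T, N).
    """
    thr = torch.tensor(sorted(thresholds))
    v = torch.zeros(x_seq.shape[1])                  # membrane potential
    out = []
    for x in x_seq:
        v = v + x
        crossed = (v.unsqueeze(-1) >= thr).float()   # (N, K) threshold tests
        spike = (crossed * thr).max(dim=-1).values   # highest crossed, else 0
        v = v - spike                                # soft reset
        out.append(spike)
    return torch.stack(out)
```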
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- SCFlow: Optical Flow Estimation for Spiking Camera [50.770803466875364]
The spiking camera has enormous potential in real applications, especially for motion estimation in high-speed scenes.
Optical flow estimation has achieved remarkable success in image-based and event-based vision, but existing methods cannot be directly applied to the spike stream from a spiking camera.
This paper presents SCFlow, a novel deep learning pipeline for optical flow estimation for the spiking camera.
arXiv Detail & Related papers (2021-10-08T06:16:45Z)
- Cascaded Deep Video Deblurring Using Temporal Sharpness Prior [88.98348546566675]
The proposed algorithm consists of two steps: a deep CNN model first estimates optical flow from intermediate latent frames, and the latent frames are then restored based on the estimated flow.
We show that exploring the domain knowledge of video deblurring is able to make the deep CNN model more compact and efficient.
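A skeleton of such a cascade might look as follows (see the two-step description above); the flow and restoration modules are passed in as placeholders, and only the alternation structure and the backward warping are sketched.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Backward-warp img (B, C, H, W) by per-pixel flow (B, 2, H, W)."""
    B, C, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float() + flow.permute(0, 2, 3, 1)
    grid[..., 0] = 2 * grid[..., 0] / (W - 1) - 1    # normalize x to [-1, 1]
    grid[..., 1] = 2 * grid[..., 1] / (H - 1) - 1    # normalize y to [-1, 1]
    return F.grid_sample(img, grid, align_corners=True)

def cascaded_deblur(blurry, flow_net, restore_net, num_iters=3):
    """Alternate flow estimation and latent-frame restoration.

    blurry: list of frames, each (1, C, H, W). Each iteration estimates flow
    from the current latent frame to its neighbors, warps the neighbors into
    alignment, and restores a sharper latent frame. Placeholder modules;
    only the cascade structure is illustrated.
    """
    latents = [f.clone() for f in blurry]        # initialize with blurry input
    for _ in range(num_iters):
        new_latents = []
        for i in range(1, len(latents) - 1):
            flow_prev = flow_net(latents[i], latents[i - 1])
            flow_next = flow_net(latents[i], latents[i + 1])
            aligned = torch.cat([warp(latents[i - 1], flow_prev),
                                 warp(latents[i + 1], flow_next),
                                 blurry[i]], dim=1)
            new_latents.append(restore_net(aligned))
        latents[1:-1] = new_latents
    return latents
```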
arXiv Detail & Related papers (2020-04-06T09:13:49Z)