SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
- URL: http://arxiv.org/abs/2407.15708v2
- Date: Wed, 24 Jul 2024 16:55:08 GMT
- Title: SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
- Authors: Liangyan Jiang, Chuang Zhu, Yanxu Chen
- Abstract summary: We introduce Swin Spikeformer (SwinSF), a novel model for dynamic scene reconstruction from spike streams.
SwinSF combines shifted window self-attention with a proposed temporal spike attention, ensuring comprehensive feature extraction.
We build a new synthesized dataset for spike image reconstruction that matches the resolution of the latest spike camera.
- Score: 2.609896297570564
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The spike camera, with its high temporal resolution, low latency, and high dynamic range, addresses high-speed imaging challenges such as motion blur. It captures photons at each pixel independently, creating binary spike streams that are rich in temporal information but challenging to reconstruct images from. Current algorithms, both traditional and deep learning-based, still fall short in exploiting this rich temporal detail and in restoring fine detail in the reconstructed image. To overcome this, we introduce Swin Spikeformer (SwinSF), a novel model for dynamic scene reconstruction from spike streams. SwinSF is composed of a Spike Feature Extraction module, a Spatial-Temporal Feature Extraction module, and a Final Reconstruction module. It combines shifted window self-attention with a proposed temporal spike attention, ensuring comprehensive feature extraction that captures both spatial and temporal dynamics and leads to a more robust and accurate reconstruction from spike streams. Furthermore, we build a new synthesized dataset for spike image reconstruction that matches the resolution of the latest spike camera, ensuring its relevance and applicability to the latest developments in spike camera imaging. Experimental results demonstrate that the proposed network SwinSF sets a new benchmark, achieving state-of-the-art performance across a series of datasets, including both real-world and synthesized data at various resolutions. Our code and proposed dataset will be available soon.
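As a rough illustration of the temporal half of the attention scheme described above, the PyTorch sketch below applies per-pixel self-attention along the temporal axis of a spike feature tensor. It is a minimal sketch under assumed shapes and layer choices, not the authors' implementation (their code is noted as forthcoming); the spatial half would be handled by shifted-window attention as in the Swin Transformer.

```python
import torch
import torch.nn as nn

class TemporalSpikeAttention(nn.Module):
    """Hypothetical sketch: treat each pixel's T-step feature history as a
    token sequence and attend over time (one possible reading of
    "temporal spike attention")."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, C, H, W) features extracted from a binary spike stream
        B, T, C, H, W = x.shape
        tokens = x.permute(0, 3, 4, 1, 2).reshape(B * H * W, T, C)
        out, _ = self.attn(tokens, tokens, tokens)  # temporal self-attention
        tokens = self.norm(tokens + out)            # residual + LayerNorm
        return tokens.reshape(B, H, W, T, C).permute(0, 3, 4, 1, 2)

# Toy input: batch of 1, 8 time steps, 32 channels, 16x16 spatial grid.
spikes = torch.randint(0, 2, (1, 8, 32, 16, 16)).float()
print(TemporalSpikeAttention(dim=32)(spikes).shape)  # torch.Size([1, 8, 32, 16, 16])
```

In a full model, blocks like this would alternate or fuse with shifted-window spatial attention so that the extracted features capture both axes, matching the abstract's description of spatial-temporal feature extraction.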
Related papers
- Spike-NeRF: Neural Radiance Field Based On Spike Camera [24.829344089740303]
We propose Spike-NeRF, the first Neural Radiance Field derived from spike data.
Instead of the simultaneous multi-view images used by NeRF, the inputs of Spike-NeRF are continuous spike streams captured by a moving spike camera over a very short time.
Our results demonstrate that Spike-NeRF produces more visually appealing results than existing methods and our proposed baseline in high-speed scenes.
arXiv Detail & Related papers (2024-03-25T04:05:23Z)
- SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams [44.02794438687478]
Spike cameras have proven effective at capturing motion features and are beneficial for solving the ill-posed motion deblurring problem.
Existing methods fall into the supervised learning paradigm, which suffers from notable performance degradation when applied to real-world scenarios.
We propose the first self-supervised framework for the task of spike-guided motion deblurring.
arXiv Detail & Related papers (2024-03-14T15:29:09Z)
- Finding Visual Saliency in Continuous Spike Stream [23.591309376586835]
In this paper, we investigate visual saliency in the continuous spike stream for the first time.
We propose a Recurrent Spiking Transformer framework, which is based on a full spiking neural network.
Our framework delivers a substantial improvement in highlighting and capturing visual saliency in the spike stream.
arXiv Detail & Related papers (2024-03-10T15:15:35Z)
- Learning to Robustly Reconstruct Low-light Dynamic Scenes from Spike Streams [28.258022350623023]
As a neuromorphic sensor, the spike camera generates continuous binary spike streams that capture per-pixel light intensity.
We propose a bidirectional recurrent-based reconstruction framework, including a Light-Robust Representation (LR-Rep) and a fusion module.
We have developed a reconstruction benchmark for high-speed low-light scenes.
arXiv Detail & Related papers (2024-01-19T03:01:07Z)
- ReconFusion: 3D Reconstruction with Diffusion Priors [104.73604630145847]
We present ReconFusion to reconstruct real-world scenes using only a few photos.
Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets.
Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions.
arXiv Detail & Related papers (2023-12-05T18:59:58Z)
- Robust e-NeRF: NeRF from Sparse & Noisy Events under Non-Uniform Motion [67.15935067326662]
Event cameras offer low power, low latency, high temporal resolution and high dynamic range.
NeRF is seen as the leading candidate for efficient and effective scene representation.
We propose Robust e-NeRF, a novel method to directly and robustly reconstruct NeRFs from moving event cameras.
arXiv Detail & Related papers (2023-09-15T17:52:08Z)
- Recurrent Spike-based Image Restoration under General Illumination [21.630646894529065]
The spike camera is a new type of bio-inspired vision sensor that records light intensity in the form of a spike array with high temporal resolution (20,000 Hz); a sketch of how spike counts map back to intensity appears after this list.
Existing spike-based approaches typically assume that scenes have sufficient light, an assumption that often fails in real-world scenarios such as rainy days or dusk scenes.
We propose a Recurrent Spike-based Image Restoration (RSIR) network, which is the first work towards restoring clear images from spike arrays under general illumination.
arXiv Detail & Related papers (2023-08-06T04:24:28Z)
- Self-Supervised Scene Dynamic Recovery from Rolling Shutter Images and Events [63.984927609545856]
An Event-based Inter/intra-frame Compensator (E-IC) is proposed to predict the per-pixel dynamics between arbitrary time intervals.
We show that the proposed method achieves state-of-the-art results, with remarkable performance for event-based RS2GS inversion in real-world scenarios.
arXiv Detail & Related papers (2023-04-14T05:30:02Z)
- Recovering Continuous Scene Dynamics from A Single Blurry Image with Events [58.7185835546638]
An Implicit Video Function (IVF) is learned to represent a single motion blurred image with concurrent events.
A dual attention transformer is proposed to efficiently leverage merits from both modalities.
The proposed network is trained only with the supervision of ground-truth images at a limited set of reference timestamps.
arXiv Detail & Related papers (2023-04-05T18:44:17Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via neural feature rendering.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation [47.984368369734995]
We introduce a novel recurrent encoding-decoding neural network architecture for event-based optical flow estimation.
The network is end-to-end trained with self-supervised learning on the Multi-Vehicle Stereo Event Camera dataset.
It outperforms all existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-09-10T13:37:37Z)
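Several of the spike-camera papers above (RSIR, the low-light reconstruction work, and SwinSF itself) build on the same sensing principle: each pixel fires a binary spike once its accumulated photons cross a threshold, so the spike count over a short window is roughly proportional to scene brightness. The NumPy sketch below shows this classical window-averaging baseline; the function name, window size, and toy data are illustrative assumptions, not code from any of the papers.

```python
import numpy as np

def window_average_reconstruct(spikes: np.ndarray, window: int = 32) -> np.ndarray:
    """Average binary spikes over the first `window` frames.

    spikes: (T, H, W) array of 0/1 spike events. Because a pixel's firing
    rate grows with incident light, the per-pixel mean over a temporal
    window approximates normalized intensity in [0, 1].
    """
    return spikes[:window].astype(np.float32).mean(axis=0)

# Toy stream (random, not real camera data): 64 binary frames of 32x32.
rng = np.random.default_rng(0)
toy_spikes = (rng.random((64, 32, 32)) < 0.3).astype(np.uint8)
image = window_average_reconstruct(toy_spikes)
print(image.shape, float(image.min()), float(image.max()))  # (32, 32), values in [0, 1]
```

A longer window averages away noise but smears motion, which is precisely the trade-off between temporal detail and reconstruction fidelity that the learning-based reconstructors in this list aim to overcome.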