IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
- URL: http://arxiv.org/abs/2512.05240v1
- Date: Thu, 04 Dec 2025 20:37:45 GMT
- Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
- Authors: Dmitrii Torbunov, Onur Okuducu, Yi Huang, Odera Dim, Rebecca Coles, Yonggang Cui, Yihui Ren
- Abstract summary: Event cameras offer sparse, motion-driven sensing with low power consumption. We propose a hybrid capture paradigm that records sparse RGB keyframes alongside continuous event streams. We reconstruct full RGB video offline, reducing capture power consumption for downstream applications.
- Score: 4.452083769109418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuous video monitoring in surveillance, robotics, and wearable systems faces a fundamental power constraint: conventional RGB cameras consume substantial energy through fixed-rate capture. Event cameras offer sparse, motion-driven sensing with low power consumption, but produce asynchronous event streams rather than RGB video. We propose a hybrid capture paradigm that records sparse RGB keyframes alongside continuous event streams, then reconstructs full RGB video offline, reducing capture power consumption while maintaining standard video output for downstream applications. We introduce the Image and Event to Video (IE2Video) task: reconstructing RGB video sequences from a single initial frame and subsequent event camera data. We investigate two architectural strategies: adapting an autoregressive model (HyperE2VID) for RGB generation, and injecting event representations into a pretrained text-to-video diffusion model (LTX) via learned encoders and low-rank adaptation. Our experiments demonstrate that the diffusion-based approach achieves 33% better perceptual quality than the autoregressive baseline (0.283 vs 0.422 LPIPS). We validate our approach across three event camera datasets (BS-ERGB, HS-ERGB far/close) at varying sequence lengths (32-128 frames), demonstrating robust cross-dataset generalization with strong performance on unseen capture configurations.
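The abstract names two concrete mechanisms: encoding the event stream into a form a video model can consume, and low-rank adaptation (LoRA) of a frozen pretrained backbone. Below is a minimal PyTorch sketch of both, under stated assumptions: the voxel-grid encoding is the standard dense event representation in this literature (the abstract does not specify the encoder's input format), and `LoRALinear` is the generic LoRA pattern, not the authors' implementation.

```python
# Hedged sketch; names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

def events_to_voxel_grid(xs, ys, ts, ps, num_bins, height, width):
    """Accumulate events (x, y, timestamp, polarity in {-1, +1}) into a
    (num_bins, H, W) tensor with bilinear weighting along the time axis,
    a common dense event representation for learned reconstruction."""
    grid = torch.zeros(num_bins, height, width)
    span = float(ts.max() - ts.min())
    t = (ts - ts.min()) / max(span, 1e-9) * (num_bins - 1)
    t0 = t.floor().long()
    frac = t - t0.float()
    for b, w in ((t0, 1.0 - frac), ((t0 + 1).clamp(max=num_bins - 1), frac)):
        grid.index_put_((b, ys.long(), xs.long()), ps.float() * w, accumulate=True)
    return grid

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update
    (the generic LoRA pattern; rank and scaling are illustrative)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)       # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Toy usage: 1000 synthetic events binned into 5 temporal channels,
# and one frozen projection layer wrapped with a trainable adapter.
xs, ys = torch.randint(0, 64, (1000,)), torch.randint(0, 64, (1000,))
ts = torch.rand(1000).sort().values
ps = torch.randint(0, 2, (1000,)) * 2 - 1
voxels = events_to_voxel_grid(xs, ys, ts, ps, num_bins=5, height=64, width=64)
adapted = LoRALinear(nn.Linear(32, 32))
```

In a pipeline like the one described, adapters of this kind would wrap projection layers inside the frozen LTX backbone while only the event encoder and adapters train; perceptual scores such as the reported 0.283 vs 0.422 are conventionally computed with the off-the-shelf `lpips` package.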
Related papers
- UniE2F: A Unified Diffusion Framework for Event-to-Frame Reconstruction with Video Foundation Models [67.24086328473437]
Event cameras excel at recording relative intensity changes rather than absolute intensity. The resulting data streams suffer from a significant loss of spatial information and static texture details. We address this limitation by leveraging a pre-trained video diffusion model to reconstruct high-fidelity video frames from sparse event data.
arXiv Detail & Related papers (2026-02-22T14:06:49Z) - EvDiff: High Quality Video with an Event Camera [77.07279880903009]
Reconstructing intensity images from events is a highly ill-posed task due to the inherent ambiguity of absolute brightness. We propose EvDiff, an event-based diffusion model that follows a surrogate training framework to produce high-quality videos.
arXiv Detail & Related papers (2025-11-21T18:49:18Z) - DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting [59.91048302471001]
We present DiET-GS, a diffusion prior and event stream-assisted motion deblurring 3DGS. Our framework effectively leverages both blur-free event streams and a diffusion prior in a two-stage training strategy.
arXiv Detail & Related papers (2025-03-31T15:27:07Z) - Dynamic EventNeRF: Reconstructing General Dynamic Scenes from Multi-view RGB and Event Streams [69.65147723239153]
Volumetric reconstruction of dynamic scenes is an important problem in computer vision. It is especially challenging in poor lighting and with fast motion. We propose the first method to spatiotemporally reconstruct a scene from sparse multi-view event streams and sparse RGB frames.
arXiv Detail & Related papers (2024-12-09T18:56:18Z) - EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [87.1077910795879]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution. We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS. We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
arXiv Detail & Related papers (2024-10-20T13:44:24Z) - ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection [51.16181295385818]
We first collect an annotated RGB-D video salient object detection dataset (ViDSOD-100), which contains 100 videos with a total of 9,362 frames.
All the frames in each video are manually annotated with high-quality saliency annotations.
We propose a new baseline model, named attentive triple-fusion network (ATF-Net), for RGB-D salient object detection.
arXiv Detail & Related papers (2024-06-18T12:09:43Z) - Event-based Continuous Color Video Decompression from Single Frames [36.4263932473053]
We present ContinuityCam, a novel approach to generate a continuous video from a single static RGB image and an event camera stream. Our approach combines continuous long-range motion modeling with a neural synthesis model, enabling frame prediction at arbitrary times within the events.
arXiv Detail & Related papers (2023-11-30T18:59:23Z) - SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition [40.107228252231515]
We propose to recognize patterns by fusing RGB frames and event streams simultaneously. Due to the scarcity of RGB-Event based classification datasets, we also propose a large-scale PokerEvent dataset.
arXiv Detail & Related papers (2023-08-08T16:15:35Z) - HyperE2VID: Improving Event-Based Video Reconstruction via Hypernetworks [16.432164340779266]
We propose HyperE2VID, a dynamic neural network architecture for event-based video reconstruction.
Our approach uses hypernetworks to generate per-pixel adaptive filters guided by a context fusion module.
arXiv Detail & Related papers (2023-05-10T18:00:06Z)
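For the HyperE2VID entry above, here is a minimal sketch of the dynamic-filter idea its summary describes: a hypernetwork maps a context tensor to one k x k filter per pixel, which is then applied to a feature map. Module and argument names are illustrative assumptions; this shows the generic per-pixel adaptive filtering pattern, not the paper's exact architecture.

```python
# Hedged sketch of per-pixel adaptive filtering via a hypernetwork.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerPixelFilterHead(nn.Module):
    def __init__(self, ctx_channels: int, k: int = 3):
        super().__init__()
        self.k = k
        # Hypernetwork: a 1x1 conv mapping context features to one k*k
        # filter per spatial location.
        self.hyper = nn.Conv2d(ctx_channels, k * k, kernel_size=1)

    def forward(self, feat, ctx):
        b, c, h, w = feat.shape
        filters = self.hyper(ctx).softmax(dim=1)               # (B, k*k, H, W)
        patches = F.unfold(feat, self.k, padding=self.k // 2)  # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k * self.k, h * w)
        filters = filters.view(b, 1, self.k * self.k, h * w)
        out = (patches * filters).sum(dim=2)                   # apply per-pixel kernel
        return out.view(b, c, h, w)

# Toy shapes: a fused event/image context tensor guides the filtering.
feat = torch.randn(1, 16, 32, 32)
ctx = torch.randn(1, 8, 32, 32)
print(PerPixelFilterHead(ctx_channels=8)(feat, ctx).shape)  # torch.Size([1, 16, 32, 32])
```

The softmax over the k*k filter taps keeps each per-pixel kernel normalized, a common stabilizing choice in dynamic-filter networks.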
This list is automatically generated from the titles and abstracts of the papers on this site.