TimeRewind: Rewinding Time with Image-and-Events Video Diffusion
- URL: http://arxiv.org/abs/2403.13800v1
- Date: Wed, 20 Mar 2024 17:57:02 GMT
- Title: TimeRewind: Rewinding Time with Image-and-Events Video Diffusion
- Authors: Jingxi Chen, Brandon Y. Feng, Haoming Cai, Mingyang Xie, Christopher Metzler, Cornelia Fermuller, Yiannis Aloimonos
- Abstract summary: This paper addresses the novel challenge of ``rewinding'' time from a single captured image to recover the fleeting moments missed just before the shutter button is pressed.
We overcome this challenge by leveraging the emerging technology of neuromorphic event cameras, which capture motion information with high temporal resolution.
Our proposed framework introduces an event motion adaptor conditioned on event camera data, guiding the diffusion model to generate videos that are visually coherent and physically grounded in the captured events.
- Score: 10.687722181495065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the novel challenge of ``rewinding'' time from a single captured image to recover the fleeting moments missed just before the shutter button is pressed. This problem poses a significant challenge in computer vision and computational photography, as it requires predicting plausible pre-capture motion from a single static frame, an inherently ill-posed task due to the high degree of freedom in potential pixel movements. We overcome this challenge by leveraging the emerging technology of neuromorphic event cameras, which capture motion information with high temporal resolution, and integrating this data with advanced image-to-video diffusion models. Our proposed framework introduces an event motion adaptor conditioned on event camera data, guiding the diffusion model to generate videos that are visually coherent and physically grounded in the captured events. Through extensive experimentation, we demonstrate the capability of our approach to synthesize high-quality videos that effectively ``rewind'' time, showcasing the potential of combining event camera technology with generative models. Our work opens new avenues for research at the intersection of computer vision, computational photography, and generative modeling, offering a forward-thinking solution to capturing missed moments and enhancing future consumer cameras and smartphones. Please see the project page at https://timerewind.github.io/ for video results and code release.
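The abstract describes an event motion adaptor that conditions a pretrained image-to-video diffusion model on event camera data. The authors' code is announced on the project page rather than reproduced here, so the snippet below is only a minimal, hypothetical sketch of how such an adaptor could be wired up in PyTorch; the class name, the voxel-grid event input, the channel sizes, and the zero-initialised residual injection are assumptions, not the released implementation.

```python
# Hypothetical sketch of an event motion adaptor (names and shapes are assumptions,
# not the authors' released code): encode an event voxel grid into per-scale
# residual features that could be added to a frozen image-to-video diffusion U-Net.
import torch
import torch.nn as nn


class EventMotionAdaptor(nn.Module):
    """Encodes an event voxel grid into multi-scale residual features."""

    def __init__(self, event_bins: int = 5, channels=(64, 128, 256)):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = event_bins
        for out_ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                nn.SiLU(),
            ))
            in_ch = out_ch
        # Zero-initialised 1x1 projections so training starts from the unmodified
        # pretrained diffusion model (adapter-style trick; an assumption here).
        self.zero_convs = nn.ModuleList(
            nn.Conv2d(c, c, kernel_size=1) for c in channels
        )
        for conv in self.zero_convs:
            nn.init.zeros_(conv.weight)
            nn.init.zeros_(conv.bias)

    def forward(self, event_voxels: torch.Tensor):
        # event_voxels: (B, event_bins, H, W), events binned over the pre-capture window.
        residuals, x = [], event_voxels
        for stage, zero in zip(self.stages, self.zero_convs):
            x = stage(x)
            residuals.append(zero(x))  # residuals for the matching U-Net scales
        return residuals
```

In this sketch the returned residuals would be added to the corresponding encoder scales of a frozen image-to-video U-Net, keeping the backbone weights fixed while only the adaptor is trained; the actual TimeRewind architecture may differ.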
Related papers
- Investigating Event-Based Cameras for Video Frame Interpolation in Sports [59.755469098797406]
We present a first investigation of event-based Video Frame Interpolation (VFI) models for generating sports slow-motion videos.
In particular, we design and implement a bi-camera recording setup, comprising an RGB and an event-based camera, to capture sports videos and to temporally align and spatially register the two cameras.
Our experimental validation demonstrates that TimeLens, an off-the-shelf event-based VFI model, can effectively generate slow-motion footage for sports videos.
arXiv Detail & Related papers (2024-07-02T15:39:08Z) - Event-based Continuous Color Video Decompression from Single Frames [38.59798259847563]
We present ContinuityCam, a novel approach to generate a continuous video from a single static RGB image, using an event camera.
Our approach combines continuous long-range motion modeling with a feature-plane-based neural integration model, enabling frame prediction at arbitrary times within the event stream.
arXiv Detail & Related papers (2023-11-30T18:59:23Z) - EGVD: Event-Guided Video Deraining [57.59935209162314]
We propose an end-to-end learning-based network to unlock the potential of the event camera for video deraining.
We build a real-world dataset consisting of rainy videos and temporally synchronized event streams.
arXiv Detail & Related papers (2023-09-29T13:47:53Z) - Pedestrian detection with high-resolution event camera [0.0]
Event cameras (DVS) are a potentially interesting technology for addressing the above-mentioned problems.
In this paper, we compare two methods of processing event data by means of deep learning for the task of pedestrian detection.
We used a video-frame representation of the events processed with convolutional neural networks, as well as asynchronous sparse convolutional neural networks.
arXiv Detail & Related papers (2023-05-29T10:57:59Z) - TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation [78.99283105497489]
The event camera is a new device that enables video capture in the presence of arbitrarily complex motion.
This paper proposes the novel TimeReplayer algorithm, which uses events to interpolate videos captured by commodity cameras.
arXiv Detail & Related papers (2022-03-25T18:57:42Z) - Event-guided Deblurring of Unknown Exposure Time Videos [31.992673443516235]
Event cameras can capture apparent motion with a high temporal resolution.
We propose a novel Exposure Time-based Event Selection module to selectively use event features.
Our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-12-13T19:46:17Z) - MEFNet: Multi-scale Event Fusion Network for Motion Deblurring [62.60878284671317]
Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times.
As a kind of bio-inspired camera, the event camera records the intensity changes in an asynchronous way with high temporal resolution.
In this paper, we rethink the event-based image deblurring problem and unfold it into an end-to-end two-stage image restoration network.
arXiv Detail & Related papers (2021-11-30T23:18:35Z) - EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes.
Our approach has characteristics not previously demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z) - 4D Visualization of Dynamic Events from Unconstrained Multi-View Videos [77.48430951972928]
We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by hand-held multiple cameras.
Key to our approach is the use of self-supervised neural networks specific to the scene to compose static and dynamic aspects of an event.
This model allows us to create virtual cameras that facilitate: (1) freezing time and exploring views; (2) freezing a view and moving through time; and (3) changing both time and view simultaneously.
arXiv Detail & Related papers (2020-05-27T17:57:19Z) - Learning to Deblur and Generate High Frame Rate Video with an Event Camera [0.0]
Event cameras do not suffer from motion blur when recording high-speed scenes.
We formulate event-guided deblurring of traditional camera footage as a residual learning task (a minimal sketch follows this list).
We propose corresponding network architectures for effectively learning the deblurring and high-frame-rate video generation tasks.
arXiv Detail & Related papers (2020-03-02T13:02:05Z)
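The last entry above casts event-guided deblurring as residual learning: the network predicts a correction that is added back to the blurry frame. The sketch below is only a rough illustration of that framing; the class name, layer widths, and voxel-grid event input are assumptions, not the paper's released architecture.

```python
# Hypothetical residual-learning deblurring sketch: sharp ~= blurry + f(blurry, events).
import torch
import torch.nn as nn


class ResidualEventDeblurNet(nn.Module):
    """Predicts a residual from the blurry frame and an event voxel grid."""

    def __init__(self, event_bins: int = 5, width: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + event_bins, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, 3, kernel_size=3, padding=1),
        )

    def forward(self, blurry: torch.Tensor, event_voxels: torch.Tensor) -> torch.Tensor:
        # blurry: (B, 3, H, W); event_voxels: (B, event_bins, H, W), binned over the exposure window.
        residual = self.body(torch.cat([blurry, event_voxels], dim=1))
        return blurry + residual  # the network only learns the sharp-minus-blurry correction
```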
This list is automatically generated from the titles and abstracts of the papers on this site.