Combining Internal and External Constraints for Unrolling Shutter in Videos
- URL: http://arxiv.org/abs/2207.11725v1
- Date: Sun, 24 Jul 2022 12:01:27 GMT
- Title: Combining Internal and External Constraints for Unrolling Shutter in Videos
- Authors: Eyal Naor and Itai Antebi and Shai Bagon and Michal Irani
- Abstract summary: We propose a space-time solution to the RS problem.
We observe that an RS video and its corresponding GS video tend to share the exact same xt slices -- up to a known sub-frame temporal shift.
This allows us to constrain the GS output video using video-specific constraints imposed by the RS input video.
- Score: 10.900978946948095
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Videos obtained by rolling-shutter (RS) cameras result in spatially-distorted
frames. These distortions become significant under fast camera/scene motions.
Undoing effects of RS is sometimes addressed as a spatial problem, where
objects need to be rectified/displaced in order to generate their correct
global shutter (GS) frame. However, the cause of the RS effect is inherently
temporal, not spatial. In this paper we propose a space-time solution to the RS
problem. We observe that despite the severe differences between their xy
frames, an RS video and its corresponding GS video tend to share the exact
same xt slices -- up to a known sub-frame temporal shift. Moreover, they share
the same distribution of small 2D xt-patches, despite the strong temporal
aliasing within each video. This allows us to constrain the GS output video
using video-specific constraints imposed by the RS input video. Our algorithm
is composed of three main components: (i) Dense temporal upsampling between
consecutive RS frames using an off-the-shelf method (trained on regular video
sequences), from which we extract GS "proposals". (ii) Learning
to correctly merge an ensemble of such GS "proposals" using a dedicated
MergeNet. (iii) A video-specific zero-shot optimization which imposes the
similarity of xt-patches between the GS output video and the RS input video.
Our method obtains state-of-the-art results on benchmark datasets, both
numerically and visually, despite being trained on a small synthetic RS/GS
dataset. Moreover, it generalizes well to new complex RS videos with motion
types outside the distribution of the training set (e.g., complex non-rigid
motions) -- videos which competing methods trained on much more data cannot
handle well. We attribute these generalization capabilities to the combination
of external and internal constraints.
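To make the xt-slice observation concrete, below is a minimal NumPy sketch (not the authors' code) of the standard rolling-shutter imaging model it rests on. The per-row readout delay `s` (in frame units) and the function name `rs_from_gs` are assumptions of this illustration; the paper only states that the sub-frame temporal shift is known.

```python
import numpy as np

def rs_from_gs(gs: np.ndarray, s: float) -> np.ndarray:
    """Toy RS synthesis: row y of RS frame n samples the GS video at time n + s*y.

    gs: float array of shape (T, H, W), a densely sampled global-shutter video.
    s:  per-row readout delay in GS-frame units (assumed known).
    """
    T, H, W = gs.shape
    rs = np.empty_like(gs)
    for n in range(T):
        for y in range(H):
            t = min(n + s * y, T - 1)   # sub-frame capture time of this row
            t0 = int(np.floor(t))
            t1 = min(t0 + 1, T - 1)
            a = t - t0
            # Linear interpolation between the two nearest GS frames.
            rs[n, y] = (1.0 - a) * gs[t0, y] + a * gs[t1, y]
    return rs
```

Under this model, for any fixed row y the xt slice `rs[:, y, :]` equals the GS xt slice `gs[:, y, :]` resampled with a constant temporal shift of `s * y` -- precisely the "same xt slices up to a known sub-frame temporal shift" property the paper exploits.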
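The same relation suggests how step (i)'s GS "proposals" can be assembled. The sketch below is a simplified illustration under the toy assumption that the densely upsampled RS stream inherits the per-row delay `s`; `extract_gs_proposal` and its arguments are hypothetical names, and the paper's actual extraction from the off-the-shelf upsampler may differ.

```python
import numpy as np

def extract_gs_proposal(dense: np.ndarray, t: float, s: float) -> np.ndarray:
    """Assemble one GS frame for time t from a temporally upsampled RS stream.

    dense: (T, H, W) densely upsampled RS video (toy assumption: frame k,
           row y depicts scene time k + s*y).
    """
    T, H, W = dense.shape
    gs = np.empty((H, W), dtype=dense.dtype)
    for y in range(H):
        # We want row y to depict time t, so read it from index k = t - s*y.
        k = int(round(float(np.clip(t - s * y, 0, T - 1))))
        gs[y] = dense[k, y]
    return gs
```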
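Step (iii)'s internal constraint can likewise be illustrated with a small, runnable toy: score how well each 2D xt-patch of a candidate GS video is explained by the xt-patch pool of the RS input. The brute-force nearest-neighbor distance below is an assumption of this sketch; the paper formulates the constraint as a zero-shot optimization objective, not necessarily this exact distance.

```python
import numpy as np

def xt_patches(video: np.ndarray, y: int, p: int = 5) -> np.ndarray:
    """All p-by-p patches of the xt slice at row y; video has shape (T, H, W)."""
    xt = video[:, y, :]                                  # shape (T, W)
    T, W = xt.shape
    return np.stack([xt[i:i + p, j:j + p].ravel()
                     for i in range(T - p + 1)
                     for j in range(W - p + 1)])

def xt_patch_distance(gs: np.ndarray, rs: np.ndarray, y: int, p: int = 5) -> float:
    """Mean nearest-neighbor distance from GS xt-patches to the RS patch pool."""
    q, pool = xt_patches(gs, y, p), xt_patches(rs, y, p)
    # Brute-force all-pairs distances; fine for a toy example, not at scale.
    d2 = ((q[:, None, :] - pool[None, :, :]) ** 2).sum(axis=-1)
    return float(np.sqrt(d2.min(axis=1)).mean())
```

A low value indicates that the output's xt-patches stay within the patch distribution of the RS input, which is what the zero-shot optimization pushes toward.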
Related papers
- UniINR: Event-guided Unified Rolling Shutter Correction, Deblurring, and Interpolation [20.866360240444426]
Video frames captured by rolling shutter (RS) cameras during fast camera movement frequently exhibit RS distortion and blur simultaneously.
We propose UniINR, the first approach to recover arbitrary frame-rate sharp GS frames from an RS blur frame and paired events.
Our method features a lightweight model with only 0.38M parameters and is highly efficient, achieving 2.83 ms per frame when reconstructing at 31 times the frame rate of the RS blur input.
arXiv Detail & Related papers (2023-05-24T11:57:03Z)
- Self-Supervised Scene Dynamic Recovery from Rolling Shutter Images and Events [63.984927609545856]
An Event-based Inter/intra-frame Compensator (E-IC) is proposed to predict the per-pixel dynamics between arbitrary time intervals.
We show that the proposed method achieves state-of-the-art results and performs remarkably well for event-based RS2GS inversion in real-world scenarios.
arXiv Detail & Related papers (2023-04-14T05:30:02Z)
- Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction [54.00007868515432]
Existing methods face challenges in estimating an accurate correction field due to their uniform-velocity assumption.
We propose a geometry-based Quadratic Rolling Shutter (QRS) motion solver, which precisely estimates the high-order correction field of individual pixels.
Our method surpasses the state-of-the-art by +4.98, +0.77, and +4.33 dB PSNR on the Carla-RS, Fastec-RS, and BS-RSC datasets, respectively.
arXiv Detail & Related papers (2023-03-31T15:09:18Z)
- Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video [111.08121952640766]
This paper presents a novel deep-learning-based solution to the RS temporal super-resolution problem.
By leveraging the multi-view geometry relationship of the RS imaging process, our framework successfully achieves high framerate GS generation.
Our method can produce high-quality GS image sequences with rich details, outperforming the state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T16:47:12Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performance with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
- Context-Aware Video Reconstruction for Rolling Shutter Cameras [52.28710992548282]
In this paper, we propose a context-aware GS video reconstruction architecture.
We first estimate the bilateral motion field so that the pixels of the two RS frames are warped to a common GS frame.
Then, a refinement scheme is proposed to guide the GS frame synthesis along with bilateral occlusion masks to produce high-fidelity GS video frames.
arXiv Detail & Related papers (2022-05-25T17:05:47Z)
- Bringing Rolling Shutter Images Alive with Dual Reversed Distortion [75.78003680510193]
Rolling shutter (RS) distortion can be interpreted as the result of picking a row of pixels from instant global shutter (GS) frames over time.
We develop a novel end-to-end model, IFED, to generate dual optical flow sequence through iterative learning of the velocity field during the RS time.
arXiv Detail & Related papers (2022-03-12T14:57:49Z)
- Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning [11.658606722158517]
We train a video-specific CNN on examples extracted directly from the low-framerate input video.
Our method exploits the strong recurrence of small space-time patches inside a single video sequence.
The higher spatial resolution of video frames provides strong examples of how to increase the temporal resolution of that video.
arXiv Detail & Related papers (2020-03-19T15:53:01Z)