Revisiting Temporal Alignment for Video Restoration
- URL: http://arxiv.org/abs/2111.15288v2
- Date: Wed, 1 Dec 2021 05:11:47 GMT
- Title: Revisiting Temporal Alignment for Video Restoration
- Authors: Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, Jiangbo Lu
- Abstract summary: Long-range temporal alignment is critical yet challenging for video restoration tasks.
We present a novel, generic iterative alignment module which employs a gradual refinement scheme for sub-alignments.
Our model achieves state-of-the-art performance on multiple benchmarks across a range of video restoration tasks.
- Score: 39.05100686559188
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Long-range temporal alignment is critical yet challenging for video
restoration tasks. Recently, some works attempt to divide the long-range
alignment into several sub-alignments and handle them progressively. Although
this operation is helpful in modeling distant correspondences, error
accumulation is inevitable due to the propagation mechanism. In this work, we
present a novel, generic iterative alignment module which employs a gradual
refinement scheme for sub-alignments, yielding more accurate motion
compensation. To further enhance the alignment accuracy and temporal
consistency, we develop a non-parametric re-weighting method, where the
importance of each neighboring frame is adaptively evaluated in a spatial-wise
way for aggregation. By virtue of the proposed strategies, our model achieves
state-of-the-art performance on multiple benchmarks across a range of video
restoration tasks including video super-resolution, denoising and deblurring.
Our project is available at
\url{https://github.com/redrock303/Revisiting-Temporal-Alignment-for-Video-Restoration.git}.
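The abstract describes a non-parametric re-weighting step in which each neighboring frame's importance is evaluated spatially before aggregation. The paper's exact formulation is not given here, so the following is only a minimal sketch of one plausible reading: per-pixel softmax weights derived from feature similarity between the reference frame and already-aligned neighbors. The function name, the negative-L2 similarity, and the `temperature` parameter are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def spatial_reweight_aggregate(ref, neighbors, temperature=1.0):
    """Aggregate aligned neighbor features with per-pixel softmax weights.

    ref:       (C, H, W) reference-frame feature map.
    neighbors: (N, C, H, W) neighbor features already aligned to the reference.
    Returns a (C, H, W) aggregated feature map.

    Hypothetical sketch: similarity measure and temperature are assumptions.
    """
    # Per-pixel similarity: negative L2 distance to the reference,
    # computed over the channel dimension.
    diff = neighbors - ref[None]                 # (N, C, H, W)
    sim = -np.sqrt((diff ** 2).sum(axis=1))      # (N, H, W)

    # Softmax over the neighbor axis yields spatially adaptive weights:
    # neighbors that match the reference well at a pixel contribute more.
    w = np.exp(sim / temperature)
    w /= w.sum(axis=0, keepdims=True)            # (N, H, W)

    # Weighted sum of neighbor features, weights broadcast over channels.
    return (w[:, None] * neighbors).sum(axis=0)  # (C, H, W)
```

Because the weights are computed directly from feature similarity (no learned parameters), such a scheme would be "non-parametric" in the sense the abstract uses the term.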
Related papers
- Collaborative Feedback Discriminative Propagation for Video Super-Resolution [66.61201445650323]
Key success of video super-resolution (VSR) methods stems mainly from exploring spatial and temporal information.
Inaccurate alignment usually leads to aligned features with significant artifacts.
Existing propagation modules only propagate features of the same timestep forward or backward.
arXiv Detail & Related papers (2024-04-06T22:08:20Z)
- Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [55.69322525367221]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation.
To fully utilize the local and global temporal information within frames, we systematically model the short-term and long-term temporal discrepancies.
Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z)
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z) - Temporal Consistency Learning of inter-frames for Video Super-Resolution [38.26035126565062]
Video super-resolution (VSR) is a task that aims to reconstruct high-resolution (HR) frames from the low-resolution (LR) reference frame and multiple neighboring frames.
Existing methods generally explore information propagation and frame alignment to improve the performance of VSR.
We propose a Temporal Consistency learning Network (TCNet) for VSR in an end-to-end manner, to enhance the consistency of the reconstructed videos.
arXiv Detail & Related papers (2022-11-03T08:23:57Z)
- Distortion-Aware Network Pruning and Feature Reuse for Real-time Video Segmentation [49.17930380106643]
We propose a novel framework to speed up any architecture with skip-connections for real-time vision tasks.
Specifically, at the arrival of each frame, we transform the features from the previous frame to reuse them at specific spatial bins.
We then perform partial computation of the backbone network on the regions of the current frame that capture temporal differences between the current and previous frames.
arXiv Detail & Related papers (2022-06-20T07:20:02Z) - Recurrent Video Restoration Transformer with Guided Deformable Attention [116.1684355529431]
We propose RVRT, which processes local neighboring frames in parallel within a globally recurrent framework.
RVRT achieves state-of-the-art performance on benchmark datasets with balanced model size, testing memory and runtime.
arXiv Detail & Related papers (2022-06-05T10:36:09Z) - Coarse to Fine Multi-Resolution Temporal Convolutional Network [25.08516972520265]
We propose a novel temporal encoder-decoder to tackle the problem of sequence fragmentation.
The decoder follows a coarse-to-fine structure with an implicit ensemble of multiple temporal resolutions.
Experiments show that our stand-alone architecture, together with our novel feature-augmentation strategy and new loss, outperforms the state-of-the-art on three temporal video segmentation benchmarks.
arXiv Detail & Related papers (2021-05-23T06:07:40Z)
- Intrinsic Temporal Regularization for High-resolution Human Video Synthesis [59.54483950973432]
Temporal consistency is crucial for extending image processing pipelines to the video domain.
We propose an effective intrinsic temporal regularization scheme, where an intrinsic confidence map is estimated via the frame generator to regulate motion estimation.
We apply our intrinsic temporal regularization to a single-image generator, leading to a powerful "INTERnet" capable of generating $512\times512$-resolution human action videos.
arXiv Detail & Related papers (2020-12-11T05:29:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.