Joint Flow And Feature Refinement Using Attention For Video Restoration
- URL: http://arxiv.org/abs/2505.16434v1
- Date: Thu, 22 May 2025 09:18:51 GMT
- Title: Joint Flow And Feature Refinement Using Attention For Video Restoration
- Authors: Ranjith Merugu, Mohammad Sameer Suhail, Akshay P Sarashetti, Venkata Bharath Reddy Reddem, Pankaj Kumar Bajpai, Amit Satish Unde
- Abstract summary: We propose a novel video restoration framework named Joint Flow and Feature Refinement using Attention (JFFRA). Our method demonstrates a remarkable performance improvement of up to 1.62 dB compared to state-of-the-art approaches.
- Score: 0.3811713174618588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in video restoration have focused on recovering high-quality video frames from low-quality inputs. Compared with static images, the performance of video restoration depends significantly on efficient exploitation of temporal correlations among successive video frames. Numerous techniques make use of temporal information via flow-based strategies or recurrent architectures. However, these methods often struggle to preserve temporal consistency because they operate on degraded input video frames. To resolve this issue, we propose a novel video restoration framework named Joint Flow and Feature Refinement using Attention (JFFRA). The proposed JFFRA is based on the key philosophy of iteratively enhancing data through the synergistic collaboration of flow (alignment) and restoration. By leveraging previously enhanced features to refine flow and vice versa, JFFRA enables efficient feature enhancement using temporal information. This interplay between flow and restoration is executed at multiple scales, reducing the dependence on precise flow estimation. Moreover, we incorporate an occlusion-aware temporal loss function to enhance the network's capability to eliminate flickering artifacts. Comprehensive experiments validate the versatility of JFFRA across various restoration tasks such as denoising, deblurring, and super-resolution. Our method demonstrates a remarkable performance improvement of up to 1.62 dB over state-of-the-art approaches.
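The core loop the abstract describes — warp with the current flow, enhance features, then use the enhanced features to re-estimate the flow — can be pictured with a short PyTorch sketch. Everything below (module names, channel layout, the single-scale refinement round) is a hypothetical reading of the abstract, not the paper's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(feat, flow):
    """Backward-warp feat (N, C, H, W) with a dense flow field (N, 2, H, W)."""
    _, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys)).float().to(feat.device)       # (2, H, W), x first
    coords = base.unsqueeze(0) + flow                          # absolute positions
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0              # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(feat, torch.stack((gx, gy), dim=-1), align_corners=True)

class JointFlowFeatureRefine(nn.Module):
    """One flow <-> feature refinement round at a single scale (hypothetical)."""
    def __init__(self, ch):
        super().__init__()
        self.flow_head = nn.Conv2d(2 * ch + 2, 2, 3, padding=1)
        self.feat_head = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, feat_cur, feat_prev, flow):
        aligned = warp(feat_prev, flow)
        # Enhanced features refine the flow...
        flow = flow + self.flow_head(torch.cat((feat_cur, aligned, flow), dim=1))
        # ...and the refined flow re-aligns features for further enhancement.
        aligned = warp(feat_prev, flow)
        feat_cur = feat_cur + self.feat_head(torch.cat((feat_cur, aligned), dim=1))
        return feat_cur, flow

def occlusion_aware_temporal_loss(pred_cur, pred_prev, flow, visible):
    """Penalize flicker only where `visible` (e.g. from a forward-backward
    flow consistency check) marks pixels seen in both frames."""
    return (visible * (pred_cur - warp(pred_prev, flow)).abs()).mean()

# A few interplay rounds on toy tensors:
blk = JointFlowFeatureRefine(ch=8)
f_cur, f_prev = torch.randn(1, 8, 32, 32), torch.randn(1, 8, 32, 32)
flow = torch.zeros(1, 2, 32, 32)
for _ in range(3):
    f_cur, flow = blk(f_cur, f_prev, flow)
```

In JFFRA this interplay runs at multiple scales; a coarse-to-fine wrapper around a block like this is what reduces the dependence on precise flow estimation, and the occlusion-aware term keeps the temporal penalty from punishing genuinely occluded pixels.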
Related papers
- Temporal-Consistent Video Restoration with Pre-trained Diffusion Models [51.47188802535954]
Video restoration (VR) aims to recover high-quality videos from degraded ones. Recent zero-shot VR methods using pre-trained diffusion models (DMs) suffer from approximation errors during reverse diffusion and insufficient temporal consistency. We present a novel Maximum a Posteriori (MAP) framework that directly parameterizes video frames in the seed space of DMs, eliminating approximation errors.
arXiv Detail & Related papers (2025-03-19T03:41:56Z)
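As a rough illustration of what the MAP entry above means by "parameterizing video frames in the seed space," the sketch below optimizes seed latents through a frozen generator against a known degradation operator. `decode`, `degrade`, the loss weights, and the plain frame-difference consistency term are all stand-ins, not the paper's actual MAP objective:

```python
import torch

def map_restore(decode, degrade, observed, num_frames, latent_shape,
                steps=100, lr=0.05, tc_weight=0.1):
    """Seed-space MAP estimate: optimize the latents, never the pixels."""
    seeds = torch.randn(num_frames, *latent_shape, requires_grad=True)
    opt = torch.optim.Adam([seeds], lr=lr)
    for _ in range(steps):
        frames = decode(seeds)                               # candidate restorations
        fidelity = ((degrade(frames) - observed) ** 2).mean()
        prior = (seeds ** 2).mean()                          # Gaussian prior on seeds
        temporal = ((frames[1:] - frames[:-1]) ** 2).mean()  # crude consistency term
        loss = fidelity + 1e-3 * prior + tc_weight * temporal
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decode(seeds).detach()

# Toy run with stand-ins (a real use would decode through the frozen DM):
decode = lambda z: torch.sigmoid(z)          # pretend generator
degrade = lambda x: x[..., ::2, ::2]         # pretend 2x downsampling
observed = torch.rand(4, 3, 16, 16)
restored = map_restore(decode, degrade, observed, num_frames=4, latent_shape=(3, 32, 32))
```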
- Rethinking Video Tokenization: A Conditioned Diffusion-based Approach [58.164354605550194]
The new Conditioned Diffusion-based Tokenizer (CDT) replaces the GAN-based decoder with a conditional diffusion model. It is trained from scratch using only a basic MSE diffusion loss for reconstruction, along with a KL term and an LPIPS perceptual loss. Even a scaled-down version of CDT (3× inference speedup) still performs comparably with top baselines.
arXiv Detail & Related papers (2025-03-05T17:59:19Z)
- SVFR: A Unified Framework for Generalized Video Face Restoration [86.17060212058452]
Face Restoration (FR) is a crucial area within image and video processing, focusing on reconstructing high-quality portraits from degraded inputs. We propose a novel approach for the Generalized Video Face Restoration task, which integrates video BFR, inpainting, and colorization tasks. This work advances the state-of-the-art in video FR and establishes a new paradigm for generalized video face restoration.
arXiv Detail & Related papers (2025-01-02T12:51:20Z)
- Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion [15.188335671278024]
We propose a new video inpainting framework using optical Flow-guided Efficient Diffusion (FloED) for higher video coherence. FloED employs a dual-branch architecture, where the time-agnostic flow branch restores corrupted flow first, and the multi-scale flow adapters provide motion guidance to the main inpainting branch. Experiments on background restoration and object removal tasks show that FloED outperforms state-of-the-art diffusion-based methods in both quality and efficiency.
arXiv Detail & Related papers (2024-12-01T15:45:26Z)
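A minimal rendering of the flow-first idea behind FloED above: complete the corrupted flow inside the hole so that warping-based guidance can feed the inpainting branch. The tiny network and mask convention below are illustrative stand-ins, not FloED's actual flow branch:

```python
import torch
import torch.nn as nn

class FlowCompletion(nn.Module):
    """Stand-in for a time-agnostic flow branch: fill in flow inside the hole
    so an (omitted) inpainting branch can warp valid content along it."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),
        )

    def forward(self, flow, mask):
        # flow: (N, 2, H, W), unreliable inside the hole; mask: 1 in the hole.
        pred = self.net(torch.cat((flow * (1 - mask), mask), dim=1))
        return flow * (1 - mask) + pred * mask   # keep known flow, fill the rest

corrupted = torch.randn(1, 2, 64, 64)
hole = torch.zeros(1, 1, 64, 64)
hole[..., 20:40, 20:40] = 1.0
completed = FlowCompletion()(corrupted, hole)    # would drive multi-scale guidance
```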
- Collaborative Feedback Discriminative Propagation for Video Super-Resolution [66.61201445650323]
The success of video super-resolution (VSR) methods stems mainly from exploring spatial and temporal information.
Inaccurate alignment usually leads to aligned features with significant artifacts.
Existing propagation modules only propagate features of the same timestep forward or backward.
arXiv Detail & Related papers (2024-04-06T22:08:20Z)
- FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring [28.626224230377108]
We present a joint learning scheme of video super-resolution and deblurring, called VSRDB, to restore clean high-resolution (HR) videos from blurry low-resolution (LR) ones.
We propose a novel flow-guided dynamic filtering (FGDF) and iterative feature refinement with multi-attention (FRMA) framework.
arXiv Detail & Related papers (2024-01-08T07:34:43Z)
- Toward Accurate and Temporally Consistent Video Restoration from Raw Data [20.430231283171327]
We present a new VJDD framework built on consistent and accurate latent space propagation.
The proposed losses can circumvent the error accumulation problem caused by inaccurate flow estimation.
Experiments demonstrate leading VJDD performance in terms of restoration accuracy, perceptual quality, and temporal consistency.
arXiv Detail & Related papers (2023-12-25T12:38:03Z)
- FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models [16.940023904740585]
We introduce an advanced acceleration technique that leverages the temporal redundancy inherent in diffusion models.
Reusing feature maps with high temporal similarity opens up a new opportunity to save computation resources without compromising output quality.
arXiv Detail & Related papers (2023-12-06T14:24:26Z)
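The feature-reuse idea in the FRDiff entry above can be sketched as a cache around an expensive block: when consecutive denoising steps feed it nearly identical inputs, return the cached output instead of recomputing. The cosine-similarity test and threshold here are invented for illustration:

```python
import torch
import torch.nn as nn

class FeatureReuseWrapper(nn.Module):
    """Wrap a costly diffusion block and skip it when its input barely
    changed since the previous step (illustrative, not FRDiff's exact rule)."""
    def __init__(self, block, threshold=0.98):
        super().__init__()
        self.block = block
        self.threshold = threshold
        self._last_in = None
        self._last_out = None

    @torch.no_grad()
    def forward(self, x):
        if self._last_in is not None:
            sim = torch.cosine_similarity(
                x.flatten(1), self._last_in.flatten(1), dim=1).mean()
            if sim > self.threshold:      # temporally redundant: reuse cache
                return self._last_out
        out = self.block(x)
        self._last_in, self._last_out = x.detach(), out.detach()
        return out

wrapped = FeatureReuseWrapper(nn.Conv2d(4, 4, 3, padding=1))
x = torch.randn(1, 4, 32, 32)
y1 = wrapped(x)                                   # computed
y2 = wrapped(x + 1e-4 * torch.randn_like(x))      # nearly identical: reused
```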
- Exploring Long- and Short-Range Temporal Information for Learned Video Compression [54.91301930491466]
We focus on exploiting the unique characteristics of video content and exploring temporal information to enhance compression performance.
For long-range temporal information exploitation, we propose a temporal prior that is updated continuously within the group of pictures (GOP) during inference.
The temporal prior thus contains valuable temporal information from all frames decoded so far within the current GOP.
For short-range temporal information, we design a hierarchical structure to achieve multi-scale compensation.
arXiv Detail & Related papers (2022-08-07T15:57:18Z)
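One way to picture the continuously updated temporal prior from the compression entry above: reset it at each GOP boundary, condition every frame's decoding on it, and fold each newly decoded frame back in. The recurrent update rule and the stand-in decoder below are guesses at the general shape, not the paper's design:

```python
import torch
import torch.nn as nn

class TemporalPrior(nn.Module):
    """Running prior over the current GOP, refreshed after each decoded frame."""
    def __init__(self, ch=32):
        super().__init__()
        self.update = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, prior, decoded_feats):
        return torch.tanh(self.update(torch.cat((prior, decoded_feats), dim=1)))

def decode_gop(bitstreams, decoder, feat_extractor, prior_module, ch=32, hw=64):
    """Decode one GOP; every decoded frame enriches the prior for the next."""
    prior = torch.zeros(1, ch, hw, hw)            # reset at the GOP boundary
    frames = []
    for bits in bitstreams:
        frame = decoder(bits, prior)              # prior conditions the decoding
        frames.append(frame)
        prior = prior_module(prior, feat_extractor(frame))
    return frames

# Toy run with placeholder decoder / feature extractor:
feats = nn.Conv2d(3, 32, 3, padding=1)
dec = lambda bits, prior: torch.zeros(1, 3, 64, 64)   # pretend entropy decoding
out = decode_gop([b"frame0", b"frame1"], dec, feats, TemporalPrior(32))
```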
- Recurrent Video Restoration Transformer with Guided Deformable Attention [116.1684355529431]
We propose RVRT, which processes local neighboring frames in parallel within a globally recurrent framework.
RVRT achieves state-of-the-art performance on benchmark datasets with a balanced trade-off among model size, testing memory, and runtime.
arXiv Detail & Related papers (2022-06-05T10:36:09Z)
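RVRT's "parallel within a clip, recurrent across clips" design, described above, can be summarized in a toy module: split the video into short clips, process all frames of a clip jointly, and pass a hidden state between clips. A plain 3D convolution stands in for RVRT's transformer blocks and guided deformable attention:

```python
import torch
import torch.nn as nn

class ClipRecurrentRestorer(nn.Module):
    """Toy clip-level recurrence: frames in a clip are processed jointly,
    while a hidden state carries information from clip to clip."""
    def __init__(self, ch=16, clip_len=4):
        super().__init__()
        self.ch, self.clip_len = ch, clip_len
        self.embed = nn.Conv3d(3, ch, 3, padding=1)
        self.fuse = nn.Conv3d(2 * ch, ch, 3, padding=1)
        self.out = nn.Conv3d(ch, 3, 3, padding=1)

    def forward(self, video):                            # (N, 3, T, H, W)
        n, _, t, h, w = video.shape
        state = torch.zeros(n, self.ch, 1, h, w, device=video.device)
        restored = []
        for s in range(0, t, self.clip_len):
            clip = video[:, :, s:s + self.clip_len]      # parallel within the clip
            feats = self.embed(clip)
            feats = self.fuse(torch.cat(
                (feats, state.expand(-1, -1, feats.shape[2], -1, -1)), dim=1))
            state = feats.mean(dim=2, keepdim=True)      # recurrent across clips
            restored.append(clip + self.out(feats))      # residual restoration
        return torch.cat(restored, dim=2)

y = ClipRecurrentRestorer()(torch.randn(1, 3, 8, 32, 32))
```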
- Boosting the Performance of Video Compression Artifact Reduction with Reference Frame Proposals and Frequency Domain Information [31.053879834073502]
We propose an effective reference frame proposal strategy to boost the performance of the existing multi-frame approaches.
Experimental results show that our method achieves better fidelity and perceptual performance on the MFQE 2.0 dataset than state-of-the-art methods.
arXiv Detail & Related papers (2021-05-31T13:46:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.