SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
- URL: http://arxiv.org/abs/2506.05301v1
- Date: Thu, 05 Jun 2025 17:51:05 GMT
- Title: SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
- Authors: Jianyi Wang, Shanchuan Lin, Zhijie Lin, Yuxi Ren, Meng Wei, Zongsheng Yue, Shangchen Zhou, Hao Chen, Yang Zhao, Ceyuan Yang, Xuefeng Xiao, Chen Change Loy, Lu Jiang
- Abstract summary: We propose a one-step diffusion-based VR model, termed SeedVR2, which performs adversarial VR training against real data. To handle the challenging high-resolution VR within a single step, we introduce several enhancements to both model architecture and training procedures.
- Score: 82.68200031146299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in diffusion-based video restoration (VR) demonstrate significant improvements in visual quality, yet incur a prohibitive computational cost during inference. While several distillation-based approaches have shown the potential of one-step image restoration, extending them to VR remains challenging and underexplored, particularly for high-resolution video in real-world settings. In this work, we propose a one-step diffusion-based VR model, termed SeedVR2, which performs adversarial VR training against real data. To handle the challenging high-resolution VR within a single step, we introduce several enhancements to both the model architecture and training procedures. Specifically, we propose an adaptive window attention mechanism in which the window size is dynamically adjusted to fit the output resolution, avoiding the window inconsistency observed when window attention with a predefined window size is applied to high-resolution VR. To stabilize and improve the adversarial post-training for VR, we further verify the effectiveness of a series of losses, including a proposed feature matching loss, without significantly sacrificing training efficiency. Extensive experiments show that SeedVR2 achieves comparable or even better performance than existing VR approaches in a single step.
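The adaptive window attention idea from the abstract can be illustrated with a minimal sketch: instead of partitioning features with a predefined window size, the window size is derived from the spatial resolution so the window grid always tiles the feature map. The adaptation rule, function name, and parameters below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


def adaptive_window_partition(x, base_window=8, max_windows=8):
    """Partition a feature map into attention windows whose size follows the resolution.

    x: tensor of shape (B, H, W, C). The rule below (cap the number of windows
    per axis and pad so the grid tiles the map exactly) is an illustrative
    assumption, not SeedVR2's exact adaptation rule.
    """
    B, H, W, C = x.shape
    # Derive per-axis window sizes from the resolution instead of fixing them.
    win_h = max(base_window, -(-H // max_windows))  # ceil(H / max_windows)
    win_w = max(base_window, -(-W // max_windows))  # ceil(W / max_windows)
    # Pad so height/width are exact multiples of the chosen window size.
    pad_h = (-H) % win_h
    pad_w = (-W) % win_w
    x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))
    Hp, Wp = H + pad_h, W + pad_w
    # Reshape to (num_windows * B, win_h * win_w, C) for windowed self-attention.
    x = x.view(B, Hp // win_h, win_h, Wp // win_w, win_w, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, win_h * win_w, C)
    return windows, (win_h, win_w), (Hp, Wp)
```

The returned window shape and padded resolution can be reused to reverse the partition after attention, so the same routine works for any output resolution.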
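Similarly, the proposed feature matching loss can be sketched as an L1 penalty between intermediate discriminator activations of restored and real videos, a common stabilizer for adversarial training; the layer selection and weighting here are assumptions rather than SeedVR2's exact formulation.

```python
import torch.nn.functional as F


def feature_matching_loss(fake_feats, real_feats):
    """L1 feature matching over intermediate discriminator activations.

    fake_feats / real_feats: lists of tensors taken from the same discriminator
    layers for restored and real videos. Which layers are matched and how they
    are weighted is an assumption here, not SeedVR2's exact recipe.
    """
    loss = 0.0
    for f_fake, f_real in zip(fake_feats, real_feats):
        # Real-video features act as fixed targets for the generator update.
        loss = loss + F.l1_loss(f_fake, f_real.detach())
    return loss / max(len(fake_feats), 1)
```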
Related papers
- VRSplat: Fast and Robust Gaussian Splatting for Virtual Reality [47.738522999465864]
We introduce VRSplat, which combines and extends several recent advancements in 3DGS to address the challenges of VR holistically. VRSplat is the first systematically evaluated 3DGS approach capable of supporting modern VR applications, achieving 72+ FPS while eliminating popping and stereo-disrupting floaters.
arXiv Detail & Related papers (2025-05-15T10:17:48Z) - DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer [56.98400572837792]
DiVE produces high-fidelity, temporally coherent, and cross-view consistent multi-view videos. These innovations collectively achieve a 2.62x speedup with minimal quality degradation.
arXiv Detail & Related papers (2025-04-28T09:20:50Z) - Temporal-Consistent Video Restoration with Pre-trained Diffusion Models [51.47188802535954]
Video restoration (VR) aims to recover high-quality videos from degraded ones. Recent zero-shot VR methods using pre-trained diffusion models (DMs) suffer from approximation errors during reverse diffusion and insufficient temporal consistency. We present a novel Maximum a Posteriori (MAP) framework that directly parameterizes video frames in the seed space of DMs, eliminating approximation errors.
arXiv Detail & Related papers (2025-03-19T03:41:56Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full-reference and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration [73.70209718408641]
SeedVR is a diffusion transformer designed to handle real-world video restoration with arbitrary length and resolution. It achieves highly competitive performance on both synthetic and real-world benchmarks, as well as on AI-generated videos.
arXiv Detail & Related papers (2025-01-02T16:19:48Z) - Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration [85.3323211054274]
How to properly model the inter-frame relation within a video sequence is an important but unsolved challenge for video restoration (VR).
In this work, we propose an unsupervised flow-aligned sequence-to-sequence model (S2SVR) to address this problem.
S2SVR shows superior performance in multiple VR tasks, including video deblurring, video super-resolution, and compressed video quality enhancement.
arXiv Detail & Related papers (2022-05-20T14:14:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.