Fugu-MT paper translation (abstract): CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos

Paper overview: CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos

arxiv url: http://arxiv.org/abs/2512.12060v1
Date: Fri, 12 Dec 2025 22:03:14 GMT
Status: translation complete
Last updated in system: 2025-12-16 17:54:56.08788
Title: CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos
Title (reference translation): CreativeVR: A Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos
Authors: Tejas Panambur, Ishan Rajendrakumar Dave, Chongjian Ge, Ersin Yumer, Xue Bai,
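Abstract summary: CreativeVR is a diffusion-prior-guided video restoration framework for AI-generated (AIGC) and real videos with severe structural and temporal artifacts. Its deep-adapter-based method exposes a single precision knob that controls how strongly the model follows the input. CreativeVR achieves state-of-the-art results on videos with severe artifacts and performs competitively on standard video restoration benchmarks.
Reference score (independently computed attention score): 17.81372151946937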
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern text-to-video (T2V) diffusion models can synthesize visually compelling clips, yet they remain brittle at fine-scale structure: even state-of-the-art generators often produce distorted faces and hands, warped backgrounds, and temporally inconsistent motion. Such severe structural artifacts also appear in very low-quality real-world videos. Classical video restoration and super-resolution (VR/VSR) methods, in contrast, are tuned for synthetic degradations such as blur and downsampling and tend to stabilize these artifacts rather than repair them, while diffusion-prior restorers are usually trained on photometric noise and offer little control over the trade-off between perceptual quality and fidelity. We introduce CreativeVR, a diffusion-prior-guided video restoration framework for AI-generated (AIGC) and real videos with severe structural and temporal artifacts. Our deep-adapter-based method exposes a single precision knob that controls how strongly the model follows the input, smoothly trading off between precise restoration on standard degradations and stronger structure- and motion-corrective behavior on challenging content. Our key novelty is a temporally coherent degradation module used during training, which applies carefully designed transformations that produce realistic structural failures. To evaluate AIGC-artifact restoration, we propose the AIGC54 benchmark with FIQA, semantic and perceptual metrics, and multi-aspect scoring. CreativeVR achieves state-of-the-art results on videos with severe artifacts and performs competitively on standard video restoration benchmarks, while running at practical throughput (about 13 FPS at 720p on a single 80-GB A100). Project page: https://daveishan.github.io/creativevr-webpage/.
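Abstract (reference translation): Modern text-to-video (T2V) diffusion models can synthesize visually compelling clips, but they remain brittle at fine-scale structure: even state-of-the-art generators often produce distorted faces and hands, warped backgrounds, and temporally inconsistent motion. Such severe structural artifacts also appear in very low-quality real-world videos. Classical video restoration and super-resolution (VR/VSR) methods, in contrast, are tuned for synthetic degradations such as blur and downsampling and tend to stabilize these artifacts rather than repair them. We introduce CreativeVR, a diffusion-prior-guided video restoration framework for AI-generated (AIGC) and real videos with severe structural and temporal artifacts. Our deep-adapter-based method exposes a single precision knob that controls how strongly the model follows the input. Our key novelty is a temporally coherent degradation module used during training. To evaluate AIGC-artifact restoration, we propose the AIGC54 benchmark with FIQA, semantic and perceptual metrics, and multi-aspect scoring. CreativeVR achieves state-of-the-art results on videos with severe artifacts and performs competitively on standard video restoration benchmarks, while running at practical throughput (about 13 FPS at 720p on a single 80-GB A100). Project page: https://daveishan.github.io/creativevr-webpage/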


Related papers