Fugu-MT Paper Translation (Abstract): Controllable Generative Video Compression

Paper summary: Controllable Generative Video Compression

arxiv url: http://arxiv.org/abs/2604.06655v1
Date: Wed, 08 Apr 2026 04:11:27 GMT
Status: Translation complete
System update date: 2026-04-09 17:30:51.330777
Title: Controllable Generative Video Compression
Title (reference translation): 制御可能な生成ビデオ圧縮 (Controllable Generative Video Compression)
Authors: Ding Ding, Daowen Li, Ying Chen, Yixin Gao, Ruixiao Dong, Kai Li, Li Li,
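Abstract summary: We propose the CGVC paradigm to faithfully generate details guided by multiple visual conditions. CGVC outperforms previous perceptual video compression methods in terms of both signal fidelity and perceptual quality.
Reference score (independently calculated attention score): 11.376749911846302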
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Perceptual video compression adopts generative video modeling to improve perceptual realism but frequently sacrifices signal fidelity, diverging from the goal of video compression, which is to faithfully reproduce the visual signal. To alleviate the dilemma between perception and fidelity, in this paper we propose the Controllable Generative Video Compression (CGVC) paradigm to faithfully generate details guided by multiple visual conditions. Under this paradigm, representative keyframes of the scene are coded and used to provide structural priors for non-keyframe generation. A dense per-frame control prior is additionally coded to better preserve the finer structure and semantics of each non-keyframe. Guided by these priors, non-keyframes are reconstructed by a controllable video generation model with temporal and content consistency. Furthermore, to accurately recover the color information of the video, we develop a color-distance-guided keyframe selection algorithm to adaptively choose keyframes. Experimental results show that CGVC outperforms previous perceptual video compression methods in terms of both signal fidelity and perceptual quality.
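Abstract (reference translation): Perceptual video compression adopts generative video modeling to improve perceptual realism, but it often sacrifices signal fidelity and so departs from the goal of video compression, which is to reproduce the visual signal faithfully. In this paper, to ease the dilemma between perception and fidelity, we propose the CGVC paradigm, which faithfully generates details guided by multiple visual conditions. Under this paradigm, representative keyframes of the scene are coded and used to provide structural priors for non-keyframe generation. A dense per-frame control prior is additionally coded to better preserve the finer structure and semantics of each non-keyframe. Guided by these priors, the non-keyframes are reconstructed by a controllable video generation model with temporal and content consistency. In addition, to recover the color information of the video accurately, we developed a color-distance-guided keyframe selection algorithm that adaptively selects keyframes. Experiments show that CGVC outperforms previous perceptual video compression methods in both signal fidelity and perceptual quality.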

Related papers