Fugu-MT Paper Translation (Summary): BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models

Paper summary: BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models

arxiv url: http://arxiv.org/abs/2512.12080v1
Date: Fri, 12 Dec 2025 23:02:02 GMT
Status: Translation complete
System update date: 2025-12-16 17:54:56.102469
Title: BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models
Authors: Ryan Po, Eric Ryan Chan, Changan Chen, Gordon Wetzstein
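Abstract summary: We introduce Backwards Aggregation (BAgger), a self-supervised scheme that constructs corrective trajectories from the model's own rollouts. Unlike prior approaches that rely on few-step distillation and distribution-matching losses, BAgger trains with standard score or flow matching objectives. We instantiate BAgger on causal diffusion transformers and evaluate on text-to-video, video extension, and multi-prompt generation.
Reference score (attention score computed independently by this site): 50.986189632485285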
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Autoregressive video models are promising for world modeling via next-frame prediction, but they suffer from exposure bias: a mismatch between training on clean contexts and inference on self-generated frames, causing errors to compound and quality to drift over time. We introduce Backwards Aggregation (BAgger), a self-supervised scheme that constructs corrective trajectories from the model's own rollouts, teaching it to recover from its mistakes. Unlike prior approaches that rely on few-step distillation and distribution-matching losses, which can hurt quality and diversity, BAgger trains with standard score or flow matching objectives, avoiding large teachers and long-chain backpropagation through time. We instantiate BAgger on causal diffusion transformers and evaluate on text-to-video, video extension, and multi-prompt generation, observing more stable long-horizon motion and better visual consistency with reduced drift.
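
The exposure bias described in the abstract can be illustrated with a toy example. The sketch below is not BAgger and not a video model; it is a minimal scalar sequence with hypothetical dynamics (`true_step`) and a slightly wrong hypothetical predictor (`model_step`), chosen only to show how one-step error measured on clean contexts can stay small while error on self-generated contexts compounds.

```python
def true_step(x):
    # Hypothetical ground-truth dynamics for the toy sequence.
    return 0.9 * x + 1.0


def model_step(x):
    # Hypothetical learned predictor with a small coefficient error.
    return 0.95 * x + 1.0


def teacher_forced_errors(x0, steps):
    """One-step error when every prediction is conditioned on a clean context."""
    errors = []
    x = x0
    for _ in range(steps):
        target = true_step(x)
        errors.append(abs(model_step(x) - target))
        x = target  # the next context is the ground-truth value
    return errors


def rollout_errors(x0, steps):
    """Error when the model is conditioned on its own previous outputs."""
    errors = []
    x_true = x0
    x_model = x0
    for _ in range(steps):
        x_true = true_step(x_true)
        x_model = model_step(x_model)  # the next context is self-generated
        errors.append(abs(x_model - x_true))
    return errors


if __name__ == "__main__":
    tf = teacher_forced_errors(1.0, 50)
    ro = rollout_errors(1.0, 50)
    print("teacher-forced error at last step:", tf[-1])
    print("rollout error at last step:", ro[-1])
```

In this toy setting the two error curves start at the same value, but the rollout error keeps growing because each prediction inherits the deviation of the previous one. Per the abstract, BAgger targets this gap by training on corrective trajectories built from the model's own rollouts.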
