Fugu-MT Paper Translation (Abstract): Training-free Latent Inter-Frame Pruning with Attention Recovery

Paper overview: Training-free Latent Inter-Frame Pruning with Attention Recovery

arxiv url: http://arxiv.org/abs/2603.05811v1
Date: Fri, 06 Mar 2026 01:49:47 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:44.884033
Title: Training-free Latent Inter-Frame Pruning with Attention Recovery
Title (reference translation): Training-free latent inter-frame pruning with attention recovery
Authors: Dennis Menn, Yuedong Yang, Bokun Wang, Xiwen Wei, Mustafa Munir, Feng Liang, Radu Marculescu, Chenfeng Xu, Diana Marculescu
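Abstract summary: Current video generation models suffer from high computational latency, which makes real-time applications prohibitively costly. This paper proposes the Latent Inter-frame Pruning with Attention Recovery (LIPAR) framework, which detects duplicated latent patches and skips recomputing them. The method increases video editing throughput by $1.45\times$, achieving 12.2 FPS on average on an NVIDIA A6000.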
Reference score (independently computed interest level): 50.889009147480856
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Current video generation models suffer from high computational latency, making real-time applications prohibitively costly. In this paper, we address this limitation by exploiting the temporal redundancy inherent in video latent patches. To this end, we propose the Latent Inter-frame Pruning with Attention Recovery (LIPAR) framework, which detects and skips recomputing duplicated latent patches. Additionally, we introduce a novel Attention Recovery mechanism that approximates the attention values of pruned tokens, thereby removing visual artifacts arising from naively applying the pruning method. Empirically, our method increases video editing throughput by $1.45\times$, on average achieving 12.2 FPS on an NVIDIA A6000 compared to the baseline 8.4 FPS. The proposed method does not compromise generation quality and can be seamlessly integrated with the model without additional training. Our approach effectively bridges the gap between traditional compression algorithms and modern generative pipelines.
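Abstract (reference translation): Current video generation models suffer from high computational latency, making real-time applications prohibitively costly. This paper addresses the limitation by exploiting the temporal redundancy inherent in video latent patches. To this end, it proposes the Latent Inter-frame Pruning with Attention Recovery (LIPAR) framework, which detects duplicated latent patches and skips recomputing them. It also introduces a new Attention Recovery mechanism that approximates the attention values of pruned tokens, removing the visual artifacts that arise from applying pruning naively. The method increases video editing throughput by $1.45\times$, achieving 12.2 FPS on average on an NVIDIA A6000. It does not compromise generation quality and integrates seamlessly with the model. The approach effectively bridges the gap between traditional compression algorithms and modern generative pipelines.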
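Illustrative sketch (not from the paper): the abstract says LIPAR detects duplicated latent patches across frames and skips recomputing them, but it does not state the detection rule. The Python below is a minimal sketch of that general idea under loud assumptions: the names `duplicate_mask` and `process_frame`, the mean-absolute-difference test, the tolerance `tol`, and the flat-list patch layout are all hypothetical. The Attention Recovery mechanism is not modeled here, because the abstract gives no detail on how it approximates attention values.

```python
from typing import Callable, List, Sequence

Patch = Sequence[float]   # one flattened latent patch (assumed layout)
Frame = Sequence[Patch]   # one frame as an ordered list of patches


def duplicate_mask(prev: Frame, curr: Frame, tol: float = 1e-3) -> List[bool]:
    """Mark each patch of `curr` that is nearly unchanged from `prev`.

    The mean-absolute-difference test and `tol` are assumptions made for
    illustration; the paper's actual criterion may differ.
    """
    mask = []
    for p, c in zip(prev, curr):
        mad = sum(abs(a - b) for a, b in zip(p, c)) / len(c)
        mask.append(mad <= tol)
    return mask


def process_frame(
    prev_out: Sequence[List[float]],
    curr: Frame,
    mask: Sequence[bool],
    compute: Callable[[Patch], List[float]],
) -> List[List[float]]:
    """Reuse the cached output for duplicated patches, recompute the rest."""
    return [
        prev_out[i] if dup else compute(patch)
        for i, (patch, dup) in enumerate(zip(curr, mask))
    ]


# Toy usage: `expensive` stands in for a costly per-patch model call.
calls = []

def expensive(patch: Patch) -> List[float]:
    calls.append(1)
    return [2.0 * x for x in patch]

prev = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
curr = [[0.0, 0.0], [1.5, 1.0], [2.0, 2.0]]   # only the middle patch changed

prev_out = [expensive(p) for p in prev]       # first frame: compute everything
calls.clear()
mask = duplicate_mask(prev, curr)
curr_out = process_frame(prev_out, curr, mask, expensive)
```

In this toy case only one of the three patches is recomputed for the second frame. As a check on the abstract's own numbers, the two reported frame rates give 12.2 / 8.4 ≈ 1.45, which matches the stated $1.45\times$ throughput increase.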

Related papers