Fugu-MT Paper Translation (Abstract): SketchKeyAnime: Reference-anchored Sparse Key-Sketch Animation Synthesis

Paper overview: SketchKeyAnime: Reference-anchored Sparse Key-Sketch Animation Synthesis

arxiv url: http://arxiv.org/abs/2606.19958v1
Date: Thu, 18 Jun 2026 08:56:24 GMT
Status: translation complete
Last updated in system: 2026-06-19 18:23:39.749308
Title: SketchKeyAnime: Reference-anchored Sparse Key-Sketch Animation Synthesis
Title (reference translation): SketchKeyAnime: Reference-anchored Sparse Key-Sketch Animation Synthesis
Authors: Meixi Li, Xianlin Zhang, Yue Zhang, Xueming Li
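Abstract summary: The authors propose SketchKeyAnime, a video diffusion framework for generating structurally controllable, appearance-consistent, and temporally coherent animations from sparse key-sketch inputs. Compared to the best-performing baseline, SketchKeyAnime reduces EDMD by 31.9% and FVD by 9.5%, improving sketch fidelity and temporal coherence. These results validate the proposed framework and highlight its potential for low-cost, highly controllable animation creation.
Reference score (attention score computed independently by this site): 9.52552432045786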
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Traditional animation production relies heavily on manual drawing and iterative refinement, particularly for key-pose design, in-betweening, and character coloring. While existing animation and video generation methods have made notable progress, they typically depend on RGB boundary frames, dense frame-wise conditions, or complete sketch sequences, limiting their applicability under low-cost input conditions. We present SketchKeyAnime, a video diffusion framework for generating structurally controllable, appearance-consistent, and temporally coherent animations from sparse key-sketch inputs. Given a single reference RGB image and a few temporally indexed key sketches, SketchKeyAnime introduces a dual-branch conditioning mechanism to encode local geometric constraints alongside semantic-temporal context. It leverages Sketch Cross Attention to fuse reference image and sketch conditions with learnable gating, and incorporates an Adaptive Weighted Loss to strengthen supervision on key-sketch frames and line-art regions. Experimental results on the Aesthetic subset of Sakuga-42M show that our approach consistently outperforms representative animation interpolation and sketch-guided generation baselines. Compared to the best-performing baseline, SketchKeyAnime reduces EDMD by 31.9% and FVD by 9.5%, demonstrating superior sketch fidelity and temporal coherence, while achieving the best overall performance across most quantitative metrics. These results validate the proposed framework and highlight its potential for low-cost, highly controllable animation creation.
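The abstract does not give the form of the Adaptive Weighted Loss, so the following is only a minimal sketch of the general idea it describes: a reconstruction loss that up-weights key-sketch frames and line-art pixels. The function name, the `key_weight` and `line_weight` parameters, the multiplicative combination of the two weights, and the weight normalisation are all assumptions made for illustration, not the paper's formulation.

```python
import numpy as np


def adaptive_weighted_mse(pred, target, key_frame_idx, line_mask,
                          key_weight=2.0, line_weight=3.0):
    """Weighted MSE over a video of shape (T, H, W, C).

    Hypothetical illustration only: names, default weights, and the way
    the two weights are combined are assumptions, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    num_frames = pred.shape[0]

    # Per-frame weight: 1 everywhere, key_weight on key-sketch frames.
    frame_w = np.ones(num_frames)
    frame_w[list(key_frame_idx)] = key_weight

    # Per-pixel weight: 1 everywhere, line_weight where line_mask == 1.
    # line_mask has shape (T, H, W) with values in {0, 1}.
    pixel_w = 1.0 + (line_weight - 1.0) * np.asarray(line_mask, dtype=np.float64)

    # Combine by broadcasting the frame weights over the spatial axes.
    weights = frame_w[:, None, None] * pixel_w

    # Squared error averaged over channels, shape (T, H, W).
    sq_err = ((pred - target) ** 2).mean(axis=-1)

    # Normalise by the total weight so the scale stays comparable to plain MSE.
    return float((weights * sq_err).sum() / weights.sum())
```

With both weights set to 1.0 this reduces to the ordinary mean squared error, which makes it easy to check how much extra emphasis a given weighting places on errors in key-sketch frames or along lines.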
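Abstract (reference translation): Traditional animation production relies heavily on manual drawing and iterative refinement, particularly for key-pose design, in-betweening, and character coloring. Existing animation and video generation methods have made notable progress, but they typically depend on RGB boundary frames, dense frame-wise conditions, or complete sketch sequences, which limits their applicability under low-cost input conditions. SketchKeyAnime is a video diffusion framework for generating structurally controllable, appearance-consistent, and temporally coherent animations from sparse key-sketch inputs. Given a single reference RGB image and a few temporally indexed key sketches, SketchKeyAnime introduces a dual-branch conditioning mechanism that encodes local geometric constraints alongside semantic-temporal context. It uses Sketch Cross Attention to fuse the reference image and sketch conditions with learnable gating, and incorporates an Adaptive Weighted Loss to strengthen supervision on key-sketch frames and line-art regions. Experimental results on the Aesthetic subset of Sakuga-42M show that the method consistently outperforms representative animation interpolation and sketch-guided generation baselines. Compared to the best-performing baseline, SketchKeyAnime reduces EDMD by 31.9% and FVD by 9.5%, improving sketch fidelity and temporal coherence, and achieves the best overall performance on most metrics. These results validate the proposed framework and highlight its potential for low-cost, highly controllable animation creation.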


Related papers