Fugu-MT Paper Translation (Abstract): Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation

Paper overview: Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation

arxiv url: http://arxiv.org/abs/2509.16630v1
Date: Sat, 20 Sep 2025 11:09:01 GMT
Status: Translation complete
Last updated in system: 2025-09-23 18:58:15.896473
Title: Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation
Title (reference translation): Follow-Your-Emoji-Faster: Toward Efficient, Finely Adjustable, and Expressive Freestyle Portrait Animation
Authors: Yue Ma, Zexuan Yan, Hongyu Liu, Hongfa Wang, Heng Pan, Yingqing He, Junkun Yuan, Ailing Zeng, Chengfei Cai, Heung-Yeung Shum, Zhifeng Li, Wei Liu, Linfeng Zhang, Qifeng Chen
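Abstract summary: Follow-Your-Emoji-Faster is an efficient diffusion-based framework for portrait animation driven by facial landmarks. The model supports controllable and expressive animation across diverse portrait types, including real faces, cartoons, sculptures, and animals. EmojiBench++ is a more comprehensive benchmark comprising diverse portraits, driving videos, and landmark sequences.
Reference score (attention score computed independently by this site): 72.20148916920944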
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We present Follow-Your-Emoji-Faster, an efficient diffusion-based framework for freestyle portrait animation driven by facial landmarks. The main challenges in this task are preserving the identity of the reference portrait, accurately transferring target expressions, and maintaining long-term temporal consistency while ensuring generation efficiency. To address identity preservation and accurate expression retargeting, we enhance Stable Diffusion with two key components: expression-aware landmarks as explicit motion signals, which improve motion alignment, support exaggerated expressions, and reduce identity leakage; and a fine-grained facial loss that leverages both expression and facial masks to better capture subtle expressions and faithfully preserve the reference appearance. With these components, our model supports controllable and expressive animation across diverse portrait types, including real faces, cartoons, sculptures, and animals. However, diffusion-based frameworks typically struggle to efficiently generate long-term stable animation results, which remains a core challenge in this task. To address this, we propose a progressive generation strategy for stable long-term animation, and introduce a Taylor-interpolated cache, achieving a 2.6X lossless acceleration. These two strategies ensure that our method produces high-quality results efficiently, making it user-friendly and accessible. Finally, we introduce EmojiBench++, a more comprehensive benchmark comprising diverse portraits, driving videos, and landmark sequences. Extensive evaluations on EmojiBench++ demonstrate that Follow-Your-Emoji-Faster achieves superior performance in both animation quality and controllability. The code, training dataset, and benchmark will be available at https://follow-your-emoji.github.io/.
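Abstract (reference translation): Follow-Your-Emoji-Faster is an efficient diffusion-based framework for freestyle portrait animation driven by facial landmarks. The main challenges in this task are preserving the identity of the reference portrait, accurately transferring target expressions, and maintaining long-term temporal consistency while ensuring generation efficiency. The method uses expression-aware landmarks as explicit motion signals, which improve motion alignment, support exaggerated expressions, and reduce identity leakage, together with a fine-grained facial loss that combines expression and facial masks to better capture subtle expressions and faithfully preserve the reference appearance. With these components, the model supports controllable and expressive animation across diverse portrait types, including real faces, cartoons, sculptures, and animals. However, diffusion-based frameworks typically struggle to generate stable long-term animation results. The authors therefore propose a progressive generation strategy for stable long-term animation and introduce a Taylor-interpolated cache, achieving a 2.6X lossless acceleration. These two strategies allow the method to produce high-quality results efficiently, making it user-friendly and accessible. Finally, the paper introduces EmojiBench++, a more comprehensive benchmark comprising diverse portraits, driving videos, and landmark sequences. Extensive evaluations on EmojiBench++ show that Follow-Your-Emoji-Faster achieves superior performance in both animation quality and controllability. The code, training dataset, and benchmark are available at https://follow-your-emoji.github.io/.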
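
Illustrative note: the abstract mentions a fine-grained facial loss that uses both expression and facial masks, but it does not give the formula. The following is only a minimal sketch of one common way such a mask-weighted reconstruction loss can be written, not the authors' implementation. The function name `masked_facial_loss`, the weights `face_weight` and `expr_weight`, and the flattened-pixel representation are all assumptions made for illustration.

```python
def masked_facial_loss(pred, target, face_mask, expr_mask,
                       face_weight=1.0, expr_weight=2.0):
    """Mask-weighted mean squared error over flattened pixel values.

    Hypothetical sketch: every pixel has a base weight of 1, and pixels
    inside the face mask or the expression mask receive extra weight.
    The weight values are illustrative, not taken from the paper.
    """
    n = len(pred)
    if n == 0 or not (n == len(target) == len(face_mask) == len(expr_mask)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t, f, e in zip(pred, target, face_mask, expr_mask):
        weight = 1.0 + face_weight * f + expr_weight * e
        total += weight * (p - t) ** 2
    return total / n


# Toy example: four pixels, three inside the face, two in an expression region.
pred = [0.5, 0.5, 0.5, 0.5]
target = [0.0, 0.0, 0.0, 0.0]
face_mask = [0, 1, 1, 1]
expr_mask = [0, 0, 1, 1]
loss = masked_facial_loss(pred, target, face_mask, expr_mask)
print(loss)
```

In this sketch, errors in expression regions (for example around the eyes and mouth) count more than errors elsewhere, which is one plausible way to emphasize subtle expressions while still penalizing deviations across the whole face.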

Related papers