Fugu-MT Paper Translation (Abstract): Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Paper overview: Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

arxiv url: http://arxiv.org/abs/2508.07981v2
Date: Tue, 12 Aug 2025 03:46:18 GMT
Status: Translation complete
Last updated in system: 2025-08-13 12:16:51.427749
Title: Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
Title (reference translation): Omni-Effects: Unified and Spatially Controllable Visual Effects Generation
Authors: Fangyuan Mao, Aiming Hao, Jintao Chen, Dongxia Liu, Xiaokun Feng, Jiashu Zhu, Meiqi Wu, Chubin Chen, Jiahong Wu, Xiangxiang Chu
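Abstract summary: Omni-Effects is a framework capable of generating prompt-guided effects and spatially controllable composite effects. LoRA-based Mixture of Experts (LoRA-MoE) employs a group of expert LoRAs to integrate diverse effects within a unified model. Spatial-Aware Prompt (SAP) incorporates spatial mask information into the text token, enabling precise spatial control.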
Reference score (independently computed attention measure): 11.41864836442447
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visual effects (VFX) are essential visual enhancements fundamental to modern cinematic production. Although video generation models offer cost-efficient solutions for VFX production, current methods are constrained by per-effect LoRA training, which limits generation to single effects. This fundamental limitation impedes applications that require spatially controllable composite effects, i.e., the concurrent generation of multiple effects at designated locations. However, integrating diverse effects into a unified framework faces major challenges: interference from effect variations and spatial uncontrollability during multi-VFX joint training. To tackle these challenges, we propose Omni-Effects, a first unified framework capable of generating prompt-guided effects and spatially controllable composite effects. The core of our framework comprises two key innovations: (1) LoRA-based Mixture of Experts (LoRA-MoE), which employs a group of expert LoRAs, integrating diverse effects within a unified model while effectively mitigating cross-task interference. (2) Spatial-Aware Prompt (SAP) incorporates spatial mask information into the text token, enabling precise spatial control. Furthermore, we introduce an Independent-Information Flow (IIF) module integrated within the SAP, isolating the control signals corresponding to individual effects to prevent any unwanted blending. To facilitate this research, we construct a comprehensive VFX dataset Omni-VFX via a novel data collection pipeline combining image editing and First-Last Frame-to-Video (FLF2V) synthesis, and introduce a dedicated VFX evaluation framework for validating model performance. Extensive experiments demonstrate that Omni-Effects achieves precise spatial control and diverse effect generation, enabling users to specify both the category and location of desired effects.
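Abstract (reference translation): Visual effects (VFX) are visual enhancements fundamental to modern film production. Video generation models offer a cost-efficient solution for VFX production, but current methods are constrained by per-effect LoRA training, which limits generation to a single effect. This limitation hinders applications that require spatially controllable composite effects, that is, the simultaneous generation of multiple effects at designated locations. Integrating diverse effects into a unified framework, however, faces major challenges: interference from effect variations and a lack of spatial controllability during multi-VFX joint training. To address these challenges, we propose Omni-Effects, the first unified framework capable of generating prompt-guided effects and spatially controllable composite effects. Its core comprises two key innovations: (1) LoRA-based Mixture of Experts (LoRA-MoE), which employs a group of expert LoRAs to integrate diverse effects within a unified model while effectively mitigating cross-task interference; and (2) Spatial-Aware Prompt (SAP), which incorporates spatial mask information into the text token to enable precise spatial control. In addition, we introduce an Independent-Information Flow (IIF) module integrated within SAP, which isolates the control signals corresponding to individual effects to prevent unwanted blending. To facilitate this research, we construct a comprehensive VFX dataset, Omni-VFX, using a novel data collection pipeline that combines image editing with First-Last Frame-to-Video (FLF2V) synthesis, and we introduce a dedicated VFX evaluation framework for validating model performance. Extensive experiments show that Omni-Effects achieves precise spatial control and diverse effect generation, allowing users to specify both the category and the location of the desired effects.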
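The abstract names LoRA-MoE only at a high level: a group of expert LoRAs combined inside one model. As a rough illustration of that general idea, the sketch below adds a softly gated sum of low-rank updates to a frozen linear layer. It is not the Omni-Effects implementation. The class name `LoRAMoELinear`, the dense softmax router, the zero initialization of the up-projections, and all dimensions are assumptions made for this example; the abstract does not specify the routing scheme.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Matrix of small Gaussian values, stored as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LoRAMoELinear:
    """Frozen linear layer plus a softly gated sum of low-rank (LoRA) experts.

    Hypothetical illustration only; the routing used in the paper may differ.
    """

    def __init__(self, d_in, d_out, n_experts, rank, seed=0):
        rng = random.Random(seed)
        self.base = rand_matrix(d_out, d_in, rng)      # frozen base weight W
        self.gate = rand_matrix(n_experts, d_in, rng)  # router weight (assumed)
        # Each expert is a low-rank pair: down (rank x d_in), up (d_out x rank).
        self.down = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
        # Up-projections start at zero, so the layer initially equals the base.
        self.up = [[[0.0] * rank for _ in range(d_out)] for _ in range(n_experts)]

    def gates(self, x):
        """Per-expert mixing weights for input x; they sum to 1."""
        return softmax(matvec(self.gate, x))

    def forward(self, x):
        """Compute W x + sum_i g_i(x) * up_i (down_i x)."""
        out = matvec(self.base, x)
        for g, a, b in zip(self.gates(x), self.down, self.up):
            delta = matvec(b, matvec(a, x))
            out = [o + g * d for o, d in zip(out, delta)]
        return out
```

In this sketch every expert contributes to every input, weighted by the router. A sparse top-k router would be an equally plausible reading of "Mixture of Experts"; the dense form is used here only because it is shorter.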


Related papers