VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning
- URL: http://arxiv.org/abs/2510.25772v1
- Date: Wed, 29 Oct 2025 17:59:53 GMT
- Title: VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning
- Authors: Baolu Li, Yiming Zhang, Qinghe Wang, Liqian Ma, Xiaoyu Shi, Xintao Wang, Pengfei Wan, Zhenfei Yin, Yunzhi Zhuge, Huchuan Lu, Xu Jia
- Abstract summary: We introduce VFXMaster, the first unified, reference-based framework for VFX video generation. It recasts effect generation as an in-context learning task, enabling it to reproduce diverse dynamic effects from a reference video onto target content. In addition, we propose an efficient one-shot effect adaptation mechanism that rapidly boosts generalization to difficult unseen effects from a single user-provided video.
- Score: 67.44716618860544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual effects (VFX) are crucial to the expressive power of digital media, yet their creation remains a major challenge for generative AI. Prevailing methods often rely on the one-LoRA-per-effect paradigm, which is resource-intensive and fundamentally incapable of generalizing to unseen effects, thus limiting scalability and creativity. To address this challenge, we introduce VFXMaster, the first unified, reference-based framework for VFX video generation. It recasts effect generation as an in-context learning task, enabling it to reproduce diverse dynamic effects from a reference video onto target content, and it demonstrates remarkable generalization to unseen effect categories. Specifically, we design an in-context conditioning strategy that prompts the model with a reference example. An in-context attention mask precisely decouples and injects the essential effect attributes, allowing a single unified model to master effect imitation without information leakage. In addition, we propose an efficient one-shot effect adaptation mechanism that rapidly boosts generalization to difficult unseen effects from a single user-provided video. Extensive experiments demonstrate that our method effectively imitates various categories of effect information and exhibits outstanding generalization to out-of-domain effects. To foster future research, we will release our code, models, and a comprehensive dataset to the community.
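To make the decoupling idea concrete, here is a minimal sketch of an in-context attention mask, not the authors' released implementation: it assumes reference-video tokens and target-content tokens are concatenated along one sequence, and lets target tokens attend to the reference while blocking the reverse direction so target content cannot leak back into the conditioning example. All names (`in_context_attention_mask`, `n_ref`, `n_tgt`) are illustrative.

```python
import torch
import torch.nn.functional as F

def in_context_attention_mask(n_ref: int, n_tgt: int) -> torch.Tensor:
    """Boolean mask (True = may attend) for a sequence laid out as
    [reference tokens | target tokens].

    Target tokens read from the reference (to imitate its effect), but
    reference tokens never read from the target, so target content
    cannot leak into the conditioning example.
    """
    n = n_ref + n_tgt
    mask = torch.zeros(n, n, dtype=torch.bool)
    mask[:n_ref, :n_ref] = True   # reference attends only to itself
    mask[n_ref:, :] = True        # target attends to reference + itself
    return mask

def masked_attention(q, k, v, mask):
    # q, k, v: (batch, heads, seq, dim); the 2-D mask broadcasts over batch/heads
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

# toy usage
B, H, D = 1, 4, 64
n_ref, n_tgt = 8, 16
q = torch.randn(B, H, n_ref + n_tgt, D)
k, v = torch.randn_like(q), torch.randn_like(q)
out = masked_attention(q, k, v, in_context_attention_mask(n_ref, n_tgt))
print(out.shape)  # torch.Size([1, 4, 24, 64])
```

Blocking the reference-to-target direction is one simple way to realize the "no information leakage" property the abstract describes; the actual mask in VFXMaster may partition effect attributes at a finer granularity.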
Related papers
- Motion Attribution for Video Generation [97.2515042185441]
We present Motive, a motion-centric, gradient-based data attribution framework. We use it to study which fine-tuning clips improve or degrade temporal dynamics. To our knowledge, this is the first framework to attribute motion rather than visual appearance in video generative models.
arXiv Detail & Related papers (2026-01-13T18:59:09Z) - Tuning-free Visual Effect Transfer across Videos [91.93897438317397]
RefVFX is a framework that transfers complex temporal effects from a reference video onto a target video or image in a feed-forward manner. We introduce a large-scale dataset of triplets, where each triplet consists of a reference effect video, an input image or video, and a corresponding output video. We show that RefVFX produces visually consistent and temporally coherent edits, generalizes across unseen effect categories, and outperforms prompt-only baselines in both quantitative metrics and human preference.
arXiv Detail & Related papers (2026-01-12T18:59:32Z) - IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning [13.89445714667069]
IC-Effect is an instruction-guided computation framework for few-shot video VFX editing. It synthesizes complex effects while preserving spatial and temporal consistency. A two-stage training strategy, consisting of general editing adaptation followed by effect-specific learning, ensures strong instruction following and robust effect modeling.
arXiv Detail & Related papers (2025-12-17T17:47:18Z) - Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation [11.41864836442447]
We propose Omni-Effects, a framework capable of generating prompt-guided effects and spatially controllable composite effects. A LoRA-based Mixture of Experts (LoRA-MoE) employs a group of expert LoRAs, integrating diverse effects within a unified model (see the sketch after this list). A Spatial-Aware Prompt (SAP) incorporates spatial mask information into the text token, enabling precise spatial control.
arXiv Detail & Related papers (2025-08-11T13:41:24Z) - Pre-Trained Video Generative Models as World Simulators [59.546627730477454]
We propose Dynamic World Simulation (DWS) to transform pre-trained video generative models into controllable world simulators. To achieve precise alignment between conditioned actions and generated visual changes, we introduce a lightweight, universal action-conditioned module. Experiments demonstrate that DWS can be applied to both diffusion and autoregressive transformer models.
arXiv Detail & Related papers (2025-02-10T14:49:09Z) - VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer [56.81599836980222]
We propose a novel paradigm for animated VFX generation as image animation, where dynamic effects are generated from user-friendly textual descriptions and static reference images. Our work makes two primary contributions: (i) Open-VFX, the first high-quality VFX video dataset spanning 15 diverse effect categories, annotated with textual descriptions and start-end timestamps for temporal control, and (ii) VFX Creator, a controllable VFX generation framework based on a Video Diffusion Transformer.
arXiv Detail & Related papers (2025-02-09T18:12:25Z) - Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics [79.4785166021062]
We introduce Puppet-Master, an interactive video generator that captures the internal, part-level motion of objects. We demonstrate that Puppet-Master learns to generate part-level motions, unlike other motion-conditioned video generators. Puppet-Master generalizes well to out-of-domain real images, outperforming existing methods on real-world benchmarks.
arXiv Detail & Related papers (2024-08-08T17:59:38Z) - ToonCrafter: Generative Cartoon Interpolation [63.52353451649143]
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation.
ToonCrafter effectively addresses the challenges faced when applying live-action video motion priors to generative cartoon interpolation.
Experimental results demonstrate that our proposed method not only produces visually convincing and more natural dynamics, but also effectively handles dis-occlusion.
arXiv Detail & Related papers (2024-05-28T07:58:33Z)
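As referenced in the Omni-Effects entry above, a LoRA-based Mixture of Experts can be sketched as a frozen base projection plus several low-rank expert adapters blended by a learned router. The sketch below is a generic, hypothetical reconstruction under those assumptions, not the Omni-Effects code; the class and parameter names (`LoRAExpert`, `LoRAMoE`, `n_experts`, `rank`) are invented for illustration.

```python
import torch
import torch.nn as nn

class LoRAExpert(nn.Module):
    """One low-rank adapter: x -> up(down(x)) * (alpha / rank)."""
    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.down = nn.Linear(d_in, rank, bias=False)
        self.up = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.up.weight)  # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x)) * self.scale

class LoRAMoE(nn.Module):
    """Frozen base linear layer plus a router-weighted mixture of expert LoRAs."""
    def __init__(self, base: nn.Linear, n_experts: int = 4, rank: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)  # keep pretrained weights frozen
        self.experts = nn.ModuleList(
            LoRAExpert(base.in_features, base.out_features, rank)
            for _ in range(n_experts)
        )
        self.router = nn.Linear(base.in_features, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = torch.softmax(self.router(x), dim=-1)            # (..., n_experts)
        deltas = torch.stack([e(x) for e in self.experts], -1)   # (..., d_out, n_experts)
        return self.base(x) + (deltas * gates.unsqueeze(-2)).sum(-1)

# toy usage: wrap one projection of a pretrained transformer block
layer = LoRAMoE(nn.Linear(64, 64), n_experts=4, rank=8)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Because each expert's up-projection is zero-initialized, the wrapped layer initially behaves exactly like the pretrained base, and the router learns to mix effect-specific adapters during fine-tuning.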