AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data
- URL: http://arxiv.org/abs/2402.00769v3
- Date: Wed, 16 Oct 2024 04:53:55 GMT
- Title: AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data
- Authors: Fu-Yun Wang, Zhaoyang Huang, Weikang Bian, Xiaoyu Shi, Keqiang Sun, Guanglu Song, Yu Liu, Hongsheng Li
- Abstract summary: The method reduces the required generation time of similarly sized video diffusion models from 25 seconds to around 1 second.
Its effectiveness lies in a dual-level decoupling learning approach.
- Score: 45.20627288830823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces an effective method for computation-efficient personalized style video generation without requiring access to any personalized video data. It reduces the necessary generation time of similarly sized video diffusion models from 25 seconds to around 1 second while maintaining the same level of performance. The method's effectiveness lies in its dual-level decoupling learning approach: 1) separating the learning of video style from video generation acceleration, which allows for personalized style video generation without any personalized style video data, and 2) separating the acceleration of image generation from the acceleration of video motion generation, enhancing training efficiency and mitigating the negative effects of low-quality video data.
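The first decoupling level has a practical consequence: because the acceleration (consistency) weights and the motion module are trained independently of any particular style, they can be attached at inference time to an off-the-shelf personalized image backbone, so no personalized video data is needed. The sketch below illustrates this plug-and-play combination using Hugging Face diffusers; it is a minimal example, not the authors' training code, and the repository and weight-file names ("wangfuyun/AnimateLCM", "AnimateLCM_sd15_t2v_lora.safetensors", the example backbone "emilianJR/epiCRealism") are assumptions based on the authors' public release as mirrored in the diffusers documentation, so substitute any personalized Stable Diffusion 1.5 checkpoint you prefer.

```python
# Hedged sketch: combine a separately trained acceleration LoRA and motion
# module with a personalized image backbone at inference time (diffusers).
# Repo/file names are assumptions based on the authors' public HF release.
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# Motion module distilled for few-step sampling (trained separately from any style).
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM", torch_dtype=torch.float16)

# Any personalized / stylized Stable Diffusion 1.5 backbone can be dropped in here;
# "emilianJR/epiCRealism" is only an example checkpoint.
pipe = AnimateDiffPipeline.from_pretrained(
    "emilianJR/epiCRealism", motion_adapter=adapter, torch_dtype=torch.float16
)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

# Acceleration (consistency) LoRA, also trained independently of the style.
pipe.load_lora_weights(
    "wangfuyun/AnimateLCM",
    weight_name="AnimateLCM_sd15_t2v_lora.safetensors",
    adapter_name="lcm-lora",
)
pipe.set_adapters(["lcm-lora"], [0.8])

pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

# Few-step sampling: 4-8 steps instead of the usual 25-50.
output = pipe(
    prompt="a corgi running on the beach, golden hour, film grain",
    negative_prompt="bad quality, worst quality, low resolution",
    num_frames=16,
    guidance_scale=2.0,
    num_inference_steps=6,
    generator=torch.Generator("cpu").manual_seed(0),
)
export_to_gif(output.frames[0], "animatelcm_sample.gif")
```

Because the style lives entirely in the swapped-in backbone (or an additional style LoRA) while the acceleration and motion weights are reused as-is, this mirrors the first decoupling the abstract describes: personalized style video generation without personalized video data.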
Related papers
- Subject-driven Video Generation via Disentangled Identity and Motion [52.54835936914813]
We propose to train a subject-driven, customized video generation model by decoupling subject-specific learning from temporal dynamics, enabling zero-shot generation without additional tuning.
Our method achieves strong subject consistency and scalability, outperforming existing video customization models in zero-shot settings.
arXiv Detail & Related papers (2025-04-23T06:48:31Z)
- Whats in a Video: Factorized Autoregressive Decoding for Online Dense Video Captioning [71.94122309290537]
We propose an efficient, online approach to generate dense captions for videos.
Our model uses a novel autoregressive factorized decoding architecture.
Our approach shows excellent performance compared to both offline and online methods, and uses 20% less compute.
arXiv Detail & Related papers (2024-11-22T02:46:44Z)
- Movie Gen: A Cast of Media Foundation Models [133.41504332082667]
We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio.
We show additional capabilities such as precise instruction-based video editing and generation of personalized videos based on a user's image.
arXiv Detail & Related papers (2024-10-17T16:22:46Z)
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion [52.7394517692186]
We present DreamVideo, a novel approach to generating personalized videos from a few static images of the desired subject.
DreamVideo decouples this task into two stages, subject learning and motion learning, by leveraging a pre-trained video diffusion model.
In motion learning, we architect a motion adapter and fine-tune it on the given videos to effectively model the target motion pattern.
arXiv Detail & Related papers (2023-12-07T16:57:26Z)
- InstructVid2Vid: Controllable Video Editing with Natural Language Instructions [97.17047888215284]
InstructVid2Vid is an end-to-end diffusion-based methodology for video editing guided by human language instructions.
Our approach empowers video manipulation guided by natural language directives, eliminating the need for per-example fine-tuning or inversion.
arXiv Detail & Related papers (2023-05-21T03:28:13Z)
- Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer [13.098901971644656]
This paper proposes a zero-shot video stylization method named Style-A-Video.
It uses a generative pre-trained transformer together with an image latent diffusion model to achieve concise, text-controlled video stylization.
Tests show that we attain superior content preservation and stylistic performance while incurring lower computational cost than previous solutions.
arXiv Detail & Related papers (2023-05-09T14:03:27Z)
- Video Generation Beyond a Single Clip [76.5306434379088]
Existing video generation models can only produce clips that are short relative to the length of real videos.
To generate long videos covering diverse content and multiple events, we propose to use additional guidance to control the video generation process.
The proposed approach is complementary to existing efforts on video generation, which focus on generating realistic video within a fixed time window.
arXiv Detail & Related papers (2023-04-15T06:17:30Z)
- Diverse Generation from a Single Video Made Possible [24.39972895902724]
We present a fast and practical method for video generation and manipulation from a single natural video.
Our method generates more realistic and higher quality results than single-video GANs.
arXiv Detail & Related papers (2021-09-17T15:12:17Z)