Related papers: Diffusion Model-Based Video Editing: A Survey

URL: http://arxiv.org/abs/2407.07111v1
Date: Wed, 26 Jun 2024 04:58:39 GMT
Title: Diffusion Model-Based Video Editing: A Survey
Authors: Wenhao Sun, Rong-Cheng Tu, Jingyi Liao, Dacheng Tao
Abstract summary: This paper reviews diffusion model-based video editing techniques, including theoretical foundations and practical applications. We categorize video editing approaches by the inherent connections of their core technologies, depicting their evolutionary trajectory. This paper also dives into novel applications, including point-based editing and pose-guided human video editing.
Score: 47.45047496559506
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid development of diffusion models (DMs) has significantly advanced image and video applications, making "what you want is what you see" a reality. Among these, video editing has gained substantial attention and seen a swift rise in research activity, necessitating a comprehensive and systematic review of the existing literature. This paper reviews diffusion model-based video editing techniques, including theoretical foundations and practical applications. We begin by overviewing the mathematical formulation and the key methods of the image domain. Subsequently, we categorize video editing approaches by the inherent connections of their core technologies, depicting their evolutionary trajectory. This paper also dives into novel applications, including point-based editing and pose-guided human video editing. Additionally, we present a comprehensive comparison using our newly introduced V2VBench. Building on the progress achieved to date, the paper concludes with ongoing challenges and potential directions for future research.

Related papers

Edit as You See: Image-guided Video Editing via Masked Motion Modeling [18.89936405508778]
We propose a novel Image-guided Video Editing Diffusion model, termed IVEDiff. IVEDiff is built on top of image editing models, and is equipped with learnable motion modules to maintain the temporal consistency of edited video. Our method is able to generate temporally smooth edited videos while robustly dealing with various editing objects with high quality.
arXiv Detail & Related papers (2025-01-08T07:52:12Z)
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing [11.09708780767668]
We present a shape-consistent video editing method, namely StableV2V, in this paper. Our method decomposes the entire editing pipeline into several sequential procedures, where it edits the first video frame, then establishes an alignment between the delivered motions and user prompts, and eventually propagates the edited contents to all other frames based on such alignment. Experimental results and analyses illustrate the superior performance, visual consistency, and inference efficiency of our method compared to existing state-of-the-art studies.
arXiv Detail & Related papers (2024-11-17T11:48:01Z)
Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era [50.19334853510935]
Recent strides in instruction-based editing have enabled intuitive interaction with visual content, using natural language as a bridge between user intent and complex editing operations. We aim to democratize powerful visual editing across various industries, from entertainment to education.
arXiv Detail & Related papers (2024-11-15T05:18:15Z)
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models [117.77807994397784]
Image editing aims to edit the given synthetic or real image to meet the specific requirements from users. Recent significant advancement in this field is based on the development of text-to-image (T2I) diffusion models. T2I-based image editing methods significantly enhance editing performance and offer a user-friendly interface for modifying content guided by multimodal inputs.
arXiv Detail & Related papers (2024-06-20T17:58:52Z)
Video Diffusion Models: A Survey [3.7985353171858045]
Diffusion generative models have recently become a powerful technique for creating and modifying high-quality, coherent video content. This survey provides an overview of the critical components of diffusion models for video generation, including their applications, architectural design, and temporal dynamics modeling.
arXiv Detail & Related papers (2024-05-06T04:01:42Z)
A Survey on Video Diffusion Models [103.03565844371711]
The recent wave of AI-generated content (AIGC) has witnessed substantial success in computer vision. Due to their impressive generative capabilities, diffusion models are gradually superseding methods based on GANs and auto-regressive Transformers. This paper presents a comprehensive review of video diffusion models in the AIGC era.
arXiv Detail & Related papers (2023-10-16T17:59:28Z)
State of the Art on Diffusion Models for Visual Computing [191.6168813012954]
This report introduces the basic mathematical concepts of diffusion models, implementation details and design choices of the popular Stable Diffusion model. We also give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing. We discuss available datasets, metrics, open challenges, and social implications.
arXiv Detail & Related papers (2023-10-11T05:32:29Z)
Dreamix: Video Diffusion Models are General Video Editors [22.127604561922897]
Text-driven image and video diffusion models have recently achieved unprecedented generation realism. We present the first diffusion-based method that is able to perform text-based motion and appearance editing of general videos.
arXiv Detail & Related papers (2023-02-02T18:58:58Z)
The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing [90.59584961661345]
This work introduces the Anatomy of Video Editing, a dataset and benchmark, to foster research in AI-assisted video editing. Our benchmark suite focuses on video editing tasks, beyond visual effects, such as automatic footage organization and assisted video assembling. To enable research on these fronts, we annotate more than 1.5M tags, with relevant concepts to cinematography, from 196,176 shots sampled from movie scenes.
arXiv Detail & Related papers (2022-07-20T10:53:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.