Sketch Me A Video
- URL: http://arxiv.org/abs/2110.04710v1
- Date: Sun, 10 Oct 2021 05:40:11 GMT
- Title: Sketch Me A Video
- Authors: Haichao Zhang, Gang Yu, Tao Chen, Guozhong Luo
- Abstract summary: We introduce a new video synthesis task that uses only two rough, badly-drawn sketches as input to create a realistic portrait video.
A two-stage Sketch-to-Video model is proposed, which consists of two key novelties.
- Score: 32.38205496481408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video creation has been an attractive yet challenging task for artists to
explore. With the advancement of deep learning, recent works try to utilize
deep convolutional neural networks to synthesize a video with the aid of a
guiding video, and have achieved promising results. However, acquiring guiding
videos, or other forms of guiding temporal information, is costly and difficult
in practice. Therefore, in this work we introduce a new video synthesis task
that takes only two rough, badly-drawn sketches as input to create a realistic
portrait video. A two-stage Sketch-to-Video model is
proposed, which consists of two key novelties: 1) a feature retrieve and
projection (FRP) module, which partitions the input sketch into different
parts and utilizes these parts for synthesizing a realistic start or end frame
and meanwhile generating rich semantic features, is designed to alleviate the
sketch out-of-domain problem due to arbitrarily drawn free-form sketch styles
by different users. 2) A motion projection followed by a feature blending
module, which projects a video (used only in the training phase) into a motion
space modeled by a normal distribution and blends the motion variables with the
semantic features extracted above, is proposed to alleviate the absence of
guiding temporal information in the test phase. Experiments conducted on a
combination of the CelebAMask-HQ and VoxCeleb2 datasets validate that our
method achieves good quantitative and qualitative results in synthesizing
high-quality videos from two rough, badly-drawn sketches.
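To make the two-stage design above concrete, here is a minimal, illustrative PyTorch sketch of the data flow: a toy FRP-style module that splits the sketch into parts and encodes per-part semantic features, and a toy motion module that samples motion variables from a normal distribution and blends them with those features. All module shapes, the part count, and layer choices are assumptions for illustration, not the paper's actual architecture.

```python
# Toy sketch of the two-stage pipeline; sizes and layers are assumptions.
import torch
import torch.nn as nn

class FRPModule(nn.Module):
    """Toy stand-in for feature retrieve and projection (FRP):
    split the sketch into horizontal parts and encode each part."""
    def __init__(self, parts: int = 4, feat_dim: int = 128):
        super().__init__()
        self.parts = parts
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )

    def forward(self, sketch: torch.Tensor) -> torch.Tensor:
        # sketch: (B, 1, H, W) -> per-part semantic features (B, parts, feat_dim)
        chunks = torch.chunk(sketch, self.parts, dim=2)
        return torch.stack([self.encoder(c) for c in chunks], dim=1)

class MotionBlend(nn.Module):
    """Toy motion module: sample motion variables z ~ N(0, I) and blend
    them with the semantic features to obtain per-frame features."""
    def __init__(self, feat_dim: int = 128, z_dim: int = 16, frames: int = 8):
        super().__init__()
        self.z_dim, self.frames = z_dim, frames
        self.blend = nn.Linear(feat_dim + z_dim, feat_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        B = feats.size(0)
        pooled = feats.mean(dim=1)                   # (B, feat_dim)
        z = torch.randn(B, self.frames, self.z_dim)  # motion variables
        pooled = pooled.unsqueeze(1).expand(-1, self.frames, -1)
        # Per-frame features; these would feed a frame decoder in a full model.
        return self.blend(torch.cat([pooled, z], dim=-1))

start_sketch = torch.rand(1, 1, 64, 64)
feats = FRPModule()(start_sketch)
frame_feats = MotionBlend()(feats)
print(frame_feats.shape)  # torch.Size([1, 8, 128])
```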
Related papers
- DreamVideo: Composing Your Dream Videos with Customized Subject and
Motion [52.7394517692186]
We present DreamVideo, a novel approach to generating personalized videos from a few static images of the desired subject.
DreamVideo decouples this task into two stages, subject learning and motion learning, by leveraging a pre-trained video diffusion model.
In motion learning, we architect a motion adapter and fine-tune it on the given videos to effectively model the target motion pattern.
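As a rough illustration of the adapter idea used for motion learning, the snippet below shows a generic residual bottleneck adapter of the kind commonly inserted into a frozen backbone so that only the adapter weights are fine-tuned; the dimensions and zero-initialization are assumptions, not DreamVideo's actual design.

```python
# Generic residual adapter sketch; not DreamVideo's exact architecture.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim: int = 512, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        # Zero-init the up-projection so the adapter starts as an identity map.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))  # residual update

x = torch.randn(2, 16, 512)  # toy features from a frozen backbone
print(Adapter()(x).shape)    # torch.Size([2, 16, 512])
```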
arXiv Detail & Related papers (2023-12-07T16:57:26Z) - Sketch Video Synthesis [52.134906766625164]
We propose a novel framework for sketching videos represented by frame-wise Bézier curves.
Our method unlocks applications in sketch-based video editing and video doodling, enabled through video composition.
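For context, a frame-wise Bézier stroke representation can be illustrated with the standard cubic Bernstein form; the control points and per-frame drift below are made-up toy values, not the paper's learned parameters.

```python
# Evaluate a cubic Bezier stroke per frame; control points are toy values.
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n: int = 50) -> np.ndarray:
    """B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# One stroke, drifting slightly from frame to frame (toy doodle motion).
p = np.array([[0.0, 0.0], [0.3, 1.0], [0.7, 1.0], [1.0, 0.0]])
video_strokes = [cubic_bezier(*(p + 0.02 * f)) for f in range(8)]
print(video_strokes[0].shape)  # (50, 2) sampled points per frame
```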
arXiv Detail & Related papers (2023-11-26T14:14:04Z) - Breathing Life Into Sketches Using Text-to-Video Priors [101.8236605955899]
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually.
In this work, we present a method that automatically adds motion to a single-subject sketch.
The output is a short animation provided in vector representation, which can be easily edited.
arXiv Detail & Related papers (2023-11-21T18:09:30Z) - Render In-between: Motion Guided Video Synthesis for Action
Interpolation [53.43607872972194]
We propose a motion-guided frame-upsampling framework that is capable of producing realistic human motion and appearance.
A novel motion model is trained to infer the non-linear skeletal motion between frames by leveraging a large-scale motion-capture dataset.
Our pipeline only requires low-frame-rate videos and unpaired human motion data but does not require high-frame-rate videos for training.
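As a toy illustration of a learned in-betweening motion model (not the paper's architecture), the sketch below uses a small MLP that maps two keyframe poses and an interpolation time to an intermediate pose; the joint count and layer sizes are assumptions.

```python
# Toy learned pose in-betweening; joint count and sizes are assumptions.
import torch
import torch.nn as nn

JOINTS = 17  # e.g. a COCO-style 2D skeleton; assumed for illustration

class MotionInbetweener(nn.Module):
    def __init__(self, joints: int = JOINTS, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * joints * 2 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, joints * 2),
        )

    def forward(self, pose_a, pose_b, t):
        # pose_a, pose_b: (B, joints*2) 2D keypoints; t: (B, 1) in [0, 1]
        return self.net(torch.cat([pose_a, pose_b, t], dim=-1))

model = MotionInbetweener()
a, b = torch.randn(4, JOINTS * 2), torch.randn(4, JOINTS * 2)
mid = model(a, b, torch.full((4, 1), 0.5))  # non-linear in-between pose
print(mid.shape)  # torch.Size([4, 34])
```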
arXiv Detail & Related papers (2021-11-01T15:32:51Z) - Compositional Video Synthesis with Action Graphs [112.94651460161992]
Videos of actions are complex signals containing rich compositional structure in space and time.
We propose to represent the actions in a graph structure called an Action Graph and present the new "Action Graph To Video" synthesis task.
Our generative model for this task (AG2Vid) disentangles motion and appearance features, and by incorporating a scheduling mechanism for actions facilitates a timely and coordinated video generation.
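A minimal data-structure sketch in the spirit of that task: nodes are objects, edges are timed actions, and a scheduler reports which actions are active at a given frame. All field names are hypothetical, not AG2Vid's actual interface.

```python
# Toy action graph with per-frame scheduling; field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Action:
    subject: str
    verb: str
    target: str
    start_frame: int
    end_frame: int

class ActionGraph:
    def __init__(self, objects, actions):
        self.objects = objects
        self.actions = actions

    def active_at(self, frame: int):
        """Scheduling: actions whose interval covers this frame."""
        return [a for a in self.actions
                if a.start_frame <= frame <= a.end_frame]

g = ActionGraph(
    objects=["hand", "cup"],
    actions=[Action("hand", "reach", "cup", 0, 10),
             Action("hand", "lift", "cup", 11, 25)],
)
print([a.verb for a in g.active_at(5)])  # ['reach']
```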
arXiv Detail & Related papers (2020-06-27T09:39:04Z) - Fine-Grained Instance-Level Sketch-Based Video Retrieval [159.12935292432743]
We propose a novel cross-modal retrieval problem of fine-grained instance-level sketch-based video retrieval (FG-SBVR).
Compared with sketch-based still image retrieval and coarse-grained category-level video retrieval, this is more challenging, as both visual appearance and motion need to be matched simultaneously at a fine-grained level.
We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.
arXiv Detail & Related papers (2020-02-21T18:28:35Z)