Deep Sketch-guided Cartoon Video Inbetweening
- URL: http://arxiv.org/abs/2008.04149v2
- Date: Mon, 18 Jan 2021 17:15:39 GMT
- Title: Deep Sketch-guided Cartoon Video Inbetweening
- Authors: Xiaoyu Li, Bo Zhang, Jing Liao and Pedro V. Sander
- Abstract summary: We propose a framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch.
By explicitly considering the correspondence between frames and the sketch, we can achieve higher-quality results than other image synthesis methods.
- Score: 24.00033622396297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate the dense cross-domain correspondence between the sketch and cartoon video frames, and to employ a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. After that, the input frames and the synthetic frame, equipped with established correspondence, are fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a module to preserve temporal consistency is employed. Compared to common frame interpolation methods, our approach can handle frames with relatively large motion and gives users the flexibility to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, we achieve higher-quality results than other image synthesis methods. Our experiments show that the system generalizes well to different movie frames and outperforms existing solutions.
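To make the described data flow concrete, below is a minimal sketch of the synthesis stage under stated assumptions: every module body is a hypothetical placeholder for a learned network (the correspondence estimator and the occlusion-aware blending module), and the subsequent arbitrary-time interpolation and temporal-consistency stages are omitted. Only the structure follows the abstract; this is not the authors' code.

```python
# Minimal sketch of the pipeline described in the abstract. All module
# implementations are hypothetical placeholders standing in for the learned
# networks; only the data flow follows the text.
import torch
import torch.nn.functional as F

def dense_correspondence(sketch, frame):
    """Placeholder: dense cross-domain flow from the sketch to a keyframe.
    A real model would regress this; here we return zero flow."""
    b, _, h, w = frame.shape
    return torch.zeros(b, 2, h, w)

def warp(frame, flow):
    """Backward-warp `frame` with a dense flow field via grid_sample."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow
    # Normalize pixel coordinates to [-1, 1] as grid_sample expects.
    grid[:, 0] = 2 * grid[:, 0] / (w - 1) - 1
    grid[:, 1] = 2 * grid[:, 1] / (h - 1) - 1
    return F.grid_sample(frame, grid.permute(0, 2, 3, 1), align_corners=True)

def blend_with_occlusion(warped0, warped1):
    """Placeholder blending: a real module predicts per-pixel occlusion
    masks; here we use a uniform 0.5 mask."""
    mask = torch.full_like(warped0[:, :1], 0.5)
    return mask * warped0 + (1 - mask) * warped1

def synthesize_middle_frame(i0, i1, sketch):
    # 1) Sketch-to-keyframe correspondence, 2) color fetching by warping,
    # 3) occlusion-aware blending -- the three steps named in the abstract.
    w0 = warp(i0, dense_correspondence(sketch, i0))
    w1 = warp(i1, dense_correspondence(sketch, i1))
    return blend_with_occlusion(w0, w1)

i0, i1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
sketch = torch.rand(1, 1, 64, 64)
middle = synthesize_middle_frame(i0, i1, sketch)
print(middle.shape)  # torch.Size([1, 3, 64, 64])
```

In the full method, the synthesized middle frame and the two keyframes, linked by the established correspondence, would then seed the arbitrary-time interpolation and temporal-consistency stages.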
Related papers
- Framer: Interactive Frame Interpolation [73.06734414930227]
Framer produces smoothly transitioning frames between two images, following the user's creative intent.
Our approach supports customizing the transition process by tailoring the trajectories of selected keypoints.
Notably, our system also offers an "autopilot" mode, with a module that estimates the keypoints and their trajectories automatically.
arXiv Detail & Related papers (2024-10-24T17:59:51Z)
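As an illustration of the keypoint-trajectory control Framer describes, the sketch below resamples a few user-specified waypoints into per-frame keypoint positions that could condition an interpolation model. The linear resampling and the function name are assumptions for illustration, not Framer's implementation.

```python
# Hedged sketch of trajectory-based control: the user tailors a few keypoint
# waypoints, which are resampled into one position per output frame. Linear
# resampling is an illustrative assumption, not Framer's actual module.
import numpy as np

def resample_trajectory(waypoints, num_frames):
    """Linearly interpolate (x, y) waypoints into one position per frame."""
    waypoints = np.asarray(waypoints, dtype=float)   # (num_waypoints, 2)
    t_way = np.linspace(0.0, 1.0, len(waypoints))    # waypoint timestamps
    t_frames = np.linspace(0.0, 1.0, num_frames)     # one timestamp per frame
    x = np.interp(t_frames, t_way, waypoints[:, 0])
    y = np.interp(t_frames, t_way, waypoints[:, 1])
    return np.stack([x, y], axis=1)                  # (num_frames, 2)

# One keypoint dragged along a user-drawn arc over 16 output frames.
trajectory = resample_trajectory([(10, 50), (40, 20), (90, 55)], num_frames=16)
print(trajectory.shape)  # (16, 2)
```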
- Sketch Video Synthesis [52.134906766625164]
We propose a novel framework for sketching videos represented by frame-wise Bézier curves.
Our method unlocks applications in sketch-based video editing and video doodling, enabled through video composition.
arXiv Detail & Related papers (2023-11-26T14:14:04Z)
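To make the frame-wise Bézier representation concrete: each stroke in a frame is a parametric curve defined by a handful of control points. The standard cubic Bernstein-form evaluator below shows the parameterization; it is generic curve math, not the paper's implementation, and the control points are made up.

```python
# Each stroke of a frame is a cubic Bézier curve over four 2-D control
# points; sampling it yields the drawable polyline for that frame.
import numpy as np

def cubic_bezier(p0, p1, p2, p3, num_samples=64):
    """Sample B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3."""
    t = np.linspace(0.0, 1.0, num_samples)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# One stroke of one frame: four control points, sampled into a polyline.
stroke = cubic_bezier(np.array([0.0, 0.0]), np.array([0.2, 1.0]),
                      np.array([0.8, 1.0]), np.array([1.0, 0.0]))
print(stroke.shape)  # (64, 2)
```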
- Bridging the Gap: Sketch-Aware Interpolation Network for High-Quality Animation Sketch Inbetweening [58.09847349781176]
We propose a novel deep learning method, the Sketch-Aware Interpolation Network (SAIN).
This approach incorporates multi-level guidance that formulates region-level correspondence, stroke-level correspondence and pixel-level dynamics.
A multi-stream U-Transformer is then devised to characterize sketch inbetweening patterns from these multi-level guides through integrated self- and cross-attention mechanisms.
arXiv Detail & Related papers (2023-08-25T09:51:03Z)
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation [93.18163456287164]
This paper proposes a novel text-guided video-to-video translation framework that adapts image models to videos.
Our framework achieves global style and local texture temporal consistency at a low cost.
arXiv Detail & Related papers (2023-06-13T17:52:23Z)
- TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation [50.49396123016185]
Video frame interpolation (VFI) aims to synthesize an intermediate frame between two consecutive frames.
We propose a novel Trajectory-aware Transformer for Video Frame Interpolation (TTVFI).
Our method outperforms other state-of-the-art methods on four widely used VFI benchmarks.
arXiv Detail & Related papers (2022-07-19T03:37:49Z)
- ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring [92.40655035360729]
Video deblurring models exploit consecutive frames to remove blur caused by camera shake and object motion.
We propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space.
Our proposed method is evaluated on the widely adopted DVD dataset, along with a newly collected high-frame-rate (1000 fps) dataset for video deblurring.
arXiv Detail & Related papers (2021-03-07T04:33:13Z)
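ARVo's blurb centers on learning correspondence among frames in feature space. A common construction for that general idea is an all-pairs feature correlation volume, sketched below as a hedged illustration of the technique class, not ARVo's exact architecture; the function name and scaling are assumptions.

```python
# All-pairs correlation volume between two frames' feature maps: a common
# building block for learned correspondence in feature space.
import torch

def correlation_volume(feat_a, feat_b):
    """Dot-product similarity between two (B, C, H, W) feature maps.
    Returns a (B, H*W, H*W) volume: entry (i, j) scores position i in
    frame A against position j in frame B."""
    b, c, h, w = feat_a.shape
    fa = feat_a.flatten(2).transpose(1, 2)   # (B, H*W, C)
    fb = feat_b.flatten(2)                   # (B, C, H*W)
    return torch.bmm(fa, fb) / c ** 0.5      # scaled dot products

fa, fb = torch.rand(1, 16, 8, 8), torch.rand(1, 16, 8, 8)
vol = correlation_volume(fa, fb)
print(vol.shape)  # torch.Size([1, 64, 64])
```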
- ALANET: Adaptive Latent Attention Network for Joint Video Deblurring and Interpolation [38.52446103418748]
We introduce a novel architecture, Adaptive Latent Attention Network (ALANET), which synthesizes sharp high frame-rate videos.
We employ a combination of self-attention and cross-attention modules between consecutive frames in the latent space to generate an optimized representation for each frame.
Our method performs favorably against various state-of-the-art approaches, even though we tackle a much more difficult problem.
arXiv Detail & Related papers (2020-08-31T21:11:53Z)
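The self-/cross-attention pairing ALANET describes can be sketched as follows: each frame's latent tokens attend to themselves and to the adjacent frame. nn.MultiheadAttention is used here as a stand-in for the paper's modules, and the additive fusion is an assumption for illustration.

```python
# Hedged sketch: one frame's latent attends within itself (self-attention)
# and to its neighbor (cross-attention), then the outputs are fused.
import torch
import torch.nn as nn

class LatentFrameAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, z_cur, z_adj):
        # z_cur, z_adj: (B, N, dim) latent tokens of current / adjacent frame.
        s, _ = self.self_attn(z_cur, z_cur, z_cur)    # within-frame context
        c, _ = self.cross_attn(z_cur, z_adj, z_adj)   # borrow from neighbor
        return z_cur + s + c                          # fused representation

z0, z1 = torch.rand(1, 100, 64), torch.rand(1, 100, 64)
block = LatentFrameAttention()
print(block(z0, z1).shape)  # torch.Size([1, 100, 64])
```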
This list is automatically generated from the titles and abstracts of the papers on this site.