ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
- URL: http://arxiv.org/abs/2508.10881v1
- Date: Thu, 14 Aug 2025 17:50:11 GMT
- Title: ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
- Authors: Lingen Li, Guangzhi Wang, Zhaoyang Zhang, Yaowei Li, Xiaoyu Li, Qi Dou, Jinwei Gu, Tianfan Xue, Ying Shan
- Abstract summary: ToonComposer is a generative model that unifies inbetweening and colorization into a single post-keyframing stage. Requiring as few as a single sketch and a colored reference frame, ToonComposer excels with sparse inputs. Our evaluation demonstrates that ToonComposer outperforms existing methods in visual quality, motion consistency, and production efficiency.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional cartoon and anime production involves keyframing, inbetweening, and colorization stages, which require intensive manual effort. Despite recent advances in AI, existing methods often handle these stages separately, leading to error accumulation and artifacts. For instance, inbetweening approaches struggle with large motions, while colorization methods require dense per-frame sketches. To address this, we introduce ToonComposer, a generative model that unifies inbetweening and colorization into a single post-keyframing stage. ToonComposer employs a sparse sketch injection mechanism to provide precise control using keyframe sketches. Additionally, it uses a cartoon adaptation method with the spatial low-rank adapter to tailor a modern video foundation model to the cartoon domain while keeping its temporal prior intact. Requiring as few as a single sketch and a colored reference frame, ToonComposer excels with sparse inputs, while also supporting multiple sketches at any temporal location for more precise motion control. This dual capability reduces manual workload and improves flexibility, empowering artists in real-world scenarios. To evaluate our model, we further created PKBench, a benchmark featuring human-drawn sketches that simulate real-world use cases. Our evaluation demonstrates that ToonComposer outperforms existing methods in visual quality, motion consistency, and production efficiency, offering a superior and more flexible solution for AI-assisted cartoon production.
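The abstract's "cartoon adaptation method with the spatial low-rank adapter" can be illustrated with a small, hypothetical PyTorch sketch: wrap only the *spatial* linear layers of a pretrained video model with LoRA-style low-rank residuals while leaving everything else frozen, so the model's temporal prior stays intact. All class names, the `"spatial"`-in-name heuristic, and hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a spatial low-rank (LoRA-style) adapter.
# Only layers whose attribute name contains "spatial" get a trainable
# low-rank residual; the pretrained weights themselves stay frozen.
import torch
import torch.nn as nn

class SpatialLoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank residual W + (alpha/r) * B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pretrained weights intact
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero-init: no-op at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

def adapt_spatial_layers(model: nn.Module, rank: int = 8):
    """Recursively wrap linear layers whose name marks them as spatial."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear) and "spatial" in name:
            setattr(model, name, SpatialLoRALinear(child, rank=rank))
        else:
            adapt_spatial_layers(child, rank=rank)

# Toy usage: only the layer named "spatial_proj" gets a LoRA wrapper,
# and because B starts at zero the adapted model initially matches the original.
class ToyBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.spatial_proj = nn.Linear(4, 4)
        self.temporal_proj = nn.Linear(4, 4)

    def forward(self, x):
        return self.temporal_proj(self.spatial_proj(x))

block = ToyBlock()
x = torch.randn(2, 4)
y_before = block(x)
adapt_spatial_layers(block)
y_after = block(x)   # identical at init because B is zero-initialized
```

Zero-initializing `B` is the standard LoRA trick that makes adaptation start from the pretrained behavior; training then moves only the low-rank factors on the spatial pathway.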
Related papers
- VideoSketcher: Video Models Prior Enable Versatile Sequential Sketch Generation [73.23035143627598]
Most generative models treat sketches as static images, overlooking the temporal structure that underlies creative drawing. We present a data-efficient approach for sequential sketch generation that adapts pretrained text-to-video diffusion models. Our method generates high-quality sketches that closely follow text-specified orderings while exhibiting rich visual detail.
arXiv Detail & Related papers (2026-02-17T18:55:03Z)
- See-through: Single-image Layer Decomposition for Anime Characters [11.629918493740263]
We introduce a framework that automates the transformation of static anime illustrations into manipulatable 2.5D models. Our approach decomposes a single image into fully inpainted, semantically distinct layers with inferred drawing orders. We demonstrate that our approach yields high-fidelity, manipulatable models suitable for professional, real-time animation applications.
arXiv Detail & Related papers (2026-02-03T17:12:36Z)
- SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation [57.47730473674261]
We introduce SwiftSketch, a model for image-conditioned vector sketch generation that can produce high-quality sketches in less than a second. SwiftSketch operates by progressively denoising stroke control points sampled from a Gaussian distribution. ControlSketch is a method that enhances SDS-based techniques by incorporating precise spatial control through a depth-aware ControlNet.
arXiv Detail & Related papers (2025-02-12T18:57:12Z)
- AniDoc: Animation Creation Made Easier [54.97341104616779]
Our research focuses on reducing the labor costs in the production of 2D animation by harnessing the potential of increasingly powerful AI. AniDoc emerges as a video line art colorization tool, which automatically converts sketch sequences into colored animations. Our model exploits correspondence matching as an explicit guidance, yielding strong robustness to the variations between the reference character and each line art frame.
arXiv Detail & Related papers (2024-12-18T18:59:59Z)
- Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints [1.1510009152620668]
We propose an approach for animating a given input sketch based on a descriptive text prompt. We leverage a pre-trained text-to-video diffusion model with SDS loss to guide the motion of the sketch's strokes. Our method surpasses state-of-the-art performance in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2024-11-28T21:15:38Z)
- ToonCrafter: Generative Cartoon Interpolation [63.52353451649143]
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation.
ToonCrafter effectively addresses the challenges faced when applying live-action video motion priors to generative cartoon interpolation.
Experimental results demonstrate that our proposed method not only produces visually convincing and more natural dynamics, but also effectively handles dis-occlusion.
arXiv Detail & Related papers (2024-05-28T07:58:33Z)
- Bridging the Gap: Sketch-Aware Interpolation Network for High-Quality Animation Sketch Inbetweening [58.09847349781176]
We propose a novel deep learning method, the Sketch-Aware Interpolation Network (SAIN).
This approach incorporates multi-level guidance that formulates region-level correspondence, stroke-level correspondence and pixel-level dynamics.
A multi-stream U-Transformer is then devised to characterize sketch inbetweening patterns using these multi-level guides through the integration of self / cross-attention mechanisms.
arXiv Detail & Related papers (2023-08-25T09:51:03Z)
- Improving the Perceptual Quality of 2D Animation Interpolation [37.04208600867858]
Traditional 2D animation is labor-intensive, often requiring animators to draw twelve illustrations per second of movement.
Lower framerates result in larger displacements and occlusions, and discrete perceptual elements (e.g., lines and solid-color regions) pose difficulties for texture-oriented convolutional networks.
Previous work tried addressing these issues, but used unscalable methods and focused on pixel-perfect performance.
We build a scalable system more appropriately centered on perceptual quality for this artistic domain.
arXiv Detail & Related papers (2021-11-24T20:51:29Z)
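The "sparse sketch injection" described in the ToonComposer abstract above can be pictured as conditioning a video model on sketches placed at an arbitrary subset of temporal positions, together with a binary mask marking which frames are actually constrained. The following minimal sketch is a hypothetical illustration of that idea; the function name, tensor layout, and mask convention are assumptions, not the paper's API.

```python
# Hypothetical illustration of sparse sketch conditioning: build a
# per-frame condition tensor from sketches at arbitrary frame indices.
import torch

def build_sparse_condition(num_frames, height, width, sketches):
    """sketches: dict mapping frame index -> (H, W) sketch tensor.
    Returns a (num_frames, 2, H, W) tensor: channel 0 holds sketch
    content (zeros where no sketch was given), channel 1 is a binary
    mask telling the model which frames carry a real constraint."""
    cond = torch.zeros(num_frames, 2, height, width)
    for t, sketch in sketches.items():
        if not (0 <= t < num_frames):
            raise IndexError(f"sketch index {t} outside [0, {num_frames})")
        cond[t, 0] = sketch   # sketch content channel
        cond[t, 1] = 1.0      # availability mask channel
    return cond

# A single keyframe sketch at t=0 already forms a valid condition;
# additional sketches at any later frames simply fill in more mask rows.
cond = build_sparse_condition(16, 8, 8, {0: torch.ones(8, 8)})
```

Because unconstrained frames carry an all-zero mask row, the same interface covers both the sparse single-sketch case and the denser multi-sketch case the abstract describes, without changing the tensor shape the model sees.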