AnimeColor: Reference-based Animation Colorization with Diffusion Transformers
- URL: http://arxiv.org/abs/2507.20158v1
- Date: Sun, 27 Jul 2025 07:25:08 GMT
- Title: AnimeColor: Reference-based Animation Colorization with Diffusion Transformers
- Authors: Yuhong Zhang, Liyao Wang, Han Wang, Danni Wu, Zuzeng Lin, Feng Wang, Li Song
- Abstract summary: Animation colorization plays a vital role in animation production, yet existing methods struggle to achieve color accuracy and temporal consistency. We propose AnimeColor, a novel reference-based animation colorization framework leveraging Diffusion Transformers (DiT). Our approach integrates sketch sequences into a DiT-based video diffusion model, enabling sketch-controlled animation generation.
- Score: 9.64847784171945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Animation colorization plays a vital role in animation production, yet existing methods struggle to achieve color accuracy and temporal consistency. To address these challenges, we propose \textbf{AnimeColor}, a novel reference-based animation colorization framework leveraging Diffusion Transformers (DiT). Our approach integrates sketch sequences into a DiT-based video diffusion model, enabling sketch-controlled animation generation. We introduce two key components: a High-level Color Extractor (HCE) to capture semantic color information and a Low-level Color Guider (LCG) to extract fine-grained color details from reference images. These components work synergistically to guide the video diffusion process. Additionally, we employ a multi-stage training strategy to maximize the utilization of reference image color information. Extensive experiments demonstrate that AnimeColor outperforms existing methods in color accuracy, sketch alignment, temporal consistency, and visual quality. Our framework not only advances the state of the art in animation colorization but also provides a practical solution for industrial applications. The code will be made publicly available at \href{https://github.com/IamCreateAI/AnimeColor}{https://github.com/IamCreateAI/AnimeColor}.
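The abstract describes two reference branches, a High-level Color Extractor for semantic color and a Low-level Color Guider for fine-grained detail, that jointly steer the diffusion process. The toy sketch below illustrates that idea only in spirit: the function name, the linear fusion rule, and the scalar weights are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of two-branch reference conditioning: a high-level
# semantic color embedding (the HCE's role) and a low-level color-detail
# signal (the LCG's role) are blended into a denoising step as additive
# guidance residuals. All names and the fusion rule are assumptions.

def guided_denoise_step(latent, hce_embedding, lcg_detail, w_hi=0.5, w_lo=0.5):
    """Blend two guidance residuals into a latent vector (toy linear rule)."""
    assert len(latent) == len(hce_embedding) == len(lcg_detail)
    return [x + w_hi * h + w_lo * l
            for x, h, l in zip(latent, hce_embedding, lcg_detail)]
```

In the real model, such conditioning would be injected through attention layers inside the DiT rather than by element-wise addition; the point here is only that the two signals contribute complementary color information.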
Related papers
- Video Color Grading via Look-Up Table Generation [38.14578948732577]
In this paper, we present a reference-based video color grading framework. Our key idea is to explicitly generate a look-up table (LUT) for color attribute alignment between reference scenes and the input video. As a training objective, we enforce that high-level features of the reference scenes, such as look, mood, and emotion, should be similar to those of the input video.
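For readers unfamiliar with LUT-based grading, the sketch below shows the general mechanism: a per-channel table maps source intensities toward a reference. The paper generates its LUT with a network; this toy version instead builds a trivial linear table from channel means, and all function names are hypothetical.

```python
# Toy illustration of look-up-table (LUT) color grading. A 1D per-channel
# LUT remaps each source intensity; here the table is built by a naive
# linear rule matching a reference channel mean (not the paper's method).

def build_linear_lut(ref_mean, src_mean, size=256):
    """Map source intensities toward the reference channel mean (toy rule)."""
    gain = (ref_mean + 1) / (src_mean + 1)
    return [min(255, round(v * gain)) for v in range(size)]

def apply_lut(pixels, luts):
    """pixels: list of (r, g, b) tuples; luts: one 256-entry table per channel."""
    return [tuple(luts[c][p[c]] for c in range(3)) for p in pixels]
```

Once generated, a LUT applies in constant time per pixel, which is why LUTs are the standard delivery format for color grades in production pipelines.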
arXiv Detail & Related papers (2025-08-01T11:43:30Z)
- SketchColour: Channel Concat Guided DiT-based Sketch-to-Colour Pipeline for 2D Animation [7.2542954248246305]
We present SketchColour, the first sketch-to-colour pipeline for 2D animation built on a diffusion transformer (DiT) backbone. We replace the conventional U-Net denoiser with a DiT-style architecture and inject sketch information via lightweight channel-concatenation adapters. Our approach produces temporally coherent animations with minimal artifacts such as colour bleeding or object deformation.
arXiv Detail & Related papers (2025-07-02T10:57:16Z)
- Image Referenced Sketch Colorization Based on Animation Creation Workflow [28.281739343084993]
We propose a diffusion-based framework inspired by real-world animation production. Our approach leverages the sketch as spatial guidance and an RGB image as the color reference, and separately extracts the foreground and background from the reference image with masks. This design allows the diffusion model to integrate information from the foreground and background independently, preventing interference and eliminating spatial artifacts.
arXiv Detail & Related papers (2025-02-27T10:04:47Z)
- AniDoc: Animation Creation Made Easier [54.97341104616779]
Our research focuses on reducing the labor costs in the production of 2D animation by harnessing the potential of increasingly powerful AI. AniDoc emerges as a video line art colorization tool, which automatically converts sketch sequences into colored animations. Our model exploits correspondence matching as explicit guidance, yielding strong robustness to variations between the reference character and each line art frame.
arXiv Detail & Related papers (2024-12-18T18:59:59Z)
- Paint Bucket Colorization Using Anime Character Color Design Sheets [72.66788521378864]
We introduce inclusion matching, which allows the network to understand the relationships between segments.
Our network's training pipeline significantly improves performance in both colorization and consecutive frame colorization.
To support our network's training, we have developed a unique dataset named PaintBucket-Character.
arXiv Detail & Related papers (2024-10-25T09:33:27Z)
- Learning Inclusion Matching for Animation Paint Bucket Colorization [76.4507878427755]
We introduce a new learning-based inclusion matching pipeline, which directs the network to comprehend the inclusion relationships between segments.
Our method features a two-stage pipeline that integrates a coarse color warping module with an inclusion matching module.
To facilitate the training of our network, we also developed a unique dataset, referred to as PaintBucket-Character.
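Segment-matching colorization, the family this paper belongs to, assigns each line-enclosed segment in a target frame a color from a matched reference segment. The toy below uses nearest-neighbour feature matching as a stand-in; the actual inclusion-matching pipeline is learned and considers containment relationships between segments, which this sketch does not.

```python
# Toy stand-in (not the paper's learned method) for reference-based segment
# colorization: each target segment takes the color of the reference segment
# whose feature vector is closest in squared Euclidean distance.

def match_segment_colors(target_feats, ref_feats, ref_colors):
    """target_feats/ref_feats: lists of feature tuples; ref_colors: parallel list."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    colors = []
    for tf in target_feats:
        best = min(range(len(ref_feats)), key=lambda i: sq_dist(tf, ref_feats[i]))
        colors.append(ref_colors[best])
    return colors
```

Plain nearest-neighbour matching of this kind is exactly what breaks under occlusion and deformation; modeling which segments *contain* which (inclusion) is the paper's remedy for those failure cases.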
arXiv Detail & Related papers (2024-03-27T08:32:48Z)
- Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
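A common way to feed user strokes to a conditional model is as a sparse color-hint map plus a binary validity mask. The sketch below shows only that generic representation; it is an assumption for illustration, not Ctrl Color's actual stroke encoder.

```python
# Toy illustration (not Ctrl Color's encoding) of rasterizing user strokes
# into a sparse per-pixel color-hint map and a binary mask that a
# conditional colorization model could consume. A stroke point here is
# simply an (x, y, rgb) triple.

def encode_strokes(stroke_points, h, w):
    """Return (hint, mask): hint holds RGB at stroked pixels, mask flags them."""
    hint = [[(0, 0, 0)] * w for _ in range(h)]
    mask = [[0] * w for _ in range(h)]
    for x, y, rgb in stroke_points:
        if 0 <= y < h and 0 <= x < w:
            hint[y][x] = rgb
            mask[y][x] = 1
    return hint, mask
```

Keeping the mask separate lets the model distinguish "user chose black" from "no hint here", which a hint map alone cannot express.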
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
- BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature Fusion for Deep Exemplar-based Video Colorization [70.14893481468525]
We present an effective BiSTNet to explore colors of reference exemplars and utilize them to help video colorization.
We first establish the semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from reference exemplars.
We develop a mixed expert block to extract semantic information for modeling the object boundaries of frames so that the semantic image prior can better guide the colorization process.
arXiv Detail & Related papers (2022-12-05T13:47:15Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.