AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion Models
- URL: http://arxiv.org/abs/2303.11137v1
- Date: Mon, 20 Mar 2023 14:15:23 GMT
- Title: AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion Models
- Authors: Yu Cao, Xiangqiao Meng, P.Y. Mok, Xueting Liu, Tong-Yee Lee, Ping Li
- Abstract summary: We propose a novel method called AnimeDiffusion that uses diffusion models to perform anime face line drawing colorization automatically.
We construct an anime face line drawing colorization benchmark dataset, which contains 31,696 training images and 579 testing images.
We demonstrate that AnimeDiffusion outperforms state-of-the-art GAN-based models for anime face line drawing colorization.
- Score: 24.94532405404846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manually colorizing anime line drawing images is time-consuming and tedious work, yet it is an essential stage in the cartoon animation creation pipeline. Reference-based line drawing colorization is a challenging task that relies on precise cross-domain long-range dependency modelling between the line drawing and the reference image. Existing learning methods still rely on generative adversarial networks (GANs) as a key module of their model architecture. In this paper, we propose a novel method called AnimeDiffusion that uses diffusion models to perform anime face line drawing colorization automatically. To the best of our knowledge, this is the first diffusion model tailored for anime content creation. To address the huge training cost of diffusion models, we design a hybrid training strategy: we first pre-train a diffusion model with classifier-free guidance and then fine-tune it with image reconstruction guidance. We find that after only a few iterations of fine-tuning, the model shows excellent colorization performance, as illustrated in Fig. 1. For training AnimeDiffusion, we construct an anime face line drawing colorization benchmark dataset containing 31,696 training images and 579 testing images. We hope this dataset fills the gap left by the lack of a publicly available high-resolution anime face dataset for evaluating colorization methods. Through multiple quantitative metrics evaluated on our dataset and a user study, we demonstrate that AnimeDiffusion outperforms state-of-the-art GAN-based models for anime face line drawing colorization. We also collaborate with professional artists to test and apply AnimeDiffusion in their creation work. We release our code at https://github.com/xq-meng/AnimeDiffusion.
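As a rough illustration of the hybrid training strategy described in the abstract, the sketch below shows a pre-training step that randomly drops the reference-image condition (the standard classifier-free-guidance recipe) and a fine-tuning step that adds a pixel-space reconstruction loss against the ground-truth colorized image. This is a minimal sketch under stated assumptions, not the authors' released implementation: all names (the `model` call signature, `scheduler` with a diffusers-style `add_noise` and `alphas_cumprod`, `p_uncond`, `recon_weight`) are hypothetical placeholders.

```python
# Hypothetical sketch of a hybrid training strategy for reference-based
# line drawing colorization: CFG pre-training, then reconstruction-guided
# fine-tuning. Names and signatures are illustrative only.
import torch
import torch.nn.functional as F

def pretrain_step(model, scheduler, line, reference, target, p_uncond=0.1):
    """Stage 1: standard denoising loss, with the reference condition
    randomly dropped so the model also learns the unconditional score."""
    noise = torch.randn_like(target)
    t = torch.randint(0, scheduler.num_steps, (target.shape[0],), device=target.device)
    noisy = scheduler.add_noise(target, noise, t)
    # Drop the reference image with probability p_uncond (classifier-free guidance).
    if torch.rand(()) < p_uncond:
        reference = torch.zeros_like(reference)
    pred_noise = model(noisy, t, line_drawing=line, reference=reference)
    return F.mse_loss(pred_noise, noise)

def finetune_step(model, scheduler, line, reference, target, recon_weight=1.0):
    """Stage 2: add an image-reconstruction term by estimating the clean
    image from the predicted noise and comparing it with the ground truth."""
    noise = torch.randn_like(target)
    t = torch.randint(0, scheduler.num_steps, (target.shape[0],), device=target.device)
    noisy = scheduler.add_noise(target, noise, t)
    pred_noise = model(noisy, t, line_drawing=line, reference=reference)
    denoise_loss = F.mse_loss(pred_noise, noise)
    # One-step estimate of x0 from the noise prediction (DDPM parameterization).
    alpha_bar = scheduler.alphas_cumprod[t].view(-1, 1, 1, 1)
    x0_hat = (noisy - torch.sqrt(1.0 - alpha_bar) * pred_noise) / torch.sqrt(alpha_bar)
    recon_loss = F.l1_loss(x0_hat, target)
    return denoise_loss + recon_weight * recon_loss
```

At sampling time, a CFG-trained model of this kind would, as usual, mix the conditional and unconditional noise predictions with a guidance scale; the reconstruction term in stage 2 directly supervises the estimated clean image, which is one plausible reason only a few fine-tuning iterations are reported to be needed.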
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z) - Paint Bucket Colorization Using Anime Character Color Design Sheets [72.66788521378864]
We introduce inclusion matching, which allows the network to understand the relationships between segments.
Our network's training pipeline significantly improves performance in both colorization and consecutive frame colorization.
To support our network's training, we have developed a unique dataset named PaintBucket-Character.
arXiv Detail & Related papers (2024-10-25T09:33:27Z) - FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage several relatively small, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - Learning Inclusion Matching for Animation Paint Bucket Colorization [76.4507878427755]
We introduce a new learning-based inclusion matching pipeline, which directs the network to comprehend the inclusion relationships between segments.
Our method features a two-stage pipeline that integrates a coarse color warping module with an inclusion matching module.
To facilitate the training of our network, we also develop a unique dataset, referred to as PaintBucket-Character.
arXiv Detail & Related papers (2024-03-27T08:32:48Z) - APISR: Anime Production Inspired Real-World Anime Super-Resolution [15.501488335115269]
We argue that video networks and datasets are not necessary for anime SR due to the repeated use of hand-drawn frames.
Instead, we propose an anime image collection pipeline by choosing the least compressed and the most informative frames from the video sources.
We evaluate our method through extensive experiments on the public benchmark, showing our method outperforms state-of-the-art anime dataset-trained approaches.
arXiv Detail & Related papers (2024-03-03T19:52:43Z) - Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models [25.903156244291168]
Toon shading is a non-photorealistic rendering task for animation.
Diffutoon is capable of rendering remarkably detailed, high-resolution, and extended-duration videos in anime style.
arXiv Detail & Related papers (2024-01-29T15:21:37Z) - Fine-Tuning InstructPix2Pix for Advanced Image Colorization [3.4975669723257035]
This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model.
We fine-tune the model using the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT.
After fine-tuning, our model quantitatively outperforms the original InstructPix2Pix model on multiple metrics.
arXiv Detail & Related papers (2023-12-08T01:36:49Z) - Deep Geometrized Cartoon Line Inbetweening [98.35956631655357]
Inbetweening involves generating intermediate frames between two black-and-white line drawings.
Existing frame interpolation methods that rely on matching and warping whole images are unsuitable for line inbetweening.
We propose AnimeInbet, which geometrizes raster line drawings into graphs of endpoints and reframes the inbetweening task as a graph fusion problem.
Our method can effectively capture the sparsity and unique structure of line drawings while preserving the details during inbetweening.
arXiv Detail & Related papers (2023-09-28T17:50:05Z) - Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563]
We introduce a colorization model piggybacking on the existing powerful T2I diffusion model.
A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model.
A lightness-aware VQVAE then generates the colorized result with pixel-perfect alignment to the given grayscale image.
arXiv Detail & Related papers (2023-04-21T16:23:24Z) - Learning 3D Photography Videos via Self-supervised Diffusion on Single Images [105.81348348510551]
3D photography renders a static image into a video with appealing 3D visual effects.
Existing approaches typically first conduct monocular depth estimation, then render the input frame to subsequent frames with various viewpoints.
We present a novel task: out-animation, which extends the space and time of input objects.
arXiv Detail & Related papers (2023-02-21T16:18:40Z)