AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion
Models
- URL: http://arxiv.org/abs/2303.11137v1
- Date: Mon, 20 Mar 2023 14:15:23 GMT
- Title: AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion
Models
- Authors: Yu Cao, Xiangqiao Meng, P.Y. Mok, Xueting Liu, Tong-Yee Lee, Ping Li
- Abstract summary: We propose a novel method called AnimeDiffusion using diffusion models that performs anime face line drawing colorization automatically.
We conduct an anime face line drawing colorization benchmark dataset, which contains 31696 training data and 579 testing data.
We demonstrate AnimeDiffusion outperforms state-of-the-art GANs-based models for anime face drawing colorization.
- Score: 24.94532405404846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is a time-consuming and tedious work for manually colorizing anime line
drawing images, which is an essential stage in cartoon animation creation
pipeline. Reference-based line drawing colorization is a challenging task that
relies on the precise cross-domain long-range dependency modelling between the
line drawing and reference image. Existing learning methods still utilize
generative adversarial networks (GANs) as one key module of their model
architecture. In this paper, we propose a novel method called AnimeDiffusion
using diffusion models that performs anime face line drawing colorization
automatically. To the best of our knowledge, this is the first diffusion model
tailored for anime content creation. In order to solve the huge training
consumption problem of diffusion models, we design a hybrid training strategy,
first pre-training a diffusion model with classifier-free guidance and then
fine-tuning it with image reconstruction guidance. We find that with a few
iterations of fine-tuning, the model shows wonderful colorization performance,
as illustrated in Fig. 1. For training AnimeDiffusion, we conduct an anime face
line drawing colorization benchmark dataset, which contains 31696 training data
and 579 testing data. We hope this dataset can fill the gap of no available
high resolution anime face dataset for colorization method evaluation. Through
multiple quantitative metrics evaluated on our dataset and a user study, we
demonstrate AnimeDiffusion outperforms state-of-the-art GANs-based models for
anime face line drawing colorization. We also collaborate with professional
artists to test and apply our AnimeDiffusion for their creation work. We
release our code on https://github.com/xq-meng/AnimeDiffusion.
Related papers
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - Learning Inclusion Matching for Animation Paint Bucket Colorization [76.4507878427755]
We introduce a new learning-based inclusion matching pipeline, which directs the network to comprehend the inclusion relationships between segments.
Our method features a two-stage pipeline that integrates a coarse color warping module with an inclusion matching module.
To facilitate the training of our network, we also develope a unique dataset, referred to as PaintBucket-Character.
arXiv Detail & Related papers (2024-03-27T08:32:48Z) - Improving Diffusion Models for Virtual Try-on [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z) - APISR: Anime Production Inspired Real-World Anime Super-Resolution [15.501488335115269]
We argue that video networks and datasets are not necessary for anime SR due to the repetition use of hand-drawing frames.
Instead, we propose an anime image collection pipeline by choosing the least compressed and the most informative frames from the video sources.
We evaluate our method through extensive experiments on the public benchmark, showing our method outperforms state-of-the-art anime dataset-trained approaches.
arXiv Detail & Related papers (2024-03-03T19:52:43Z) - Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models [25.903156244291168]
Toon shading is a type of non-photorealistic rendering task of animation.
Diffutoon is capable of rendering remarkably detailed, high-resolution, and extended-duration videos in anime style.
arXiv Detail & Related papers (2024-01-29T15:21:37Z) - Fine-Tuning InstructPix2Pix for Advanced Image Colorization [3.4975669723257035]
This paper presents a novel approach to human image colorization by fine-tuning the InstructPix2Pix model.
We fine-tune the model using the IMDB-WIKI dataset, pairing black-and-white images with a diverse set of colorization prompts generated by ChatGPT.
After finetuning, our model outperforms the original InstructPix2Pix model on multiple metrics quantitatively.
arXiv Detail & Related papers (2023-12-08T01:36:49Z) - Deep Geometrized Cartoon Line Inbetweening [98.35956631655357]
Inbetweening involves generating intermediate frames between two black-and-white line drawings.
Existing frame methods that rely on matching and warping whole images are unsuitable for line inbetweening.
We propose AnimeInbet, which geometrizes geometric line drawings into endpoints and reframes the inbetweening task as a graph fusion problem.
Our method can effectively capture the sparsity and unique structure of line drawings while preserving the details during inbetweening.
arXiv Detail & Related papers (2023-09-28T17:50:05Z) - Improved Diffusion-based Image Colorization via Piggybacked Models [19.807766482434563]
We introduce a colorization model piggybacking on the existing powerful T2I diffusion model.
A diffusion guider is designed to incorporate the pre-trained weights of the latent diffusion model.
A lightness-aware VQVAE will then generate the colorized result with pixel-perfect alignment to the given grayscale image.
arXiv Detail & Related papers (2023-04-21T16:23:24Z) - Learning 3D Photography Videos via Self-supervised Diffusion on Single
Images [105.81348348510551]
3D photography renders a static image into a video with appealing 3D visual effects.
Existing approaches typically first conduct monocular depth estimation, then render the input frame to subsequent frames with various viewpoints.
We present a novel task: out-animation, which extends the space and time of input objects.
arXiv Detail & Related papers (2023-02-21T16:18:40Z) - Diffusion Art or Digital Forgery? Investigating Data Replication in
Diffusion Models [53.03978584040557]
We study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated.
Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication.
arXiv Detail & Related papers (2022-12-07T18:58:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.