Instance-guided Cartoon Editing with a Large-scale Dataset
- URL: http://arxiv.org/abs/2312.01943v1
- Date: Mon, 4 Dec 2023 15:00:15 GMT
- Title: Instance-guided Cartoon Editing with a Large-scale Dataset
- Authors: Jian Lin, Chengze Li, Xueting Liu and Zhongping Ge
- Abstract summary: We present an instance-aware image segmentation model that can generate accurate, high-resolution segmentation masks for characters in cartoon images.
We show that the proposed approach enables a range of segmentation-dependent cartoon editing applications such as 3D Ken Burns parallax effects, text-guided cartoon style editing, and puppet animation from illustrations and manga.
- Score: 12.955181769243232
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Cartoon editing, appreciated by both professional illustrators and hobbyists,
allows extensive creative freedom and the development of original narratives
within the cartoon domain. However, the existing literature on cartoon editing
is complex and leans heavily on manual operations, owing to the challenge of
automatic identification of individual character instances. Therefore, an
automated segmentation of these elements becomes imperative to facilitate a
variety of cartoon editing applications such as visual style editing, motion
decomposition and transfer, and the computation of stereoscopic depths for an
enriched visual experience. Unfortunately, most current segmentation methods
are designed for natural photographs and fail to handle the intricate
aesthetics of cartoon subjects, thus lowering segmentation quality. The major
challenge stems from two key shortcomings: the rarity of high-quality cartoon
dedicated datasets and the absence of competent models for high-resolution
instance extraction on cartoons. To address this, we introduce a high-quality
dataset of over 100k paired high-resolution cartoon images and their instance
labeling masks. We also present an instance-aware image segmentation model that
can generate accurate, high-resolution segmentation masks for characters in
cartoon images. We show that the proposed approach enables a range of
segmentation-dependent cartoon editing applications such as 3D Ken Burns parallax
effects, text-guided cartoon style editing, and puppet animation from
illustrations and manga.
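As a concrete illustration of the segmentation-dependent pipeline the abstract describes, here is a minimal sketch that splits a cartoon frame into per-character layers from instance masks. The paper's dataset and model are not reproduced here, so torchvision's photo-trained Mask R-CNN stands in for the cartoon-specific model (the abstract argues precisely that such photo-trained models underperform on cartoons); the input file name and score threshold are hypothetical.

```python
# Minimal sketch of a segmentation-dependent editing step: split a cartoon
# frame into per-character layers, the prerequisite for effects such as the
# 3D Ken Burns parallax or puppet animation named in the abstract.
# torchvision's photo-trained Mask R-CNN is only a stand-in for the paper's
# cartoon-specific model; "cartoon_frame.png" is a hypothetical input.
import torch
from torchvision.io import read_image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

img = convert_image_dtype(read_image("cartoon_frame.png"), torch.float)
with torch.no_grad():
    pred = model([img])[0]

keep = pred["scores"] > 0.7                  # confident detections only
masks = pred["masks"][keep, 0] > 0.5         # (N, H, W) boolean instance masks

# One RGBA layer per character instance; downstream edits (parallax,
# restyling, rigging) operate on these layers independently.
layers = [torch.cat([img, m[None].float()]) for m in masks]
```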
Related papers
- Sakuga-42M Dataset: Scaling Up Cartoon Research [4.676528353567339]
Sakuga-42M comprises 42 million keyframes covering various artistic styles, regions, and years, with comprehensive semantic annotations.
Our motivation is to introduce large-scale data to cartoon research and to foster generalization and robustness in future cartoon applications.
arXiv Detail & Related papers (2024-05-13T01:50:05Z)
- AniClipart: Clipart Animation with Text-to-Video Priors [28.76809141136148]
We introduce AniClipart, a system that transforms static images into high-quality motion sequences guided by text-to-video priors.
Experimental results show that the proposed AniClipart consistently outperforms existing image-to-video generation models.
arXiv Detail & Related papers (2024-04-18T17:24:28Z)
- AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment [64.02822911038848]
We present AnimateZoo, a zero-shot diffusion-based video generator to produce animal animations.
The key technique used in our AnimateZoo is subject alignment, which includes two steps.
Our model is capable of generating videos characterized by accurate movements, consistent appearance, and high-fidelity frames.
arXiv Detail & Related papers (2024-04-07T12:57:41Z)
- CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models [5.830731563895666]
We present CartoonDiff, a novel training-free sampling approach that performs image cartoonization using diffusion transformer models.
We implement the image cartoonization process by normalizing the high-frequency signal of the noisy image at specific denoising steps (a toy version is sketched after this entry).
arXiv Detail & Related papers (2023-09-15T08:55:59Z)
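The summary above only names the mechanism, so here is a toy sketch of the general idea. The cutoff radius, the unit-magnitude normalization, and where in the sampler it runs are illustrative guesses, not CartoonDiff's actual settings.

```python
# Toy sketch of normalizing the high-frequency band of an intermediate
# noisy image during denoising, in the spirit of the CartoonDiff summary.
# The cutoff radius and unit-magnitude normalization are illustrative
# guesses, not the paper's actual procedure.
import torch

def normalize_high_freq(x: torch.Tensor, radius_frac: float = 0.25) -> torch.Tensor:
    """x: (C, H, W) intermediate image; flatten fine texture by normalizing
    spectral magnitudes outside a centered low-pass radius."""
    C, H, W = x.shape
    f = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.arange(H) - H // 2, torch.arange(W) - W // 2, indexing="ij"
    )
    high = (yy.float() ** 2 + xx.float() ** 2).sqrt() > radius_frac * min(H, W)
    f[:, high] = f[:, high] / (f[:, high].abs() + 1e-8)  # unit magnitude
    return torch.fft.ifft2(torch.fft.ifftshift(f, dim=(-2, -1))).real

# A sampler would apply this only at selected denoising steps, e.g.:
#   if t in cartoonize_steps:
#       x = normalize_high_freq(x)
```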
- Interactive Cartoonization with Controllable Perceptual Factors [5.8641445422054765]
We propose a novel solution with texture and color editing features based on the cartoon creation process.
In the texture decoder, we propose a texture controller, which enables a user to control stroke style and abstraction to generate diverse cartoon textures.
We also introduce an HSV color augmentation that induces the networks to generate diverse and controllable color translations (see the sketch after this entry).
arXiv Detail & Related papers (2022-12-19T15:45:47Z)
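A minimal sketch of an HSV-style color augmentation like the one the summary names. The jitter ranges are illustrative rather than the paper's, and the value channel is approximated here with a brightness jitter.

```python
# Minimal sketch of an HSV color augmentation in the spirit of the summary:
# randomly jitter hue/saturation/value of training images so the translation
# network sees (and learns to control) diverse colorings. Ranges are
# illustrative, not the paper's; V is approximated with brightness.
import torch
import torchvision.transforms.functional as TF

def hsv_augment(img: torch.Tensor, h: float = 0.1, s: float = 0.3, v: float = 0.3):
    """img: (C, H, W) float tensor in [0, 1]."""
    rand = lambda a: (torch.rand(1).item() * 2 - 1) * a  # uniform in [-a, a]
    img = TF.adjust_hue(img, rand(h))                    # hue shift
    img = TF.adjust_saturation(img, 1 + rand(s))         # saturation scale
    img = TF.adjust_brightness(img, 1 + rand(v))         # value/brightness scale
    return img
```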
- Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization [20.578335938736384]
A novel cartoon-texture-saliency-sampler (CTSS) module is proposed to dynamically sample cartoon-texture-salient patches from training data (a toy analogue is sketched after this entry).
With extensive experiments, we demonstrate that texture-saliency adaptive attention in adversarial learning is of significant importance in facilitating and enhancing image cartoonization.
arXiv Detail & Related papers (2022-08-02T16:45:55Z)
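A toy, training-free analogue of what the CTSS summary describes: score candidate crops by a simple texture statistic and sample in proportion to it. The real module is learned inside adversarial training; the Laplacian-energy proxy, patch size, and candidate count here are only illustrative.

```python
# Toy analogue of texture-saliency-aware patch sampling: draw random crops,
# score each by high-frequency (Laplacian) energy, and pick a training patch
# in proportion to that score. The actual CTSS module is learned within
# adversarial training; this proxy is only illustrative.
import torch
import torch.nn.functional as F

def sample_salient_patch(img: torch.Tensor, patch: int = 96, n_cand: int = 32):
    """img: (C, H, W) float tensor; returns one texture-salient crop."""
    C, H, W = img.shape
    lap = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]]).view(1, 1, 3, 3).repeat(C, 1, 1, 1)
    ys = torch.randint(0, H - patch + 1, (n_cand,))
    xs = torch.randint(0, W - patch + 1, (n_cand,))
    crops = torch.stack([img[:, y:y + patch, x:x + patch] for y, x in zip(ys, xs)])
    energy = F.conv2d(crops, lap, groups=C).abs().mean(dim=(1, 2, 3))
    idx = torch.multinomial(energy / energy.sum(), 1).item()
    return crops[idx]
```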
- Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency [89.75731026852338]
We propose a spatially-adaptive alignment framework with perceptual motion consistency for coherent video cartoonization.
We devise a semantic correlative map as a style-independent, globally-aware regularization for perceptual motion consistency.
Our method is able to generate highly stylistic and temporally consistent cartoon videos.
arXiv Detail & Related papers (2022-04-02T07:59:02Z)
- DoodleFormer: Creative Sketch Drawing with Transformers [68.18953603715514]
Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn.
Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes creative sketch generation into the creation of a coarse sketch composition followed by the incorporation of fine details.
To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder.
arXiv Detail & Related papers (2021-12-06T18:59:59Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
- Learning to Caricature via Semantic Shape Transform [95.25116681761142]
We propose an algorithm based on a semantic shape transform to produce shape exaggerations.
We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures.
arXiv Detail & Related papers (2020-08-12T03:41:49Z)
- Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches [133.01690754567252]
Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by the human-drawn sketches.
Deep Plastic Surgery is a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs.
arXiv Detail & Related papers (2020-01-09T08:57:50Z)