Instance-guided Cartoon Editing with a Large-scale Dataset
- URL: http://arxiv.org/abs/2312.01943v1
- Date: Mon, 4 Dec 2023 15:00:15 GMT
- Title: Instance-guided Cartoon Editing with a Large-scale Dataset
- Authors: Jian Lin, Chengze Li, Xueting Liu and Zhongping Ge
- Abstract summary: We present an instance-aware image segmentation model that can generate accurate, high-resolution segmentation masks for characters in cartoon images.
We show that the proposed approach enables a range of segmentation-dependent cartoon editing applications, such as 3D Ken Burns parallax effects, text-guided cartoon style editing, and puppet animation from illustrations and manga.
- Score: 12.955181769243232
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Cartoon editing, appreciated by both professional illustrators and hobbyists,
allows extensive creative freedom and the development of original narratives
within the cartoon domain. However, the existing literature on cartoon editing
is complex and leans heavily on manual operations, owing to the challenge of
automatic identification of individual character instances. Therefore, an
automated segmentation of these elements becomes imperative to facilitate a
variety of cartoon editing applications such as visual style editing, motion
decomposition and transfer, and the computation of stereoscopic depths for an
enriched visual experience. Unfortunately, most current segmentation methods are designed for natural photographs and fail to cope with the intricate aesthetics of cartoon subjects, which lowers segmentation quality. The major
challenge stems from two key shortcomings: the rarity of high-quality cartoon
dedicated datasets and the absence of competent models for high-resolution
instance extraction on cartoons. To address this, we introduce a high-quality
dataset of over 100k paired high-resolution cartoon images and their instance
labeling masks. We also present an instance-aware image segmentation model that
can generate accurate, high-resolution segmentation masks for characters in
cartoon images. We show that the proposed approach enables a range of
segmentation-dependent cartoon editing applications like 3D Ken Burns parallax
effects, text-guided cartoon style editing, and puppet animation from
illustrations and manga.
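As a concrete illustration of a segmentation-dependent edit, here is a minimal sketch of a parallax pan driven by an instance mask, in the spirit of the 3D Ken Burns effect: the character layer and the background are shifted at different rates and recomposited. This is not the authors' pipeline; the file names and the simple horizontal shift are assumptions, and a real implementation would derive layer offsets from estimated depth and inpaint the revealed regions.

```python
import numpy as np
from PIL import Image

# Hypothetical inputs: a cartoon frame and a binary instance mask for one
# character, as produced by an instance segmentation model.
image = np.asarray(Image.open("frame.png").convert("RGB"), dtype=np.float32)
mask = np.asarray(Image.open("mask.png").convert("L"), dtype=np.float32) / 255.0

def parallax_frame(img, m, shift_fg, shift_bg):
    """Composite a frame where the character moves faster than the background."""
    fg = np.roll(img, shift_fg, axis=1)                # character layer, shifted
    bg = np.roll(img, shift_bg, axis=1)                # background, shifted less
    fg_mask = np.roll(m, shift_fg, axis=1)[..., None]  # mask follows the character
    return fg * fg_mask + bg * (1.0 - fg_mask)

# Render a short pan; a real 3D Ken Burns pipeline would also inpaint
# the disocclusions left behind the moving character.
for t in range(10):
    frame = parallax_frame(image, mask, shift_fg=3 * t, shift_bg=t)
    Image.fromarray(frame.astype(np.uint8)).save(f"parallax_{t:02d}.png")
```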
Related papers
- Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters [86.13319549186959]
We present Make-It-Animatable, a novel data-driven method to make any 3D humanoid model ready for character animation in less than one second.
Our framework generates high-quality blend weights, bones, and pose transformations.
Compared to existing methods, our approach demonstrates significant improvements in both quality and speed.
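The summary above mentions blend weights, bones, and pose transformations; the snippet below shows the textbook linear blend skinning step in which those three outputs combine to pose a mesh. It is a generic formulation under assumed array shapes, not the paper's method.

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Pose rest-pose vertices using blend weights and per-bone transforms.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) blend weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous transforms for the target pose
    """
    V = vertices.shape[0]
    homo = np.concatenate([vertices, np.ones((V, 1))], axis=1)  # (V, 4)
    per_bone = np.einsum("bij,vj->bvi", bone_transforms, homo)  # (B, V, 4)
    # Blend: v' = sum_b  w[v, b] * (T_b v)
    posed = np.einsum("vb,bvi->vi", weights, per_bone)
    return posed[:, :3]
```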
arXiv Detail & Related papers (2024-11-27T10:18:06Z)
- Sakuga-42M Dataset: Scaling Up Cartoon Research [4.676528353567339]
Sakuga-42M comprises 42 million keyframes covering various artistic styles, regions, and years, with comprehensive semantic annotations.
Our motivation is to bring large-scale data to cartoon research and foster generalization and robustness in future cartoon applications.
arXiv Detail & Related papers (2024-05-13T01:50:05Z)
- AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment [64.02822911038848]
We present AnimateZoo, a zero-shot diffusion-based video generator to produce animal animations.
The key technique in AnimateZoo is subject alignment, which consists of two steps.
Our model is capable of generating videos characterized by accurate movements, consistent appearance, and high-fidelity frames.
arXiv Detail & Related papers (2024-04-07T12:57:41Z)
- CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models [5.830731563895666]
We present CartoonDiff, a novel training-free sampling approach that performs image cartoonization with diffusion transformer models.
We implement the cartoonization process by normalizing the high-frequency signal of the noisy image at specific denoising steps.
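Read literally, this suggests splitting the noisy image into low- and high-frequency parts at chosen denoising steps and renormalizing the high-frequency residual. The sketch below does exactly that with a Gaussian low-pass; the blur radius, damping factor, and step schedule are assumptions, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalize_high_freq(x, sigma=2.0, damping=0.5, eps=1e-6):
    """Normalize and damp the high-frequency residual of an (H, W, C) array."""
    low = gaussian_filter(x, sigma=(sigma, sigma, 0))  # per-channel low-pass
    high = x - low                                     # high-frequency residual
    high = (high - high.mean()) / (high.std() + eps)   # re-centered, unit-variance
    return low + damping * high                        # flattens fine texture

# Inside a (hypothetical) diffusion sampling loop, apply only at chosen steps:
# for t in reversed(range(num_steps)):
#     x = denoise_step(x, t)            # one diffusion-transformer step (assumed)
#     if t in cartoonization_steps:     # the step schedule is an assumption
#         x = normalize_high_freq(x)
```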
arXiv Detail & Related papers (2023-09-15T08:55:59Z)
- Interactive Cartoonization with Controllable Perceptual Factors [5.8641445422054765]
We propose a novel solution that supports editing of texture and color, modeled on the cartoon creation process.
In the texture decoder, we propose a texture controller, which enables a user to control stroke style and abstraction to generate diverse cartoon textures.
We also introduce an HSV color augmentation that induces the networks to generate diverse and controllable color translations.
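A minimal version of such an HSV augmentation is sketched below: each training image gets a random hue shift plus random saturation and value scaling, so the network sees diverse color translations of the same content. The jitter ranges are illustrative, not the paper's.

```python
import numpy as np
from PIL import Image

def hsv_augment(img, rng, max_hue_shift=0.1,
                sat_range=(0.7, 1.3), val_range=(0.8, 1.2)):
    """Randomly jitter hue, saturation, and value of a PIL image."""
    hsv = np.asarray(img.convert("RGB").convert("HSV"), dtype=np.float32) / 255.0
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    h = (h + rng.uniform(-max_hue_shift, max_hue_shift)) % 1.0  # hue wraps around
    s = np.clip(s * rng.uniform(*sat_range), 0.0, 1.0)
    v = np.clip(v * rng.uniform(*val_range), 0.0, 1.0)
    out = (np.stack([h, s, v], axis=-1) * 255.0).astype(np.uint8)
    return Image.fromarray(out, mode="HSV").convert("RGB")

# Usage: augmented = hsv_augment(Image.open("cartoon.png"), np.random.default_rng(0))
```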
arXiv Detail & Related papers (2022-12-19T15:45:47Z)
- Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization [20.578335938736384]
A novel cartoon-texture-saliency-sampler (CTSS) module is proposed to dynamically sample cartoon-texture-salient patches from training data.
With extensive experiments, we demonstrate that texture-saliency adaptive attention in adversarial learning is of significant importance in facilitating and enhancing image cartoonization.
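As described, the CTSS idea amounts to sampling training patches in proportion to how texture-salient they are. A crude stand-in for that score is local gradient energy; the sketch below samples crops weighted by it. The saliency measure and crop size are placeholders, not the paper's module.

```python
import numpy as np

def sample_salient_patch(gray, rng, patch=64):
    """Sample a crop with probability proportional to its local gradient energy.

    gray: (H, W) grayscale image in [0, 1]; rng: np.random.Generator.
    """
    gy, gx = np.gradient(gray)
    energy = gx ** 2 + gy ** 2
    H, W = gray.shape
    # Score every candidate top-left corner by the energy inside its patch.
    corners = [(y, x) for y in range(0, H - patch + 1, patch // 2)
                      for x in range(0, W - patch + 1, patch // 2)]
    scores = np.array([energy[y:y + patch, x:x + patch].sum() for y, x in corners])
    y, x = corners[rng.choice(len(corners), p=scores / scores.sum())]
    return gray[y:y + patch, x:x + patch]
```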
arXiv Detail & Related papers (2022-08-02T16:45:55Z)
- Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency [89.75731026852338]
We propose a spatially-adaptive alignment framework with perceptual motion consistency for coherent video cartoonization.
We devise a semantic correlative map as a style-independent, globally aware regularization on perceptual motion consistency.
Our method is able to generate highly stylized and temporally consistent cartoon videos.
arXiv Detail & Related papers (2022-04-02T07:59:02Z)
- The Animation Transformer: Visual Correspondence via Segment Matching [2.8387322144750726]
Animation Transformer (AnT) uses a transformer-based architecture to learn the spatial and visual relationships between segments across a sequence of images.
AnT enables practical ML-assisted colorization for professional animation and is publicly accessible as a creative tool in Cadmium.
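Stripped of the transformer, segment matching reduces to comparing per-segment feature vectors across frames and propagating labels (e.g., colors) along the best matches. The sketch below does that with cosine similarity; how the segment features are computed is left abstract, since that is the part AnT learns.

```python
import numpy as np

def match_segments(feats_a, feats_b):
    """Match each segment in frame A to its nearest segment in frame B.

    feats_a: (Na, D) and feats_b: (Nb, D) per-segment feature vectors;
    in AnT these embeddings come from a learned transformer.
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T                  # (Na, Nb) cosine similarities
    return sim.argmax(axis=1)      # index of the best match per segment in A

# Usage sketch: propagate colors from a labeled reference frame.
# matches = match_segments(new_frame_feats, ref_frame_feats)
# new_frame_colors = ref_frame_colors[matches]
```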
arXiv Detail & Related papers (2021-09-06T17:23:40Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
- Learning to Caricature via Semantic Shape Transform [95.25116681761142]
We propose an algorithm based on a semantic shape transform to produce shape exaggerations.
We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures.
arXiv Detail & Related papers (2020-08-12T03:41:49Z)
- Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches [133.01690754567252]
Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by the human-drawn sketches.
Deep Plastic Surgery is a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs.
arXiv Detail & Related papers (2020-01-09T08:57:50Z)