Instance-guided Cartoon Editing with a Large-scale Dataset
- URL: http://arxiv.org/abs/2312.01943v1
- Date: Mon, 4 Dec 2023 15:00:15 GMT
- Title: Instance-guided Cartoon Editing with a Large-scale Dataset
- Authors: Jian Lin, Chengze Li, Xueting Liu and Zhongping Ge
- Abstract summary: We present an instance-aware image segmentation model that can generate accurate, high-resolution segmentation masks for characters in cartoon images.
We show that the proposed approach enables a range of segmentation-dependent cartoon editing applications such as 3D Ken Burns parallax effects, text-guided cartoon style editing, and puppet animation from illustrations and manga.
- Score: 12.955181769243232
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Cartoon editing, appreciated by both professional illustrators and hobbyists,
allows extensive creative freedom and the development of original narratives
within the cartoon domain. However, the existing literature on cartoon editing
is complex and leans heavily on manual operations, owing to the challenge of
automatic identification of individual character instances. Therefore, an
automated segmentation of these elements becomes imperative to facilitate a
variety of cartoon editing applications such as visual style editing, motion
decomposition and transfer, and the computation of stereoscopic depths for an
enriched visual experience. Unfortunately, most current segmentation methods
are designed for natural photographs and fail to handle the intricate
aesthetics of cartoon subjects, thus lowering segmentation quality. The major
challenge stems from two key shortcomings: the rarity of high-quality cartoon
dedicated datasets and the absence of competent models for high-resolution
instance extraction on cartoons. To address this, we introduce a high-quality
dataset of over 100k paired high-resolution cartoon images and their instance
labeling masks. We also present an instance-aware image segmentation model that
can generate accurate, high-resolution segmentation masks for characters in
cartoon images. We show that the proposed approach enables a range of
segmentation-dependent cartoon editing applications such as 3D Ken Burns parallax
effects, text-guided cartoon style editing, and puppet animation from
illustrations and manga.
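As a concrete illustration of the segmentation-dependent pipeline the abstract describes, here is a minimal sketch that splits a cartoon frame into per-character layers from instance masks. The paper's dataset and model are not reproduced here, so torchvision's photo-trained Mask R-CNN stands in for the cartoon-specific model (the abstract argues precisely that such photo-trained models underperform on cartoons); the input file name and score threshold are hypothetical.

```python
# Minimal sketch of a segmentation-dependent editing step: split a cartoon
# frame into per-character layers, the prerequisite for effects such as the
# 3D Ken Burns parallax or puppet animation named in the abstract.
# torchvision's photo-trained Mask R-CNN is only a stand-in for the paper's
# cartoon-specific model; "cartoon_frame.png" is a hypothetical input.
import torch
from torchvision.io import read_image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

img = convert_image_dtype(read_image("cartoon_frame.png"), torch.float)
with torch.no_grad():
    pred = model([img])[0]

keep = pred["scores"] > 0.7                  # confident detections only
masks = pred["masks"][keep, 0] > 0.5         # (N, H, W) boolean instance masks

# One RGBA layer per character instance; downstream edits (parallax,
# restyling, rigging) operate on these layers independently.
layers = [torch.cat([img, m[None].float()]) for m in masks]
```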
Related papers
- Sakuga-42M Dataset: Scaling Up Cartoon Research [4.676528353567339]
Sakuga-42M comprises 42 million keyframes covering various artistic styles, regions, and years, with comprehensive semantic annotations.
Our motivation is to introduce large-scale data to cartoon research and to foster generalization and robustness in future cartoon applications.
arXiv Detail & Related papers (2024-05-13T01:50:05Z)
- AniClipart: Clipart Animation with Text-to-Video Priors [28.76809141136148]
We introduce AniClipart, a system that transforms static images into high-quality motion sequences guided by text-to-video priors.
Experimental results show that the proposed AniClipart consistently outperforms existing image-to-video generation models.
arXiv Detail & Related papers (2024-04-18T17:24:28Z)
- AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment [64.02822911038848]
We present AnimateZoo, a zero-shot diffusion-based video generator to produce animal animations.
The key technique used in our AnimateZoo is subject alignment, which includes two steps.
Our model is capable of generating videos characterized by accurate movements, consistent appearance, and high-fidelity frames.
arXiv Detail & Related papers (2024-04-07T12:57:41Z)
- CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models [5.830731563895666]
We present CartoonDiff, a novel training-free sampling approach that performs image cartoonization using diffusion transformer models.
We implement the image cartoonization process by normalizing the high-frequency signal of the noisy image at specific denoising steps (a toy version is sketched after this entry).
arXiv Detail & Related papers (2023-09-15T08:55:59Z)
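The summary above only names the mechanism, so here is a toy sketch of the general idea. The cutoff radius, the unit-magnitude normalization, and where in the sampler it runs are illustrative guesses, not CartoonDiff's actual settings.

```python
# Toy sketch of normalizing the high-frequency band of an intermediate
# noisy image during denoising, in the spirit of the CartoonDiff summary.
# The cutoff radius and unit-magnitude normalization are illustrative
# guesses, not the paper's actual procedure.
import torch

def normalize_high_freq(x: torch.Tensor, radius_frac: float = 0.25) -> torch.Tensor:
    """x: (C, H, W) intermediate image; flatten fine texture by normalizing
    spectral magnitudes outside a centered low-pass radius."""
    C, H, W = x.shape
    f = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.arange(H) - H // 2, torch.arange(W) - W // 2, indexing="ij"
    )
    high = (yy.float() ** 2 + xx.float() ** 2).sqrt() > radius_frac * min(H, W)
    f[:, high] = f[:, high] / (f[:, high].abs() + 1e-8)  # unit magnitude
    return torch.fft.ifft2(torch.fft.ifftshift(f, dim=(-2, -1))).real

# A sampler would apply this only at selected denoising steps, e.g.:
#   if t in cartoonize_steps:
#       x = normalize_high_freq(x)
```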
- Interactive Cartoonization with Controllable Perceptual Factors [5.8641445422054765]
We propose a novel solution with texture and color editing features based on the cartoon creation process.
In the texture decoder, we propose a texture controller, which enables a user to control stroke style and abstraction to generate diverse cartoon textures.
We also introduce an HSV color augmentation that induces the networks to generate diverse and controllable color translations (see the sketch after this entry).
arXiv Detail & Related papers (2022-12-19T15:45:47Z)
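A minimal sketch of an HSV-style color augmentation like the one the summary names. The jitter ranges are illustrative rather than the paper's, and the value channel is approximated here with a brightness jitter.

```python
# Minimal sketch of an HSV color augmentation in the spirit of the summary:
# randomly jitter hue/saturation/value of training images so the translation
# network sees (and learns to control) diverse colorings. Ranges are
# illustrative, not the paper's; V is approximated with brightness.
import torch
import torchvision.transforms.functional as TF

def hsv_augment(img: torch.Tensor, h: float = 0.1, s: float = 0.3, v: float = 0.3):
    """img: (C, H, W) float tensor in [0, 1]."""
    rand = lambda a: (torch.rand(1).item() * 2 - 1) * a  # uniform in [-a, a]
    img = TF.adjust_hue(img, rand(h))                    # hue shift
    img = TF.adjust_saturation(img, 1 + rand(s))         # saturation scale
    img = TF.adjust_brightness(img, 1 + rand(v))         # value/brightness scale
    return img
```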
- Learning to Incorporate Texture Saliency Adaptive Attention to Image Cartoonization [20.578335938736384]
A novel cartoon-texture-saliency-sampler (CTSS) module is proposed to dynamically sample cartoon-texture-salient patches from training data (a toy analogue is sketched after this entry).
With extensive experiments, we demonstrate that texture-saliency adaptive attention in adversarial learning is of significant importance in facilitating and enhancing image cartoonization.
arXiv Detail & Related papers (2022-08-02T16:45:55Z)
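A toy, training-free analogue of what the CTSS summary describes: score candidate crops by a simple texture statistic and sample in proportion to it. The real module is learned inside adversarial training; the Laplacian-energy proxy, patch size, and candidate count here are only illustrative.

```python
# Toy analogue of texture-saliency-aware patch sampling: draw random crops,
# score each by high-frequency (Laplacian) energy, and pick a training patch
# in proportion to that score. The actual CTSS module is learned within
# adversarial training; this proxy is only illustrative.
import torch
import torch.nn.functional as F

def sample_salient_patch(img: torch.Tensor, patch: int = 96, n_cand: int = 32):
    """img: (C, H, W) float tensor; returns one texture-salient crop."""
    C, H, W = img.shape
    lap = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]]).view(1, 1, 3, 3).repeat(C, 1, 1, 1)
    ys = torch.randint(0, H - patch + 1, (n_cand,))
    xs = torch.randint(0, W - patch + 1, (n_cand,))
    crops = torch.stack([img[:, y:y + patch, x:x + patch] for y, x in zip(ys, xs)])
    energy = F.conv2d(crops, lap, groups=C).abs().mean(dim=(1, 2, 3))
    idx = torch.multinomial(energy / energy.sum(), 1).item()
    return crops[idx]
```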
- Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency [89.75731026852338]
We propose a spatially-adaptive alignment framework with perceptual motion consistency for coherent video cartoonization.
We devise a semantic correlative map as a style-independent, globally-aware regularization for perceptual motion consistency.
Our method is able to generate highly stylistic and temporally consistent cartoon videos.
arXiv Detail & Related papers (2022-04-02T07:59:02Z)
- DoodleFormer: Creative Sketch Drawing with Transformers [68.18953603715514]
Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn.
Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes creative sketch generation into the creation of a coarse sketch composition followed by the incorporation of fine details.
To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder.
arXiv Detail & Related papers (2021-12-06T18:59:59Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
- Learning to Caricature via Semantic Shape Transform [95.25116681761142]
We propose an algorithm based on a semantic shape transform to produce shape exaggerations.
We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures.
arXiv Detail & Related papers (2020-08-12T03:41:49Z)
- Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches [133.01690754567252]
Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by the human-drawn sketches.
Deep Plastic Surgery is a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs.
arXiv Detail & Related papers (2020-01-09T08:57:50Z)