See-through: Single-image Layer Decomposition for Anime Characters
- URL: http://arxiv.org/abs/2602.03749v1
- Date: Tue, 03 Feb 2026 17:12:36 GMT
- Title: See-through: Single-image Layer Decomposition for Anime Characters
- Authors: Jian Lin, Chengze Li, Haoyun Qin, Kwun Wang Chan, Yanghua Jin, Hanyuan Liu, Stephen Chun Wang Choy, Xueting Liu
- Abstract summary: We introduce a framework that automates the transformation of static anime illustrations into manipulatable 2.5D models. It decomposes a single image into fully inpainted, semantically distinct layers with inferred drawing orders. We demonstrate that our approach yields high-fidelity, manipulatable models suitable for professional, real-time animation applications.
- Score: 11.629918493740263
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce a framework that automates the transformation of static anime illustrations into manipulatable 2.5D models. Current professional workflows require tedious manual segmentation and the artistic "hallucination" of occluded regions to enable motion. Our approach overcomes this by decomposing a single image into fully inpainted, semantically distinct layers with inferred drawing orders. To address the scarcity of training data, we introduce a scalable engine that bootstraps high-quality supervision from commercial Live2D models, capturing pixel-perfect semantics and hidden geometry. Our methodology couples a diffusion-based Body Part Consistency Module, which enforces global geometric coherence, with a pixel-level pseudo-depth inference mechanism. This combination resolves the intricate stratification of anime characters, e.g., interleaving hair strands, allowing for dynamic layer reconstruction. We demonstrate that our approach yields high-fidelity, manipulatable models suitable for professional, real-time animation applications.
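The pipeline shape the abstract implies, segment the character into parts, inpaint each layer's occluded pixels, then infer a drawing order from per-pixel pseudo-depth, can be written down compactly. The sketch below is a minimal Python illustration under those assumptions; `segment_parts`, `inpaint_occluded`, and `estimate_pseudo_depth` are hypothetical stand-ins for the paper's learned modules, and the median-depth ordering rule is a simplification for illustration, not the paper's mechanism.

```python
# Minimal sketch of the decomposition pipeline the abstract describes:
# segment -> inpaint each layer -> order layers by pseudo-depth -> recomposite.
# The three callables passed in are hypothetical placeholders standing in for
# the paper's learned modules; they are NOT a released API.
import numpy as np

def decompose(image, segment_parts, inpaint_occluded, estimate_pseudo_depth):
    """Split one (H, W, 3) uint8 illustration into depth-ordered RGBA layers."""
    masks = segment_parts(image)           # list of (H, W) boolean part masks
    depth = estimate_pseudo_depth(image)   # (H, W) float per-pixel pseudo-depth
    ordered = []
    for mask in masks:
        h, w = image.shape[:2]
        rgba = np.zeros((h, w, 4), dtype=np.uint8)
        rgba[mask, :3] = image[mask]       # copy visible pixels into the layer
        rgba[mask, 3] = 255
        rgba = inpaint_occluded(rgba, mask)  # fill pixels hidden by other parts
        # Illustrative ordering rule: median pseudo-depth over the visible mask.
        ordered.append((float(np.median(depth[mask])), rgba))
    ordered.sort(key=lambda item: item[0], reverse=True)  # far layers first
    return [rgba for _, rgba in ordered]

def composite(layers):
    """Back-to-front alpha compositing; re-stacking checks the drawing order."""
    out = np.zeros(layers[0].shape, dtype=np.float32)
    for rgba in layers:
        alpha = rgba[..., 3:4].astype(np.float32) / 255.0
        out[..., :3] = alpha * rgba[..., :3] + (1.0 - alpha) * out[..., :3]
        out[..., 3:4] = np.maximum(out[..., 3:4], alpha * 255.0)
    return out.astype(np.uint8)
```

With depth-ordered, fully inpainted RGBA layers in hand, re-compositing back to front should reproduce the input, and per-layer warping or parallax gives the manipulatable 2.5D behavior the abstract targets.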
Related papers
- StdGEN++: A Comprehensive System for Semantic-Decomposed 3D Character Generation [57.06461272772509]
StdGEN++ is a novel and comprehensive system for generating high-fidelity, semantically decomposed 3D characters from diverse inputs. It achieves state-of-the-art performance, significantly outperforming existing methods in geometric accuracy and semantic disentanglement. The resulting structural independence unlocks advanced downstream capabilities, including non-destructive editing, physics-compliant animation, and gaze tracking.
arXiv Detail & Related papers (2026-01-12T15:41:27Z)
- LayerGS: Decomposition and Inpainting of Layered 3D Human Avatars via 2D Gaussian Splatting [0.7176107039687231]
We propose a novel framework for decomposing arbitrarily posed humans into animatable multi-layered 3D human avatars. Our approach achieves better rendering quality, layer decomposition, and recomposition than the previous state of the art.
arXiv Detail & Related papers (2026-01-09T15:30:12Z)
- Human Geometry Distribution for 3D Animation Generation [49.58025398670139]
We propose two novel designs to generate realistic human geometry animations. First, we propose a compact distribution-based latent representation that enables efficient and high-quality geometry generation. Second, we introduce a generative animation model that fully exploits the diversity of limited motion data.
arXiv Detail & Related papers (2025-12-08T11:35:16Z)
- ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing [60.81602269917522]
ToonComposer is a generative model that unifies inbetweening and colorization into a single post-keyframing stage. Requiring as few as a single sketch and a colored reference frame, ToonComposer excels with sparse inputs. Our evaluation demonstrates that ToonComposer outperforms existing methods in visual quality, motion consistency, and production efficiency.
arXiv Detail & Related papers (2025-08-14T17:50:11Z)
- DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses [57.17501809717155]
We present DreamDance, a novel method for animating human images using only skeleton pose sequences as conditional inputs. Our key insight is that human images naturally exhibit multiple levels of correlation. We construct the TikTok-Dance5K dataset, comprising 5K high-quality dance videos with detailed frame annotations.
arXiv Detail & Related papers (2024-11-30T08:42:13Z)
- Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters [86.13319549186959]
We present Make-It-Animatable, a novel data-driven method to make any 3D humanoid model ready for character animation in less than one second. Our framework generates high-quality blend weights, bones, and pose transformations. Compared to existing methods, our approach demonstrates significant improvements in both quality and speed.
arXiv Detail & Related papers (2024-11-27T10:18:06Z)
- Zero-shot High-fidelity and Pose-controllable Character Animation [89.74818983864832]
Image-to-video (I2V) generation aims to create a video sequence from a single image.
Existing approaches suffer from inconsistency of character appearances and poor preservation of fine details.
We propose PoseAnimate, a novel zero-shot I2V framework for character animation.
arXiv Detail & Related papers (2024-04-21T14:43:31Z)
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model [74.84435399451573]
This paper studies the human image animation task, which aims to generate a video of a certain reference identity following a particular motion sequence.
Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion.
We introduce MagicAnimate, a diffusion-based framework that aims at enhancing temporal consistency, preserving reference image faithfully, and improving animation fidelity.
arXiv Detail & Related papers (2023-11-27T18:32:31Z) - ObjectStitch: Generative Object Compositing [43.206123360578665]
We propose a self-supervised framework for object compositing using conditional diffusion models.
Our framework can transform the viewpoint, geometry, color and shadow of the generated object while requiring no manual labeling.
Our method outperforms relevant baselines in both realism and faithfulness of the synthesized result images in a user study on various real-world images.
arXiv Detail & Related papers (2022-12-02T02:15:13Z)
- Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency [89.75731026852338]
We propose a spatially-adaptive alignment framework with perceptual motion consistency for coherent video cartoonization.
We devise a semantic correlative map as a style-independent, global-aware regularization on perceptual motion consistency. Our method is able to generate highly stylistic and temporally consistent cartoon videos.
arXiv Detail & Related papers (2022-04-02T07:59:02Z)
- Improving the Perceptual Quality of 2D Animation Interpolation [37.04208600867858]
Traditional 2D animation is labor-intensive, often requiring animators to draw twelve illustrations per second of movement.
Lower framerates result in larger displacements and occlusions, and discrete perceptual elements (e.g., lines and solid-color regions) pose difficulties for texture-oriented convolutional networks.
Previous work tried addressing these issues, but used unscalable methods and focused on pixel-perfect performance.
We build a scalable system more appropriately centered on perceptual quality for this artistic domain.
arXiv Detail & Related papers (2021-11-24T20:51:29Z)
- Going beyond Free Viewpoint: Creating Animatable Volumetric Video of Human Performances [7.7824496657259665]
We present an end-to-end pipeline for the creation of high-quality animatable volumetric video content of human performances.
Semantic enrichment and geometric animation ability are achieved by establishing temporal consistency in the 3D data.
For pose editing, we exploit the captured data as much as possible and kinematically deform the captured frames to fit a desired pose.
arXiv Detail & Related papers (2020-09-02T09:46:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.