Improving the Perceptual Quality of 2D Animation Interpolation
- URL: http://arxiv.org/abs/2111.12792v1
- Date: Wed, 24 Nov 2021 20:51:29 GMT
- Title: Improving the Perceptual Quality of 2D Animation Interpolation
- Authors: Shuhong Chen, Matthias Zwicker
- Abstract summary: Traditional 2D animation is labor-intensive, often requiring animators to draw twelve illustrations per second of movement.
Lower framerates result in larger displacements and occlusions, and discrete perceptual elements (e.g. lines and solid-color regions) pose difficulties for texture-oriented convolutional networks.
Previous work tried addressing these issues, but used unscalable methods and focused on pixel-perfect performance.
We build a scalable system more appropriately centered on perceptual quality for this artistic domain.
- Score: 37.04208600867858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional 2D animation is labor-intensive, often requiring animators to
manually draw twelve illustrations per second of movement. While automatic
frame interpolation may ease this burden, the artistic effects inherent to 2D
animation make video synthesis particularly challenging compared to the
photorealistic domain. Lower framerates result in larger displacements and
occlusions, discrete perceptual elements (e.g. lines and solid-color regions)
pose difficulties for texture-oriented convolutional networks, and exaggerated
nonlinear movements hinder training data collection. Previous work tried
addressing these issues, but used unscalable methods and focused on
pixel-perfect performance. In contrast, we build a scalable system more
appropriately centered on perceptual quality for this artistic domain. Firstly,
we propose a lightweight architecture with a simple yet effective
occlusion-inpainting technique to improve convergence on perceptual metrics
with fewer trainable parameters. Secondly, we design a novel auxiliary module
that leverages the Euclidean distance transform to improve the preservation of
key line and region structures. Thirdly, we automatically double the existing
manually-collected dataset for this task by quantitatively filtering out
movement nonlinearities, allowing us to improve model generalization. Finally,
we establish LPIPS and chamfer distance as strongly preferable to PSNR and SSIM
through a user study, validating our system's emphasis on perceptual quality in
the 2D animation domain.
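The distance-transform module is concrete enough to sketch. Below is a minimal, hypothetical version of such an auxiliary loss, assuming binary line maps have already been extracted from the frames; the function names, the soft line-map input, and the distance weighting are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a Euclidean-distance-transform auxiliary loss.
# Assumption: ground-truth line masks are available; the weighting
# scheme here is illustrative, not the paper's exact module.
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt

def edt_map(line_mask: np.ndarray) -> np.ndarray:
    """Per-pixel Euclidean distance to the nearest line pixel."""
    # distance_transform_edt measures distance to the nearest zero,
    # so invert the mask: line pixels become zeros.
    return distance_transform_edt(1 - line_mask.astype(np.uint8))

def edt_loss(pred_lines: torch.Tensor, gt_mask: np.ndarray) -> torch.Tensor:
    """Penalize predicted line mass that falls far from true lines.

    pred_lines: (H, W) soft line map in [0, 1] from the network.
    gt_mask:    (H, W) binary ground-truth line mask.
    """
    dist = torch.from_numpy(edt_map(gt_mask)).float()
    # Weight each predicted line pixel by its distance to the
    # nearest ground-truth line, then average over the image.
    return (pred_lines * dist).mean()
```

Added as a main-loss term, such a penalty pushes predicted strokes toward the ground-truth line structure rather than letting them blur away, which is the stated goal of the auxiliary module.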
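The dataset-doubling step can likewise be sketched. The abstract only says that triplets are filtered "quantitatively" for movement nonlinearities, so the criterion below, comparing the observed frame-0-to-1 flow against half of the frame-0-to-2 flow, is an assumed proxy, and the threshold is made up for illustration.

```python
# A hedged sketch of filtering frame triplets by motion linearity.
# Assumption: under linear motion, flow(f0 -> f1) should be roughly
# half of flow(f0 -> f2). This proxy and the threshold are not from
# the paper.
import numpy as np
import cv2

def linearity_score(f0: np.ndarray, f1: np.ndarray, f2: np.ndarray) -> float:
    """Lower is more linear. Frames are grayscale uint8 images."""
    flow01 = cv2.calcOpticalFlowFarneback(f0, f1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    flow02 = cv2.calcOpticalFlowFarneback(f0, f2, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    # Deviation of the observed midpoint flow from the linear
    # prediction, normalized by overall motion magnitude so large
    # but linear movements are not penalized.
    deviation = np.linalg.norm(flow01 - 0.5 * flow02, axis=-1)
    magnitude = np.linalg.norm(flow02, axis=-1) + 1e-6
    return float((deviation / magnitude).mean())

def keep_triplet(f0, f1, f2, threshold=0.25):
    # threshold is an illustrative value, not from the paper
    return linearity_score(f0, f1, f2) < threshold
```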
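Finally, the two metrics the user study favors are both straightforward to reproduce. LPIPS is available from the `lpips` PyPI package; the chamfer distance below is a generic symmetric form over line-pixel coordinates, and the line extraction feeding `mask0`/`mask1` is left as an assumption.

```python
# A minimal sketch of the two perceptual metrics favored by the
# user study: LPIPS (via the `lpips` package) and a generic
# symmetric chamfer distance over binary line masks.
import torch
import lpips

lpips_fn = lpips.LPIPS(net='alex')  # AlexNet-based perceptual distance

def lpips_score(img0: torch.Tensor, img1: torch.Tensor) -> float:
    """Images: (1, 3, H, W) tensors scaled to [-1, 1]."""
    with torch.no_grad():
        return lpips_fn(img0, img1).item()

def chamfer_distance(mask0: torch.Tensor, mask1: torch.Tensor) -> float:
    """Symmetric chamfer distance between binary line masks (H, W).

    Assumes both masks contain at least one nonzero (line) pixel.
    """
    pts0 = mask0.nonzero().float()  # (N, 2) line-pixel coordinates
    pts1 = mask1.nonzero().float()  # (M, 2)
    d = torch.cdist(pts0.unsqueeze(0), pts1.unsqueeze(0)).squeeze(0)  # (N, M)
    # Average nearest-neighbor distance in both directions.
    return 0.5 * (d.min(dim=1).values.mean() + d.min(dim=0).values.mean()).item()
```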
Related papers
- See-through: Single-image Layer Decomposition for Anime Characters [11.629918493740263]
We introduce a framework that automates the transformation of static anime illustrations into manipulatable 2.5D models. Our approach decomposes a single image into fully inpainted, semantically distinct layers with inferred drawing orders. We demonstrate that our approach yields high-fidelity, manipulatable models suitable for professional, real-time animation applications.
arXiv Detail & Related papers (2026-02-03T17:12:36Z) - Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding [54.859943475818234]
We present Motion4D, a novel framework that integrates 2D priors from foundation models into a unified 4D Gaussian Splatting representation. Our method features a two-part iterative optimization framework: 1) Sequential optimization, which updates motion and semantic fields in consecutive stages to maintain local consistency, and 2) Global optimization, which jointly refines all attributes for long-term coherence. Our method significantly outperforms both 2D foundation models and existing 3D-based approaches across diverse scene understanding tasks, including point-based tracking, video object segmentation, and novel view synthesis.
arXiv Detail & Related papers (2025-12-03T09:32:56Z) - 4-Doodle: Text to 3D Sketches that Move! [60.89021458068987]
4-Doodle is the first training-free framework for generating dynamic 3D sketches from text. Our method produces temporally realistic and structurally stable 3D sketch animations, outperforming existing baselines in both fidelity and controllability.
arXiv Detail & Related papers (2025-10-29T09:33:29Z) - Puppeteer: Rig and Animate Your 3D Models [105.11046762553121]
Puppeteer is a comprehensive framework that addresses both automatic rigging and animation for diverse 3D objects. Our system first predicts plausible skeletal structures via an auto-regressive transformer. It then infers skinning weights via an attention-based architecture.
arXiv Detail & Related papers (2025-08-14T17:59:31Z) - Occlusion-robust Stylization for Drawing-based 3D Animation [20.793887576117527]
We propose an Occlusion-robust Stylization Framework (OSF) for drawing-based 3D animation. OSF operates in a single run instead of the previous two-stage method, achieving 2.4x faster inference while using 2.1x less memory.
arXiv Detail & Related papers (2025-08-01T07:52:07Z) - Every Painting Awakened: A Training-free Framework for Painting-to-Animation Generation [25.834500552609136]
We introduce a training-free framework specifically designed to bring real-world static paintings to life through image-to-video (I2V) synthesis.
Existing I2V methods, primarily trained on natural video datasets, often struggle to generate dynamic outputs from static paintings.
Our framework enables plug-and-play integration with existing I2V methods, making it an ideal solution for animating real-world paintings.
arXiv Detail & Related papers (2025-03-31T05:25:49Z) - One-shot Human Motion Transfer via Occlusion-Robust Flow Prediction and Neural Texturing [21.613055849276385]
We propose a unified framework that combines multi-scale feature warping and neural texture mapping to recover better 2D appearance and 2.5D geometry.
Our model takes advantage of multiple modalities by jointly training and fusing them, which allows it to learn robust neural texture features that cope with geometric errors.
arXiv Detail & Related papers (2024-12-09T03:14:40Z) - Thin-Plate Spline-based Interpolation for Animation Line Inbetweening [54.69811179222127]
Chamfer Distance (CD) is commonly adopted for evaluating inbetweening performance.
We propose a simple yet effective method for animation line inbetweening that adopts thin-plate spline-based transformation.
Our method outperforms existing approaches by delivering high-quality results with enhanced fluidity.
arXiv Detail & Related papers (2024-08-17T08:05:31Z) - An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video [11.293897932762809]
Action recognition, an essential component of computer vision, plays a pivotal role in multiple applications.
CNNs suffer performance declines when trained with discontinuous video frames, which is a frequent scenario in real-world settings.
To overcome this issue, we introduce the 4A pipeline, which employs a series of sophisticated techniques.
arXiv Detail & Related papers (2024-04-10T04:59:51Z) - Bridging the Gap: Sketch-Aware Interpolation Network for High-Quality Animation Sketch Inbetweening [58.09847349781176]
We propose a novel deep learning method, the Sketch-Aware Interpolation Network (SAIN).
This approach incorporates multi-level guidance that formulates region-level correspondence, stroke-level correspondence and pixel-level dynamics.
A multi-stream U-Transformer is then devised to characterize sketch inbetweening patterns using these multi-level guides through the integration of self- and cross-attention mechanisms.
arXiv Detail & Related papers (2023-08-25T09:51:03Z) - RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of 3D foreground objects as a 2D self-occlusion map.
We show that our representation map not only allows us to enhance the image quality but also to model temporally coherent complex shadow effects.
arXiv Detail & Related papers (2022-05-14T05:35:35Z) - Image2Gif: Generating Continuous Realistic Animations with Warping NODEs [0.8218964199015377]
We propose a new framework, Warping Neural ODE, for generating a smooth animation (video frame) in a continuous manner.
This allows us to achieve the smoothness and the realism of an animation with infinitely small time steps between the frames.
We show the application of our work to generating an animation given two frames, under different training settings, including a Generative Adversarial Network (GAN) and an $L$ loss.
arXiv Detail & Related papers (2022-05-09T18:39:47Z) - Decoupled Spatial-Temporal Transformer for Video Inpainting [77.8621673355983]
Video inpainting aims to fill the given holes with realistic appearance but is still a challenging task even with prosperous deep learning approaches.
Recent works introduce the promising Transformer architecture into deep video inpainting and achieve better performance.
We propose a Decoupled Spatial-Temporal Transformer (DSTT) for improving video inpainting with exceptional efficiency.
arXiv Detail & Related papers (2021-04-14T05:47:46Z) - Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)