CAST: Character labeling in Animation using Self-supervision by Tracking
- URL: http://arxiv.org/abs/2201.07619v1
- Date: Wed, 19 Jan 2022 14:21:43 GMT
- Title: CAST: Character labeling in Animation using Self-supervision by Tracking
- Authors: Oron Nir, Gal Rapoport, Ariel Shamir
- Abstract summary: Cartoons and animation domain videos have very different characteristics compared to real-life images and videos.
Current computer vision and deep-learning solutions often fail on animated content because they were trained on natural images.
We present a method to refine a semantic representation suitable for specific animated content.
- Score: 6.57697269659615
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Cartoons and animation domain videos have very different characteristics
compared to real-life images and videos. In addition, this domain carries a
large variability in styles. Current computer vision and deep-learning
solutions often fail on animated content because they were trained on natural
images. In this paper we present a method to refine a semantic representation
suitable for specific animated content. We first train a neural network on a
large-scale set of animation videos and use the mapping to deep features as an
embedding space. Next, we use self-supervision to refine the representation for
any specific animation style by gathering many examples of animated characters
in this style using multi-object tracking. These examples are used to define
triplets for contrastive loss training. The refined semantic space allows
better clustering of animated characters even when they have diverse
manifestations. Using this space we can build dictionaries of characters in
animation videos and define specialized classifiers for specific stylistic
content (e.g., characters in a specific animation series) with very little user
effort. These classifiers are the basis for automatically labeling characters
in animation videos. We present results on a collection of characters in a
variety of animation styles.
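As a rough illustration of the tracking-based self-supervision described above, the following sketch (not the authors' implementation; the encoder, crop sizes, and the `tracks` structure are placeholder assumptions) samples anchor and positive crops from the same character track and a negative crop from a different track, then refines the embedding with a triplet margin loss:

```python
# Minimal sketch of triplet refinement from multi-object tracking output.
# Assumptions: `backbone` stands in for the network pre-trained on animation
# videos, and `tracks` maps a track id (one tracked character) to a list of
# 3x64x64 character crops harvested from a single animation style.
import random
import torch
import torch.nn as nn

backbone = nn.Sequential(           # placeholder encoder, not the paper's model
    nn.Conv2d(3, 16, 3, stride=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 64),
)

# Toy tracking output: two tracks of random crops (real crops would come
# from a multi-object tracker run over episodes of the target series).
tracks = {0: [torch.rand(3, 64, 64) for _ in range(5)],
          1: [torch.rand(3, 64, 64) for _ in range(5)]}

loss_fn = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

for step in range(10):
    # Anchor/positive: two crops of the same tracked character;
    # negative: a crop from a different track.
    anchor_id, negative_id = random.sample(list(tracks.keys()), 2)
    anchor, positive = random.sample(tracks[anchor_id], 2)
    negative = random.choice(tracks[negative_id])

    embeddings = backbone(torch.stack([anchor, positive, negative]))
    loss = loss_fn(embeddings[0:1], embeddings[1:2], embeddings[2:3])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After refinement, the embeddings of tracked crops could be clustered (e.g., with k-means) to build the per-series character dictionary on which the specialized classifiers are trained.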
Related papers
- MagicAnime: A Hierarchically Annotated, Multimodal and Multitasking Dataset with Benchmarks for Cartoon Animation Generation [2.700983545680755]
Multimodal control is challenging due to the complexity of non-human characters, stylistically diverse motions, and fine-grained emotions. We propose the MagicAnime dataset, a large-scale, hierarchically annotated, and multimodal dataset designed to support multiple video generation tasks. We build a set of multi-modal cartoon animation benchmarks, called MagicAnime-Bench, to support comparison of different methods on these tasks.
arXiv Detail & Related papers (2025-07-27T17:53:00Z)
- FairyGen: Storied Cartoon Video from a Single Child-Drawn Character [15.701180508477679]
We propose FairyGen, an automatic system for generating story-driven cartoon videos from a single child's drawing. Unlike previous storytelling methods, FairyGen explicitly disentangles character modeling from stylized background generation. Our system produces animations that are stylistically faithful, narratively structured, and exhibit natural motion.
arXiv Detail & Related papers (2025-06-26T13:58:16Z)
- AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation [52.655400705690155]
AnimeShooter is a reference-guided multi-shot animation dataset. Story-level annotations provide an overview of the narrative, including the storyline, key scenes, and main character profiles with reference images. Shot-level annotations decompose the story into consecutive shots, each annotated with scene, characters, and both narrative and descriptive visual captions. A separate subset, AnimeShooter-audio, offers synchronized audio tracks for each shot, along with audio descriptions and sound sources.
arXiv Detail & Related papers (2025-06-03T17:55:18Z)
- AniDoc: Animation Creation Made Easier [54.97341104616779]
Our research focuses on reducing the labor costs in the production of 2D animation by harnessing the potential of increasingly powerful AI.
AniDoc emerges as a video line art colorization tool, which automatically converts sketch sequences into colored animations.
Our model exploits correspondence matching as explicit guidance, yielding strong robustness to variations between the reference character and each line art frame.
arXiv Detail & Related papers (2024-12-18T18:59:59Z)
- Animate-X: Universal Character Image Animation with Enhanced Motion Representation [42.73097432203482]
Animate-X is a universal animation framework based on LDM for various character types, including anthropomorphic characters.
We introduce the Pose Indicator, which captures comprehensive motion patterns from the driving video in both implicit and explicit manners.
We also introduce a new Animated Anthropomorphic Benchmark to evaluate the performance of Animate-X on universal and widely applicable animation images.
arXiv Detail & Related papers (2024-10-14T09:06:55Z)
- Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control [77.08568533331206]
Follow-Your-Pose v2 can be trained on noisy open-sourced videos readily available on the internet.
Our approach outperforms state-of-the-art methods by a margin of over 35% across 2 datasets and on 7 metrics.
arXiv Detail & Related papers (2024-06-05T08:03:18Z)
- AniClipart: Clipart Animation with Text-to-Video Priors [28.76809141136148]
We introduce AniClipart, a system that transforms static images into high-quality motion sequences guided by text-to-video priors.
Experimental results show that the proposed AniClipart consistently outperforms existing image-to-video generation models.
arXiv Detail & Related papers (2024-04-18T17:24:28Z)
- Dynamic Typography: Bringing Text to Life via Video Diffusion Prior [73.72522617586593]
We present an automated text animation scheme, termed "Dynamic Typography".
It deforms letters to convey semantic meaning and infuses them with vibrant movements based on user prompts.
Our technique harnesses vector graphics representations and an end-to-end optimization-based framework.
arXiv Detail & Related papers (2024-04-17T17:59:55Z)
- AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment [64.02822911038848]
We present AnimateZoo, a zero-shot diffusion-based video generator to produce animal animations.
The key technique used in AnimateZoo is subject alignment, which includes two steps.
Our model is capable of generating videos characterized by accurate movements, consistent appearance, and high-fidelity frames.
arXiv Detail & Related papers (2024-04-07T12:57:41Z)
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [27.700371215886683]
Diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities.
In this paper, we propose a novel framework tailored for character animation.
By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods.
arXiv Detail & Related papers (2023-11-28T12:27:15Z)
- AnimateAnything: Fine-Grained Open Domain Image Animation with Motion Guidance [13.416296247896042]
We introduce an open domain image animation method that leverages the motion prior of a video diffusion model.
Our approach introduces targeted motion area guidance and motion strength guidance, enabling precise control of the movable area and its motion speed.
We validate the effectiveness of our method through rigorous experiments on an open-domain dataset.
arXiv Detail & Related papers (2023-11-21T03:47:54Z)
- Deep Animation Video Interpolation in the Wild [115.24454577119432]
In this work, we formally define and study the animation video interpolation problem for the first time.
We propose an effective framework, AnimeInterp, with two dedicated modules in a coarse-to-fine manner.
Notably, AnimeInterp shows favorable perceptual quality and robustness for animation scenarios in the wild.
arXiv Detail & Related papers (2021-04-06T13:26:49Z)
- Unpaired Motion Style Transfer from Video to Animation [74.15550388701833]
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation.
We present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels.
Our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion.
arXiv Detail & Related papers (2020-05-12T13:21:27Z)
- First Order Motion Model for Image Animation [90.712718329677]
Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video.
Our framework addresses this problem without using any annotation or prior information about the specific object to animate.
arXiv Detail & Related papers (2020-02-29T07:08:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.