SILK: Smooth InterpoLation frameworK for motion in-betweening: A Simplified Computational Approach
- URL: http://arxiv.org/abs/2506.09075v1
- Date: Mon, 09 Jun 2025 19:26:27 GMT
- Title: SILK: Smooth InterpoLation frameworK for motion in-betweening: A Simplified Computational Approach
- Authors: Elly Akhoundi, Hung Yu Ling, Anup Anand Deshmukh, Judith Butepage
- Abstract summary: Motion in-betweening is a crucial tool for animators, enabling control over pose-level details in each keyframe. Recent machine learning solutions for motion in-betweening rely on complex models, incorporating skeleton-aware architectures or requiring multiple modules and training steps. We introduce a simple yet effective Transformer-based framework, employing a single encoder to synthesize realistic motions for motion in-betweening tasks.
- Score: 1.7812314225208412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motion in-betweening is a crucial tool for animators, enabling intricate control over pose-level details in each keyframe. Recent machine learning solutions for motion in-betweening rely on complex models, incorporating skeleton-aware architectures or requiring multiple modules and training steps. In this work, we introduce a simple yet effective Transformer-based framework, employing a single Transformer encoder to synthesize realistic motions for motion in-betweening tasks. We find that data modeling choices play a significant role in improving in-betweening performance. Among others, we show that increasing data volume can yield equivalent or improved motion transitions, that the choice of pose representation is vital for achieving high-quality results, and that incorporating velocity input features enhances animation performance. These findings challenge the assumption that model complexity is the primary determinant of animation quality and provide insights into a more data-centric approach to motion interpolation. Additional videos and supplementary material are available at https://silk-paper.github.io.
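The abstract's core idea, a single Transformer encoder fed with pose and velocity features, conditioned on sparse keyframes, could be sketched as follows. This is a hypothetical illustration, not the paper's actual architecture: layer sizes, the feature layout, and the keyframe-masking scheme are assumptions.

```python
# Hypothetical sketch of a single-encoder motion in-betweening model in the
# spirit of SILK. Dimensions and masking scheme are assumptions, not the
# paper's reported configuration.
import torch
import torch.nn as nn

class InBetweenEncoder(nn.Module):
    def __init__(self, pose_dim=66, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        # Per-frame input: pose features, velocity features (frame-to-frame
        # differences), and a flag marking which frames are known keyframes.
        self.embed = nn.Linear(2 * pose_dim + 1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, poses, keyframe_mask):
        # poses: (batch, frames, pose_dim); keyframe_mask: (batch, frames, 1)
        # Velocity features as finite differences, padded at the first frame.
        vel = torch.diff(poses, dim=1, prepend=poses[:, :1])
        x = torch.cat(
            [poses * keyframe_mask, vel * keyframe_mask, keyframe_mask], dim=-1
        )
        return self.head(self.encoder(self.embed(x)))

model = InBetweenEncoder()
poses = torch.randn(2, 30, 66)   # two clips, 30 frames, 22 joints x 3
mask = torch.zeros(2, 30, 1)
mask[:, [0, 29]] = 1.0           # only the first and last keyframes are known
out = model(poses, mask)         # (2, 30, 66): full in-betweened sequences
```

The encoder sees all frames at once, so in-between poses are predicted jointly rather than autoregressively; the velocity features reflect the abstract's finding that velocity inputs improve animation quality.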
Related papers
- A Unified Transformer-Based Framework with Pretraining For Whole Body Grasping Motion Generation [6.465569743109499]
We present a novel transformer-based framework for whole-body grasping. It addresses pose generation and motion infilling, enabling realistic and stable object interactions. Our method outperforms state-of-the-art baselines in terms of coherence, stability, and visual realism.
arXiv Detail & Related papers (2025-07-01T11:18:23Z) - AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models [5.224806515926022]
We introduce AnyMoLe, a novel method to generate motion in-between frames for arbitrary characters without external data. Our approach employs a two-stage frame generation process to enhance contextual understanding.
arXiv Detail & Related papers (2025-03-11T13:28:59Z) - Instance-Level Moving Object Segmentation from a Single Image with Events [84.12761042512452]
Moving object segmentation plays a crucial role in understanding dynamic scenes involving multiple moving objects. Previous methods encounter difficulties in distinguishing whether pixel displacements of an object are caused by camera motion or object motion. Recent advances exploit the motion sensitivity of novel event cameras to counter conventional images' inadequate motion modeling capabilities. We propose the first instance-level moving object segmentation framework that integrates complementary texture and motion cues.
arXiv Detail & Related papers (2025-02-18T15:56:46Z) - Motion-Aware Generative Frame Interpolation [23.380470636851022]
Flow-based frame interpolation methods ensure motion stability through estimated intermediate flow but often introduce severe artifacts in complex motion regions. Recent generative approaches, boosted by large-scale pre-trained video generation models, show promise in handling intricate scenes. We propose Motion-aware Generative frame interpolation (MoG), which synergizes intermediate flow guidance with generative capacities to enhance fidelity.
arXiv Detail & Related papers (2025-01-07T11:03:43Z) - Thin-Plate Spline-based Interpolation for Animation Line Inbetweening [54.69811179222127]
Chamfer Distance (CD) is commonly adopted for evaluating inbetweening performance.
We propose a simple yet effective method for animation line inbetweening that adopts thin-plate spline-based transformation.
Our method outperforms existing approaches by delivering high-quality results with enhanced fluidity.
arXiv Detail & Related papers (2024-08-17T08:05:31Z) - Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics [67.97235923372035]
We present Puppet-Master, an interactive video generative model that can serve as a motion prior for part-level dynamics.
At test time, given a single image and a sparse set of motion trajectories, Puppet-Master can synthesize a video depicting realistic part-level motion faithful to the given drag interactions.
arXiv Detail & Related papers (2024-08-08T17:59:38Z) - Render In-between: Motion Guided Video Synthesis for Action Interpolation [53.43607872972194]
We propose a motion-guided frame-upsampling framework that is capable of producing realistic human motion and appearance.
A novel motion model is trained to infer the non-linear skeletal motion between frames by leveraging a large-scale motion-capture dataset.
Our pipeline requires only low-frame-rate videos and unpaired human motion data for training; no high-frame-rate videos are needed.
arXiv Detail & Related papers (2021-11-01T15:32:51Z) - EAN: Event Adaptive Network for Enhanced Action Recognition [66.81780707955852]
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate the spatial-temporal kernels of dynamic-scale to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer.
arXiv Detail & Related papers (2021-07-22T15:57:18Z) - Robust Motion In-betweening [17.473287573543065]
We present a novel, robust transition generation technique that can serve as a new tool for 3D animators.
The system synthesizes high-quality motions that use temporally-sparse keyframes as animation constraints.
We present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios.
arXiv Detail & Related papers (2021-02-09T16:52:45Z) - Motion-Attentive Transition for Zero-Shot Video Object Segmentation [99.44383412488703]
We present a Motion-Attentive Transition Network (MATNet) for zero-shot object segmentation.
An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder.
In this way, the encoder becomes deeply interleaved, allowing for close hierarchical interactions between object motion and appearance.
arXiv Detail & Related papers (2020-03-09T16:58:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.