Single-Shot Motion Completion with Transformer
- URL: http://arxiv.org/abs/2103.00776v1
- Date: Mon, 1 Mar 2021 06:00:17 GMT
- Title: Single-Shot Motion Completion with Transformer
- Authors: Yinglin Duan (1), Tianyang Shi (1), Zhengxia Zou (2), Yenan Lin (3),
Zhehui Qian (3), Bohan Zhang (3), Yi Yuan (1) ((1) NetEase Fuxi AI Lab, (2)
University of Michigan, (3) NetEase)
- Abstract summary: We propose a simple but effective method to solve multiple motion completion problems under a unified framework.
Inspired by the recent great success of attention-based models, we consider completion as a sequence-to-sequence prediction problem.
Our method can run in a non-autoregressive manner and predict multiple missing frames within a single forward propagation in real time.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motion completion is a challenging and long-discussed problem, which is of
great significance in film and game applications. For different motion
completion scenarios (in-betweening, in-filling, and blending), most previous
methods deal with the completion problems with case-by-case designs. In this
work, we propose a simple but effective method that solves multiple motion
completion problems under a unified framework and achieves new state-of-the-art
accuracy under multiple evaluation settings. Inspired by the recent great
success of attention-based models, we consider completion as a
sequence-to-sequence prediction problem. Our method consists of two modules: a standard
transformer encoder with self-attention that learns long-range dependencies of
input motions, and a trainable mixture embedding module that models temporal
information and discriminates key-frames. Our method can run in a
non-autoregressive manner and predict multiple missing frames within a single
forward propagation in real time. We finally show the effectiveness of our
method in music-dance applications.
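As a rough illustration of this design, below is a minimal PyTorch sketch of a single-shot, non-autoregressive completion model: observed keyframes and zeroed-out missing frames pass through a standard transformer encoder, and one forward pass predicts every frame. The pose dimension, the learned position-plus-keyframe embedding used here as a stand-in for the paper's trainable mixture embedding module, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of single-shot (non-autoregressive) motion completion.
# NOTE: shapes, names, and the embedding scheme are assumptions for
# illustration; they approximate, not reproduce, the paper's design.
import torch
import torch.nn as nn

class SingleShotCompletion(nn.Module):
    def __init__(self, pose_dim=63, d_model=256, n_heads=8, n_layers=6, max_len=128):
        super().__init__()
        self.input_proj = nn.Linear(pose_dim, d_model)
        # Stand-in for the trainable mixture embedding: a learned vector per
        # time step plus an indicator embedding that discriminates keyframes.
        self.pos_embed = nn.Parameter(torch.zeros(max_len, d_model))
        self.key_embed = nn.Embedding(2, d_model)  # 0 = missing, 1 = keyframe
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, poses, keyframe_mask):
        # poses: (B, T, pose_dim) with missing frames zeroed out
        # keyframe_mask: (B, T) long tensor, 1 where a frame is observed
        T = poses.size(1)
        x = self.input_proj(poses) + self.pos_embed[:T] + self.key_embed(keyframe_mask)
        return self.head(self.encoder(x))  # all frames predicted at once

model = SingleShotCompletion()
poses = torch.randn(2, 64, 63)
mask = torch.zeros(2, 64, dtype=torch.long)
mask[:, [0, 63]] = 1  # in-betweening: only first and last frames are keyframes
completed = model(poses * mask.unsqueeze(-1), mask)  # (2, 64, 63), single pass
```

Because all missing frames are predicted in one forward pass rather than frame by frame, inference cost does not grow with gap length, which is what makes the real-time, non-autoregressive claim above plausible.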
Related papers
- Human Motion Synthesis: A Diffusion Approach for Motion Stitching and In-Betweening [2.5165775267615205]
We propose a diffusion model with a transformer-based denoiser to generate realistic human motion.
Our method demonstrated strong performance in generating in-betweening sequences.
We evaluate our method using quantitative metrics such as Fréchet Inception Distance (FID), Diversity, and Multimodality.
arXiv Detail & Related papers (2024-09-10T18:02:32Z)
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
- MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation [19.999239668765885]
MotionMix is a weakly-supervised diffusion model that leverages both noisy and unannotated motion sequences.
Our framework consistently achieves state-of-the-art performances on text-to-motion, action-to-motion, and music-to-dance tasks.
arXiv Detail & Related papers (2024-01-20T04:58:06Z)
- Motion Flow Matching for Human Motion Synthesis and Editing [75.13665467944314]
We propose Motion Flow Matching, a novel generative model for human motion generation featuring efficient sampling and effectiveness in motion editing applications.
Our method reduces the sampling complexity from a thousand steps in previous diffusion models to just ten, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks (a minimal sampling sketch appears after this list).
arXiv Detail & Related papers (2023-12-14T12:57:35Z)
- Shuffled Autoregression For Motion Interpolation [53.61556200049156]
This work aims to provide a deep-learning solution for the motion interpolation task.
We propose a novel framework, referred to as Shuffled AutoRegression, which expands autoregression to generate in an arbitrary (shuffled) order.
We also propose an approach to constructing a particular kind of dependency graph, with three stages assembled into an end-to-end spatial-temporal motion Transformer.
arXiv Detail & Related papers (2023-06-10T07:14:59Z)
- HumanMAC: Masked Motion Completion for Human Motion Prediction [62.279925754717674]
Human motion prediction is a classical problem in computer vision and computer graphics.
Previous efforts achieve great empirical performance based on an encoding-decoding style.
In this paper, we propose a novel framework from a new perspective.
arXiv Detail & Related papers (2023-02-07T18:34:59Z)
- Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance [83.25826307000717]
We study the challenging problem of recovering detailed motion from a single motion-blurred image.
Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity for each region.
In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions all in sharp detail.
arXiv Detail & Related papers (2022-07-20T18:05:53Z)
- Real-time Controllable Motion Transition for Characters [14.88407656218885]
Real-time in-between motion generation is universally required in games and highly desirable in existing animation pipelines.
Our approach consists of two key components: motion manifold and conditional transitioning.
We show that our method is able to generate high-quality motions measured under multiple metrics.
arXiv Detail & Related papers (2022-05-05T10:02:54Z)
- Learning Salient Boundary Feature for Anchor-free Temporal Action Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding.
We propose the first purely anchor-free temporal localization method.
Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z)
- Convolutional Autoencoders for Human Motion Infilling [37.16099544563645]
Motion infilling aims to complete the missing gap between a given start and end sequence, such that the filled-in poses plausibly continue the start sequence and naturally transition into the end sequence.
We show that a single model can be used to create natural transitions between different types of activities.
Our method is not only able to fill in entire missing frames, but it can also be used to complete gaps where partial poses are available.
arXiv Detail & Related papers (2020-10-22T08:45:38Z)
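Since the Motion Flow Matching entry above contrasts thousand-step diffusion sampling with ten-step sampling, here is a generic few-step flow-matching sampler that sketches the underlying idea: a learned velocity field v(x, t) is integrated from noise to data with a coarse Euler scheme. The velocity network, tensor shapes, and step count are illustrative assumptions, not that paper's actual model.

```python
# Generic few-step flow-matching sampling: integrate dx/dt = v(x, t)
# from Gaussian noise at t=0 to a motion sample at t=1 in ~10 Euler steps,
# instead of ~1000 denoising steps. All names/shapes here are stand-ins.
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Hypothetical velocity network v(x, t) for motion sequences."""
    def __init__(self, pose_dim=63, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, pose_dim)
        )

    def forward(self, x, t):
        # x: (B, T, pose_dim); t: scalar in [0, 1], appended as an extra feature
        t_feat = torch.full_like(x[..., :1], t)
        return self.net(torch.cat([x, t_feat], dim=-1))

@torch.no_grad()
def sample(v_field, shape, n_steps=10):
    """Euler integration of the learned flow from noise (t=0) to sample (t=1)."""
    x = torch.randn(shape)       # start from Gaussian noise
    dt = 1.0 / n_steps
    for i in range(n_steps):     # ten coarse steps replace a long denoising chain
        x = x + v_field(x, i * dt) * dt
    return x

motion = sample(VelocityField(), shape=(1, 64, 63))  # (B, T, pose_dim)
```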
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above (including all listed content) and is not responsible for any consequences of its use.