MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying
Motions
- URL: http://arxiv.org/abs/2103.02243v2
- Date: Thu, 4 Mar 2021 08:55:59 GMT
- Title: MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying
Motions
- Authors: Haixu Wu, Zhiyu Yao, Mingsheng Long, Jianmin Wang
- Abstract summary: This paper tackles video prediction from a new dimension of predicting spacetime-varying motions that are incessantly changing across both space and time.
We propose the MotionRNN framework, which can capture the complex variations within motions and adapt to spacetime-varying scenarios.
- Score: 70.30211294212603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper tackles video prediction from a new dimension of predicting
spacetime-varying motions that are incessantly changing across both space and
time. Prior methods mainly capture the temporal state transitions but overlook
the complex spatiotemporal variations of the motion itself, which makes it
difficult for them to adapt to ever-changing motions. We observe that physical world
motions can be decomposed into transient variation and motion trend, while the
latter can be regarded as the accumulation of previous motions. Thus,
simultaneously capturing the transient variation and the motion trend is the
key to making spacetime-varying motions more predictable. Based on these
observations, we propose the MotionRNN framework, which can capture the complex
variations within motions and adapt to spacetime-varying scenarios. MotionRNN
has two main contributions. The first is that we design the MotionGRU unit,
which can model the transient variation and motion trend in a unified way. The
second is that we apply the MotionGRU to RNN-based predictive models and
present a new flexible video prediction architecture with a Motion Highway
that can significantly improve the ability to predict changeable motions and
avoid motion vanishing for stacked multiple-layer predictive models. With high
flexibility, this framework can adapt to a series of models for deterministic
spatiotemporal prediction. Our MotionRNN can yield significant improvements on
three challenging benchmarks for video prediction with spacetime-varying
motions.
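
The abstract describes the core idea in prose only. Below is a minimal PyTorch-style sketch of how a MotionGRU-like cell could decompose motion into a gated transient variation and an accumulated motion trend, with a residual-style Motion Highway; it is an illustrative reading of the abstract, not the authors' released implementation. The class name MotionGRUSketch, the convolutional gating, and the momentum-style trend accumulation (alpha) are all assumptions made for this sketch.

```python
# Hypothetical sketch of the MotionGRU idea from the abstract: motion is
# decomposed into a transient variation and a motion trend, where the trend
# is an accumulation of previous motions. All names and design details here
# are assumptions for illustration, NOT the paper's actual equations or code.
import torch
import torch.nn as nn

class MotionGRUSketch(nn.Module):
    def __init__(self, channels: int, alpha: float = 0.5):
        super().__init__()
        # Convolutional estimates of the transient variation and an update gate.
        self.transient = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)
        self.update_gate = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)
        self.alpha = alpha  # momentum for accumulating the motion trend (assumed)

    def forward(self, h_prev: torch.Tensor, trend_prev: torch.Tensor):
        # Transient variation conditioned on the previous hidden state and trend.
        x = torch.cat([h_prev, trend_prev], dim=1)
        variation = torch.tanh(self.transient(x))
        gate = torch.sigmoid(self.update_gate(x))
        motion = gate * variation + (1.0 - gate) * trend_prev
        # Motion trend as a running accumulation of previous motions.
        trend = self.alpha * trend_prev + (1.0 - self.alpha) * motion
        # "Motion Highway" in spirit: carry the previous hidden state forward
        # alongside the newly captured motion, so motion information does not
        # vanish when many predictive layers are stacked.
        h_next = h_prev + motion
        return h_next, trend

# Usage on dummy feature maps (batch=1, 16 channels, 32x32 spatial grid).
if __name__ == "__main__":
    cell = MotionGRUSketch(channels=16)
    h = torch.zeros(1, 16, 32, 32)
    trend = torch.zeros(1, 16, 32, 32)
    h, trend = cell(h, trend)
    print(h.shape, trend.shape)
```

In this sketch the cell would be stacked inside an existing RNN-based predictive model, with the trend tensor passed along the temporal axis; the actual MotionRNN wiring is described in the paper itself.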
Related papers
- Generalizable Implicit Motion Modeling for Video Frame Interpolation [51.966062283735596]
Motion is critical in flow-based Video Frame Interpolation (VFI)
Generalizable Implicit Motion Modeling (GIMM) is a novel and effective approach to motion modeling for VFI.
Our GIMM can be smoothly integrated with existing flow-based VFI works without further modifications.
arXiv Detail & Related papers (2024-07-11T17:13:15Z)
- ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking [4.250337979548885]
We propose a motion-based MOT approach with an enhanced temporal motion predictor, ETTrack.
Specifically, the motion predictor integrates a transformer model and a Temporal Convolutional Network (TCN) to capture short-term and long-term motion patterns.
We show ETTrack achieves a competitive performance compared with state-of-the-art trackers on DanceTrack and SportsMOT.
arXiv Detail & Related papers (2024-05-24T17:51:33Z)
- Spectral Motion Alignment for Video Motion Transfer using Diffusion Models [54.32923808964701]
Spectral Motion Alignment (SMA) is a framework that refines and aligns motion vectors using Fourier and wavelet transforms.
SMA learns motion patterns by incorporating frequency-domain regularization, facilitating the learning of whole-frame global motion dynamics.
Extensive experiments demonstrate SMA's efficacy in improving motion transfer while maintaining computational efficiency and compatibility across various video customization frameworks.
arXiv Detail & Related papers (2024-03-22T14:47:18Z)
- Motion-I2V: Consistent and Controllable Image-to-Video Generation with
Explicit Motion Modeling [62.19142543520805]
Motion-I2V is a framework for consistent and controllable image-to-video generation.
It factorizes I2V into two stages with explicit motion modeling.
Motion-I2V's second stage naturally supports zero-shot video-to-video translation.
arXiv Detail & Related papers (2024-01-29T09:06:43Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- Learning Variational Motion Prior for Video-based Motion Capture [31.79649766268877]
We present a novel variational motion prior (VMP) learning approach for video-based motion capture.
Our framework can effectively reduce temporal jittering and failure modes in frame-wise pose estimation.
Experiments over both public datasets and in-the-wild videos have demonstrated the efficacy and generalization capability of our framework.
arXiv Detail & Related papers (2022-10-27T02:45:48Z)
- NeMF: Neural Motion Fields for Kinematic Animation [6.570955948572252]
We express the vast motion space as a continuous function over time, hence the name Neural Motion Fields (NeMF).
We use a neural network to learn this function for miscellaneous sets of motions.
We train our model on a diverse human motion dataset and a quadruped dataset to prove its versatility.
arXiv Detail & Related papers (2022-06-04T05:53:27Z)
- Weakly-supervised Action Transition Learning for Stochastic Human Motion
Prediction [81.94175022575966]
We introduce the task of action-driven human motion prediction.
It aims to predict multiple plausible future motions given a sequence of action labels and a short motion history.
arXiv Detail & Related papers (2022-05-31T08:38:07Z)