Related papers: JNMR: Joint Non-linear Motion Regression for Video Frame Interpolation

URL: http://arxiv.org/abs/2206.04231v3
Date: Sun, 10 Sep 2023 05:16:07 GMT
Title: JNMR: Joint Non-linear Motion Regression for Video Frame Interpolation
Authors: Meiqin Liu, Chenming Xu, Chao Yao, Chunyu Lin, and Yao Zhao
Abstract summary: Video frame interpolation (VFI) aims to generate frames by warping learnable motions from bidirectional historical references. We reformulate VFI as a Joint Non-linear Motion Regression (JNMR) strategy to model complicated inter-frame motions. Experiments show that joint motion regression yields a significant improvement over state-of-the-art methods.
Score: 47.123769305867775
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Video frame interpolation (VFI) aims to generate predictive frames by warping learnable motions from the bidirectional historical references. Most existing works utilize a spatio-temporal semantic information extractor to realize motion estimation and interpolation modeling. However, they insufficiently consider the real mechanistic rationality of generated middle motions. In this paper, we reformulate VFI as a Joint Non-linear Motion Regression (JNMR) strategy to model the complicated inter-frame motions. Specifically, the motion trajectory between the target frame and the multiple reference frames is regressed by a temporal concatenation of multi-stage quadratic models. ConvLSTM is adopted to construct this joint distribution of complete motions in the temporal dimension. Moreover, the feature learning network is designed to optimize for the joint regression modeling. A coarse-to-fine synthesis enhancement module is also employed to learn visual dynamics at different resolutions through repetitive regression and interpolation. Experimental results on VFI show the effectiveness and significant improvement of joint motion regression compared with state-of-the-art methods. The code is available at https://github.com/ruhig6/JNMR.
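
The abstract describes regressing motion with quadratic models instead of assuming linear motion. As a rough illustration only, the sketch below fits a single constant-acceleration (quadratic) trajectory to one pixel's displacements toward the previous and next reference frames. It is not the JNMR implementation: the function name, the single-stage model, and the frame times (-1, 0, +1) are assumptions made for this example, and the paper's multi-stage regression, ConvLSTM, and synthesis modules are not modeled.

```python
def quadratic_displacement(disp_prev, disp_next, t):
    """Displacement of one pixel from frame 0 to time t (hypothetical helper).

    Assumes constant acceleration: x(t) = v*t + 0.5*a*t^2, fitted so that
    x(-1) == disp_prev and x(+1) == disp_next. Each argument is a (dx, dy)
    tuple measured from the pixel's position in frame 0.
    """
    result = []
    for d_prev, d_next in zip(disp_prev, disp_next):
        velocity = (d_next - d_prev) / 2.0      # v, from x(1) - x(-1) = 2v
        acceleration = d_next + d_prev          # a, from x(1) + x(-1) = a
        result.append(velocity * t + 0.5 * acceleration * t * t)
    return tuple(result)


def linear_displacement(disp_next, t):
    """Baseline for comparison: constant velocity toward the next frame."""
    return tuple(d * t for d in disp_next)


# A pixel that accelerates along x: it moved 1.5 px "backwards" to reach the
# previous frame and moves 2.5 px forwards to reach the next frame.
disp_prev = (-1.5, 0.0)
disp_next = (2.5, 0.0)

quad_mid = quadratic_displacement(disp_prev, disp_next, 0.5)
lin_mid = linear_displacement(disp_next, 0.5)
print(quad_mid, lin_mid)
```

With these example values the quadratic estimate at the midpoint differs from the linear one, which is the kind of discrepancy a non-linear motion model is meant to capture.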

Related papers

ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer [58.49950218437718]
We present ReCoM, an efficient framework for generating high-fidelity and generalizable human body motions synchronized with speech. The core innovation lies in the Recurrent Embedded Transformer (RET), which integrates Dynamic Embedding Regularization (DER) into a Vision Transformer (ViT) core architecture. To enhance model robustness, we incorporate the proposed DER strategy, which equips the model with dual capabilities of noise resistance and cross-domain generalization.
arXiv Detail & Related papers (2025-03-27T16:39:40Z)
Generalizable Implicit Motion Modeling for Video Frame Interpolation [51.966062283735596]
Motion is critical in flow-based Video Frame Interpolation (VFI). We introduce Generalizable Implicit Motion Modeling (GIMM), a novel and effective approach to motion modeling for VFI. Our GIMM can be easily integrated with existing flow-based VFI works by supplying accurately modeled motion.
arXiv Detail & Related papers (2024-07-11T17:13:15Z)
Motion-aware Latent Diffusion Models for Video Frame Interpolation [51.78737270917301]
Motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity. We propose a novel diffusion framework, motion-aware latent diffusion models (MADiff). Our method achieves state-of-the-art performance, significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-04-21T05:09:56Z)
Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames. It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z)
Spatial-Temporal Transformer based Video Compression Framework [44.723459144708286]
We propose a novel Spatial-Temporal Transformer based Video Compression (STT-VC) framework. It contains a Relaxed Deformable Transformer (RDT) with Uformer based offsets estimation for motion estimation and compensation, a Multi-Granularity Prediction (MGP) module based on multi-reference frames for prediction refinement, and a Spatial Feature Distribution prior based Transformer (SFD-T) for efficient temporal-spatial joint residual compression. Experimental results demonstrate that our method achieves the best result with 13.5% BD-Rate saving over VTM.
arXiv Detail & Related papers (2023-09-21T09:23:13Z)
Shuffled Autoregression For Motion Interpolation [53.61556200049156]
This work aims to provide a deep-learning solution for the motion interpolation task. We propose a novel framework, referred to as Shuffled AutoRegression, which expands the autoregression to generate in arbitrary (shuffled) order. We also propose an approach to constructing a particular kind of dependency graph, with three stages assembled into an end-to-end spatial-temporal motion Transformer.
arXiv Detail & Related papers (2023-06-10T07:14:59Z)
Enhanced Bi-directional Motion Estimation for Video Frame Interpolation [0.05541644538483946]
We present a novel yet effective algorithm for motion-based video frame interpolation. Our method achieves excellent performance on a broad range of video frame interpolation benchmarks.
arXiv Detail & Related papers (2022-06-17T06:08:43Z)
Learning a Generative Motion Model from Image Sequences based on a Latent Motion Matrix [8.774604259603302]
We learn a probabilistic motion model from a sequence of images for spatio-temporal registration. We show improved registration accuracy and temporally smoother deformations compared to three state-of-the-art registration algorithms. We also demonstrate the model's applicability for motion analysis, simulation and super-resolution by an improved motion reconstruction from sequences with missing frames.
arXiv Detail & Related papers (2020-11-03T14:44:09Z)
All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling [52.425236515695914]
State-of-the-art methods are iterative solutions interpolating one frame at a time. This work introduces a true multi-frame interpolator. It utilizes a pyramidal style network in the temporal domain to complete the multi-frame task in one shot.
arXiv Detail & Related papers (2020-07-23T02:34:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.