Texture-aware Video Frame Interpolation
- URL: http://arxiv.org/abs/2102.13520v1
- Date: Fri, 26 Feb 2021 14:46:56 GMT
- Title: Texture-aware Video Frame Interpolation
- Authors: Duolikun Danier and David Bull
- Abstract summary: We study the impact of video textures on video frame interpolation, and propose a novel framework where, given an interpolation algorithm, separate models are trained on different textures.
Our study shows that video texture has a significant impact on the performance of frame interpolation models and that it is beneficial to have separate models specifically adapted to these texture classes, instead of training a single model that tries to learn generic motion.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Temporal interpolation has the potential to be a powerful tool for video
compression. Existing methods for frame interpolation do not discriminate
between video textures and generally invoke a single general model capable of
interpolating a wide range of video content. However, past work on video
texture analysis and synthesis has shown that different textures exhibit vastly
different motion characteristics and they can be divided into three classes
(static, dynamic continuous and dynamic discrete). In this work, we study the
impact of video textures on video frame interpolation, and propose a novel
framework where, given an interpolation algorithm, separate models are trained
on different textures. Our study shows that video texture has significant
impact on the performance of frame interpolation models and it is beneficial to
have separate models specifically adapted to these texture classes, instead of
training a single model that tries to learn generic motion. Our results
demonstrate that models fine-tuned using our framework achieve, on average, a
0.3dB gain in PSNR on the test set used.
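The abstract outlines a dispatch-style design: classify the local video texture (static, dynamic continuous, or dynamic discrete), then apply an interpolation model adapted to that texture class. The Python sketch below is only an illustration of this idea under stated assumptions; the TextureAwareInterpolator class, the texture_classifier, and the per-class models are hypothetical placeholders, not the authors' implementation.

```python
import torch

# Texture taxonomy from the paper: static, dynamic continuous, dynamic discrete.
TEXTURE_CLASSES = ("static", "dynamic_continuous", "dynamic_discrete")


class TextureAwareInterpolator:
    """Routes frame interpolation to a model adapted per texture class.

    `texture_classifier` and the per-class `models` are placeholders: any
    texture classifier and any base interpolation network (e.g. one copy
    fine-tuned per class) could be plugged in.
    """

    def __init__(self, texture_classifier, models: dict):
        # Expect exactly one interpolation model per texture class.
        assert set(models) == set(TEXTURE_CLASSES)
        self.texture_classifier = texture_classifier
        self.models = models

    @torch.no_grad()
    def interpolate(self, frame0: torch.Tensor, frame1: torch.Tensor) -> torch.Tensor:
        # Classify the texture of the input frame pair, then dispatch to the
        # interpolation model adapted to that class.
        texture = self.texture_classifier(frame0, frame1)  # one of TEXTURE_CLASSES
        model = self.models[texture]
        return model(frame0, frame1)  # predicted intermediate frame
```

A usage sketch under the same assumptions: instantiate the class with three fine-tuned copies of a base interpolation network, e.g. `TextureAwareInterpolator(classifier, {"static": m_s, "dynamic_continuous": m_c, "dynamic_discrete": m_d})`, and call `interpolate(frame0, frame1)` on consecutive frames.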
Related papers
- Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models [54.35214051961381]
3D meshes are widely used in computer vision and graphics for their efficiency in animation and minimal memory use in movies, games, AR, and VR.
However, creating temporally consistent and realistic textures for meshes remains labor-intensive for professional artists.
We present Tex4D, which integrates inherent geometry from mesh sequences with video diffusion models to produce consistent textures.
arXiv Detail & Related papers (2024-10-14T17:59:59Z)
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation [81.90265212988844]
We propose a training-free method that adapts generative video models for high frame rate generation in a plug-and-play manner.
We transform a video model into a self-cascaded video diffusion model by inserting designed hidden state correction modules.
Our training-free method is even comparable to trained models supported by huge compute resources and large-scale datasets.
arXiv Detail & Related papers (2024-06-03T00:31:13Z)
- FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline [4.295130967329365]
This paper presents a new two-stage latent diffusion text-to-video generation architecture based on the text-to-image diffusion model.
The design of our model significantly reduces computational costs compared to other masked frame approaches.
We evaluate different configurations of the MoVQ-based video decoding scheme to improve consistency and achieve better PSNR, SSIM, MSE, and LPIPS scores.
arXiv Detail & Related papers (2023-11-22T00:26:15Z)
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation [93.18163456287164]
This paper proposes a novel text-guided video-to-video translation framework to adapt image models to videos.
Our framework achieves global style and local texture temporal consistency at a low cost.
arXiv Detail & Related papers (2023-06-13T17:52:23Z)
- Efficient Video Segmentation Models with Per-frame Inference [117.97423110566963]
We focus on improving the temporal consistency without introducing overhead in inference.
We propose several techniques to learn from the video sequence, including a temporal consistency loss and online/offline knowledge distillation methods.
arXiv Detail & Related papers (2022-02-24T23:51:36Z)
- Render In-between: Motion Guided Video Synthesis for Action Interpolation [53.43607872972194]
We propose a motion-guided frame-upsampling framework that is capable of producing realistic human motion and appearance.
A novel motion model is trained to infer the non-linear skeletal motion between frames by leveraging a large-scale motion-capture dataset.
Our pipeline requires only low-frame-rate videos and unpaired human motion data for training; no high-frame-rate videos are needed.
arXiv Detail & Related papers (2021-11-01T15:32:51Z)
- Strumming to the Beat: Audio-Conditioned Contrastive Video Textures [112.6140796961121]
We introduce a non-parametric approach for infinite video texture synthesis using a representation learned via contrastive learning.
We take inspiration from Video Textures, which showed that plausible new videos could be generated from a single one by stitching its frames together in a novel yet consistent order.
Our model outperforms baselines on human perceptual scores, can handle a diverse range of input videos, and can combine semantic and audio-visual cues in order to synthesize videos that synchronize well with an audio signal.
arXiv Detail & Related papers (2021-04-06T17:24:57Z)