Non-Adversarial Video Synthesis with Learned Priors
- URL: http://arxiv.org/abs/2003.09565v3
- Date: Fri, 17 Apr 2020 20:54:58 GMT
- Title: Non-Adversarial Video Synthesis with Learned Priors
- Authors: Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman
Asif, Amit K. Roy-Chowdhury
- Abstract summary: We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning.
Our approach generates superior quality videos compared to the existing state-of-the-art methods.
- Score: 53.26777815740381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the existing works in video synthesis focus on generating videos
using adversarial learning. Despite their success, these methods often require
an input reference frame or fail to generate diverse videos from the given data
distribution, with little to no uniformity in the quality of the generated
videos. Different from these methods, we focus on the problem of generating
videos from latent noise vectors, without any reference input frames. To this
end, we develop a novel approach that jointly optimizes the input latent space,
the weights of a recurrent neural network and a generator through
non-adversarial learning. Optimizing for the input latent space along with the
network weights allows us to generate videos in a controlled environment, i.e.,
we can faithfully generate all videos the model has seen during the learning
process as well as new unseen videos. Extensive experiments on three
challenging and diverse datasets demonstrate that our approach generates
superior-quality videos compared to existing state-of-the-art methods.
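As a rough illustration of the non-adversarial training scheme the abstract describes, the sketch below jointly optimizes per-video latent codes, a GRU that unrolls each code into per-frame latents, and a convolutional frame generator under a plain reconstruction loss. The network shapes, the L1 objective, and the GLO-style re-normalization of the latents are assumptions made for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class FrameGenerator(nn.Module):
    """Maps a per-frame latent vector to a 32x32 RGB frame (toy deconv stack)."""
    def __init__(self, z_dim=128, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 4, 4, 1, 0), nn.ReLU(True),   # 1x1  -> 4x4
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.ReLU(True),  # 4x4  -> 8x8
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.ReLU(True),      # 8x8  -> 16x16
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh(),               # 16x16 -> 32x32
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

z_dim, num_videos, T = 128, 100, 16
videos = torch.randn(num_videos, T, 3, 32, 32)      # placeholder training videos in [-1, 1]

# One learnable latent code per training video (the "input latent space").
latents = nn.Parameter(torch.randn(num_videos, z_dim))
rnn = nn.GRU(z_dim, z_dim, batch_first=True)        # learned temporal prior over frame latents
generator = FrameGenerator(z_dim)

opt = torch.optim.Adam([latents, *rnn.parameters(), *generator.parameters()], lr=1e-3)
recon = nn.L1Loss()                                  # purely non-adversarial objective

for step in range(1000):
    idx = torch.randint(0, num_videos, (8,))
    z = latents[idx]                                 # (B, z_dim) video-level codes
    # Unroll each video code T times through the RNN to obtain per-frame latents.
    z_seq, _ = rnn(z.unsqueeze(1).expand(-1, T, -1).contiguous())   # (B, T, z_dim)
    frames = generator(z_seq.reshape(-1, z_dim))     # (B*T, 3, 32, 32)
    target = videos[idx].reshape(-1, 3, 32, 32)
    loss = recon(frames, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                            # keep codes on the unit sphere (GLO-style)
        latents.data = latents / latents.norm(dim=1, keepdim=True).clamp_min(1e-8)
```

After training, a seen video can be reproduced from its optimized code, and a new code (for example, an interpolation between learned latents) can be pushed through the same RNN and generator to synthesize an unseen video; the paper's exact procedure for sampling new codes is not specified in this summary.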
Related papers
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation [51.430833518070145]
We propose a framework called MotionPrompt that guides the video generation process via optical flow.
We optimize learnable token embeddings during reverse sampling steps by using gradients from a trained discriminator applied to random frame pairs.
This approach allows our method to generate visually coherent video sequences that closely reflect natural motion dynamics, without compromising the fidelity of the generated content.
arXiv Detail & Related papers (2024-11-23T12:26:52Z) - Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient [12.07088416665005]
We propose RL-V2V-GAN, a new deep neural network approach for conditional video-to-video synthesis.
While preserving the style of the source video domain, our approach aims to learn a mapping from a source video domain to a target video domain.
Our experiments show that RL-V2V-GAN can produce temporally coherent video results.
arXiv Detail & Related papers (2024-10-28T01:35:10Z) - SF-V: Single Forward Video Generation Model [57.292575082410785]
We propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune pre-trained models.
Experiments demonstrate that our method achieves competitive generation quality of synthesized videos with significantly reduced computational overhead.
arXiv Detail & Related papers (2024-06-06T17:58:27Z) - Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization [52.63845811751936]
Video pre-training is challenging due to the difficulty of modeling its temporal dynamics.
In this paper, we address such limitations in video pre-training with an efficient video decomposition.
Our framework is capable of both comprehending and generating image and video content, as demonstrated by its performance across 13 multimodal benchmarks.
arXiv Detail & Related papers (2024-02-05T16:30:49Z) - Video Generation Beyond a Single Clip [76.5306434379088]
Video generation models can only generate video clips that are relatively short compared with the length of real videos.
To generate long videos covering diverse content and multiple events, we propose to use additional guidance to control the video generation process.
The proposed approach is complementary to existing efforts on video generation, which focus on generating realistic video within a fixed time window.
arXiv Detail & Related papers (2023-04-15T06:17:30Z) - Autoencoding Video Latents for Adversarial Video Generation [0.0]
AVLAE is a two-stream latent autoencoder where the video distribution is learned by adversarial training.
We demonstrate that our approach learns to disentangle motion and appearance codes even without the explicit structural composition in the generator.
arXiv Detail & Related papers (2022-01-18T11:42:14Z) - Strumming to the Beat: Audio-Conditioned Contrastive Video Textures [112.6140796961121]
We introduce a non-parametric approach for infinite video texture synthesis using a representation learned via contrastive learning.
We take inspiration from Video Textures, which showed that plausible new videos could be generated from a single one by stitching its frames together in a novel yet consistent order.
Our model outperforms baselines on human perceptual scores, can handle a diverse range of input videos, and can combine semantic and audio-visual cues in order to synthesize videos that synchronize well with an audio signal.
arXiv Detail & Related papers (2021-04-06T17:24:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.