Non-Adversarial Video Synthesis with Learned Priors
- URL: http://arxiv.org/abs/2003.09565v3
- Date: Fri, 17 Apr 2020 20:54:58 GMT
- Title: Non-Adversarial Video Synthesis with Learned Priors
- Authors: Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman
Asif, Amit K. Roy-Chowdhury
- Abstract summary: We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning.
Our approach generates superior quality videos compared to the existing state-of-the-art methods.
- Score: 53.26777815740381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the existing works in video synthesis focus on generating videos
using adversarial learning. Despite their success, these methods often require
an input reference frame or fail to generate diverse videos from the given data
distribution, with little to no uniformity in the quality of the generated
videos. Different from these methods, we focus on the problem of generating
videos from latent noise vectors, without any reference input frames. To this
end, we develop a novel approach that jointly optimizes the input latent space,
the weights of a recurrent neural network and a generator through
non-adversarial learning. Optimizing for the input latent space along with the
network weights allows us to generate videos in a controlled environment, i.e.,
we can faithfully generate all videos the model has seen during the learning
process as well as new unseen videos. Extensive experiments on three
challenging and diverse datasets demonstrate that our approach generates
superior-quality videos compared to existing state-of-the-art methods.
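As a rough illustration of the non-adversarial training scheme the abstract describes, the sketch below jointly optimizes per-video latent codes, a GRU that unrolls each code into per-frame latents, and a convolutional frame generator under a plain reconstruction loss. The network shapes, the L1 objective, and the GLO-style re-normalization of the latents are assumptions made for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class FrameGenerator(nn.Module):
    """Maps a per-frame latent vector to a 32x32 RGB frame (toy deconv stack)."""
    def __init__(self, z_dim=128, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 4, 4, 1, 0), nn.ReLU(True),   # 1x1  -> 4x4
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1), nn.ReLU(True),  # 4x4  -> 8x8
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1), nn.ReLU(True),      # 8x8  -> 16x16
            nn.ConvTranspose2d(ch, 3, 4, 2, 1), nn.Tanh(),               # 16x16 -> 32x32
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

z_dim, num_videos, T = 128, 100, 16
videos = torch.randn(num_videos, T, 3, 32, 32)      # placeholder training videos in [-1, 1]

# One learnable latent code per training video (the "input latent space").
latents = nn.Parameter(torch.randn(num_videos, z_dim))
rnn = nn.GRU(z_dim, z_dim, batch_first=True)        # learned temporal prior over frame latents
generator = FrameGenerator(z_dim)

opt = torch.optim.Adam([latents, *rnn.parameters(), *generator.parameters()], lr=1e-3)
recon = nn.L1Loss()                                  # purely non-adversarial objective

for step in range(1000):
    idx = torch.randint(0, num_videos, (8,))
    z = latents[idx]                                 # (B, z_dim) video-level codes
    # Unroll each video code T times through the RNN to obtain per-frame latents.
    z_seq, _ = rnn(z.unsqueeze(1).expand(-1, T, -1).contiguous())   # (B, T, z_dim)
    frames = generator(z_seq.reshape(-1, z_dim))     # (B*T, 3, 32, 32)
    target = videos[idx].reshape(-1, 3, 32, 32)
    loss = recon(frames, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                            # keep codes on the unit sphere (GLO-style)
        latents.data = latents / latents.norm(dim=1, keepdim=True).clamp_min(1e-8)
```

After training, a seen video can be reproduced from its optimized code, and a new code (for example, an interpolation between learned latents) can be pushed through the same RNN and generator to synthesize an unseen video; the paper's exact procedure for sampling new codes is not specified in this summary.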
Related papers
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation [51.430833518070145]
We propose a framework called MotionPrompt that guides the video generation process via optical flow.
We optimize learnable token embeddings during reverse sampling steps by using gradients from a trained discriminator applied to random frame pairs.
This approach allows our method to generate visually coherent video sequences that closely reflect natural motion dynamics, without compromising the fidelity of the generated content.
arXiv Detail & Related papers (2024-11-23T12:26:52Z) - Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient [12.07088416665005]
We propose RL-V2V-GAN, a new deep neural network approach for conditional video-to-video synthesis.
While preserving the style of the source video domain, our approach aims to learn a mapping from a source video domain to a target video domain.
Our experiments show that RL-V2V-GAN can produce temporally coherent video results.
arXiv Detail & Related papers (2024-10-28T01:35:10Z) - SF-V: Single Forward Video Generation Model [57.292575082410785]
We propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune pre-trained models.
Experiments demonstrate that our method achieves competitive generation quality of synthesized videos with significantly reduced computational overhead.
arXiv Detail & Related papers (2024-06-06T17:58:27Z) - Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization [52.63845811751936]
Video pre-training is challenging due to the difficulty of modeling its temporal dynamics.
In this paper, we address such limitations in video pre-training with an efficient video decomposition.
Our framework is capable of both comprehending and generating image and video content, as demonstrated by its performance across 13 multimodal benchmarks.
arXiv Detail & Related papers (2024-02-05T16:30:49Z) - Video Generation Beyond a Single Clip [76.5306434379088]
Video generation models can only generate video clips that are relatively short compared with the length of real videos.
To generate long videos covering diverse content and multiple events, we propose to use additional guidance to control the video generation process.
The proposed approach is complementary to existing efforts on video generation, which focus on generating realistic video within a fixed time window.
arXiv Detail & Related papers (2023-04-15T06:17:30Z) - Autoencoding Video Latents for Adversarial Video Generation [0.0]
AVLAE is a two-stream latent autoencoder where the video distribution is learned by adversarial training.
We demonstrate that our approach learns to disentangle motion and appearance codes even without the explicit structural composition in the generator.
arXiv Detail & Related papers (2022-01-18T11:42:14Z) - Strumming to the Beat: Audio-Conditioned Contrastive Video Textures [112.6140796961121]
We introduce a non-parametric approach for infinite video texture synthesis using a representation learned via contrastive learning.
We take inspiration from Video Textures, which showed that plausible new videos could be generated from a single one by stitching its frames together in a novel yet consistent order.
Our model outperforms baselines on human perceptual scores, can handle a diverse range of input videos, and can combine semantic and audio-visual cues in order to synthesize videos that synchronize well with an audio signal.
arXiv Detail & Related papers (2021-04-06T17:24:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.