Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications
- URL: http://arxiv.org/abs/2008.02793v2
- Date: Mon, 30 Nov 2020 18:18:55 GMT
- Title: Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications
- Authors: Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya
- Abstract summary: The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks.
We provide an overview of GANs with a special focus on algorithms and applications for visual synthesis.
- Score: 46.86183957129848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks, allowing the synthesis of visual content in an unconditional or input-conditional manner. It has enabled the generation of high-resolution photorealistic images and videos, tasks that were challenging or impossible with prior methods, and it has led to many new applications in content creation. In this paper, we provide an overview of GANs with a special focus on algorithms and applications for visual synthesis. We cover several important techniques for stabilizing GAN training, which is notoriously difficult, and discuss applications to image translation, image processing, video synthesis, and neural rendering.
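As a concrete illustration of the adversarial setup and the stabilization techniques the abstract alludes to, here is a minimal GAN training loop sketch in PyTorch. The toy MLP generator and discriminator, the synthetic data, and all hyperparameters are illustrative assumptions, not anything from the paper; the sketch shows two widely used stabilization tricks, the non-saturating generator loss and one-sided label smoothing.

```python
# Minimal GAN training sketch (illustrative assumptions throughout).
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 2, 64
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

for step in range(1000):
    real = torch.randn(batch, data_dim) * 0.5 + 2.0  # stand-in "real" data
    fake = G(torch.randn(batch, latent_dim))

    # Discriminator update: real labels smoothed to 0.9 (one-sided smoothing),
    # a common trick to keep D from becoming overconfident.
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.full((batch, 1), 0.9)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    loss_d.backward()
    opt_d.step()

    # Generator update with the non-saturating loss: maximize log D(G(z))
    # instead of minimizing log(1 - D(G(z))), which gives stronger gradients
    # early in training.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()
```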
Related papers
- Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance [36.26032505627126]
Recent advancements in text-to-video synthesis have unveiled the potential to generate videos from text prompts alone.
In this paper, we explore customized video generation by utilizing text as the context description and a motion structure as concrete guidance.
Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model.
arXiv Detail & Related papers (2023-06-01T17:43:27Z)
- Learning Universal Policies via Text-Guided Video Generation [179.6347119101618]
A goal of artificial intelligence is to construct an agent that can solve a wide variety of tasks.
Recent progress in text-guided image synthesis has yielded models with an impressive ability to generate complex novel images.
We investigate whether such tools can be used to construct more general-purpose agents.
arXiv Detail & Related papers (2023-01-31T21:28:13Z)
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation and content reconstruction, along with coarse-to-fine-grained adversarial reasoning.
arXiv Detail & Related papers (2021-12-09T18:59:21Z)
- A Survey on Adversarial Image Synthesis [0.0]
We provide a taxonomy of methods used in image synthesis, review different models for text-to-image synthesis and image-to-image translation, and discuss some evaluation metrics as well as possible future research directions in image synthesis with GANs.
arXiv Detail & Related papers (2021-06-30T13:31:48Z)
- A Good Image Generator Is What You Need for High-Resolution Video Synthesis [73.82857768949651]
We present a framework that leverages contemporary image generators to render high-resolution videos.
We frame the video synthesis problem as discovering a trajectory in the latent space of a pre-trained and fixed image generator.
We introduce a motion generator that discovers the desired trajectory, in which content and motion are disentangled.
arXiv Detail & Related papers (2021-04-30T15:38:41Z)
- TiVGAN: Text to Image to Video Generation with Step-by-Step Evolutionary Generator [34.7504057664375]
We propose a novel training framework, Text-to-Image-to-Video Generative Adversarial Network (TiVGAN), which evolves frame-by-frame and finally produces a full-length video.
This step-by-step learning process helps stabilize training and enables the creation of high-resolution videos from conditional text descriptions.
arXiv Detail & Related papers (2020-09-04T06:33:08Z)
- Efficient Neural Architecture for Text-to-Image Synthesis [6.166295570030645]
We show that an effective neural architecture can achieve state-of-the-art performance using a single stage training with a single generator and a single discriminator.
Our work points to a new direction for text-to-image research, an area that has seen little recent experimentation with novel neural architectures.
arXiv Detail & Related papers (2020-04-23T19:33:40Z)
- Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation [54.17177006826262]
We develop a new generic conditional image synthesis method based on Implicit Maximum Likelihood Estimation (IMLE).
We demonstrate improved multimodal image synthesis performance on two tasks, single image super-resolution and image synthesis from scene layouts.
arXiv Detail & Related papers (2020-04-07T03:06:55Z)
- Non-Adversarial Video Synthesis with Learned Priors [53.26777815740381]
We focus on the problem of generating videos from latent noise vectors, without any reference input frames.
We develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning (a rough sketch of this joint optimization appears after this entry).
Our approach generates videos of superior quality compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2020-03-21T02:57:33Z)
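As a rough illustration of the non-adversarial idea in the last entry, the sketch below jointly optimizes a learned input latent, a recurrent network, and a frame generator against a plain reconstruction loss, with no discriminator involved. The architectures, data, sizes, and loss here are illustrative assumptions, not the authors' exact model.

```python
# Toy non-adversarial video synthesis sketch (all details are assumptions).
import torch
import torch.nn as nn

T, latent_dim, frame_dim = 8, 32, 64        # frames per clip, sizes (assumed)
target = torch.rand(T, frame_dim)           # stand-in ground-truth video

z0 = nn.Parameter(torch.randn(1, latent_dim))            # learned input latent
rnn = nn.GRU(latent_dim, latent_dim, batch_first=True)   # temporal model
gen = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                    nn.Linear(128, frame_dim), nn.Sigmoid())

# Latent, RNN, and generator are all optimized jointly.
params = [z0] + list(rnn.parameters()) + list(gen.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(500):
    opt.zero_grad()
    # Unroll the latent through time, then decode each step into a frame.
    seq = z0.expand(1, T, latent_dim)          # constant input at each step
    out, _ = rnn(seq)                          # (1, T, latent_dim)
    frames = gen(out.squeeze(0))               # (T, frame_dim)
    loss = torch.mean((frames - target) ** 2)  # pure reconstruction, no D
    loss.backward()
    opt.step()
```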
This list is automatically generated from the titles and abstracts of the papers on this site.