HRVGAN: High Resolution Video Generation using Spatio-Temporal GAN
- URL: http://arxiv.org/abs/2008.09646v2
- Date: Mon, 12 Jul 2021 05:47:07 GMT
- Title: HRVGAN: High Resolution Video Generation using Spatio-Temporal GAN
- Authors: Abhinav Sagar
- Abstract summary: We present a novel network for high resolution video generation.
Our network uses ideas from Wasserstein GANs by enforcing a k-Lipschitz constraint on the loss term and Conditional GANs using class labels for training and testing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a novel network for high resolution video
generation. Our network uses ideas from Wasserstein GANs by enforcing
k-Lipschitz constraint on the loss term and Conditional GANs using class labels
for training and testing. We present Generator and Discriminator network
layerwise details along with the combined network architecture, optimization
details and algorithm used in this work. Our network uses a combination of two
loss terms: mean square pixel loss and an adversarial loss. The datasets used
for training and testing our network are UCF101, Golf and Aeroplane Datasets.
Using Inception Score and Fréchet Inception Distance as the evaluation
metrics, our network outperforms previous state-of-the-art networks on
unsupervised video generation.
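As a rough illustration of the two loss terms named in the abstract, the sketch below combines a Wasserstein-style adversarial term with a mean square pixel loss. It is not the paper's implementation. The function names, the `pixel_weight` factor, and the use of weight clipping to approximate the Lipschitz constraint are all assumptions made for this example, since the abstract does not say how the terms are weighted or how the constraint is enforced.

```python
import numpy as np

def generator_loss(critic_fake, fake_video, real_video, pixel_weight=1.0):
    """Adversarial term plus mean square pixel loss (pixel_weight is hypothetical)."""
    # Wasserstein-style generator term: raise the critic's score on generated videos.
    adversarial = -np.mean(critic_fake)
    # Mean square pixel loss between generated and reference frames.
    pixel = np.mean((fake_video - real_video) ** 2)
    return adversarial + pixel_weight * pixel

def critic_loss(critic_real, critic_fake):
    """Wasserstein critic loss: score real videos high and generated ones low."""
    return np.mean(critic_fake) - np.mean(critic_real)

def clip_weights(weights, c=0.01):
    """Weight clipping as in the original WGAN; assumed here, not stated in the abstract."""
    return [np.clip(w, -c, c) for w in weights]

# Toy shapes: (batch, frames, height, width, channels).
real = np.zeros((2, 4, 8, 8, 3))
fake = np.ones_like(real)
g_loss = generator_loss(np.array([0.5, 1.5]), fake, real)
d_loss = critic_loss(np.array([2.0, 2.0]), np.array([0.5, 1.5]))
```

In a conditional setup, the class label would be an extra input to both the generator and the critic; that wiring is omitted here to keep the sketch short.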
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- Landslide Detection and Segmentation Using Remote Sensing Images and Deep Neural Network [42.59806784981723]
Building upon findings from 2022 Landslide4Sense Competition, we propose a deep neural network based system for landslide detection and segmentation.
We use a U-Net trained with Cross Entropy loss as baseline model.
We then improve the U-Net baseline model by leveraging a wide range of deep learning techniques.
arXiv Detail & Related papers (2023-12-27T20:56:55Z)
- Network state Estimation using Raw Video Analysis: vQoS-GAN based non-intrusive Deep Learning Approach [5.8010446129208155]
vQoS GAN can estimate the network state parameters from the degraded received video data.
A robust and unique deep learning network model is trained with the video data along with data-rate and packet-loss class labels.
The proposed semi-supervised generative adversarial network can additionally reconstruct the degraded video data to its original form for a better end-user experience.
arXiv Detail & Related papers (2022-03-22T10:42:19Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected-version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and parameterization of the latent image by a CNN.
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Monocular Depth Estimation Using Multi Scale Neural Network And Feature Fusion [0.0]
Our network uses two different blocks: the first uses different filter sizes for convolution and merges all the individual feature maps.
The second block uses dilated convolutions in place of fully connected layers, thus reducing computations and increasing the receptive field.
We train and test our network on the Make3D, NYU Depth V2, and KITTI datasets using standard evaluation metrics for depth estimation, comprising RMSE loss and SILog loss.
arXiv Detail & Related papers (2020-09-11T18:08:52Z)
- Medical Image Segmentation Using a U-Net type of Architecture [0.0]
We argue that the architecture of U-Net, when combined with a supervised training strategy at the bottleneck layer, can produce comparable results with the original U-Net architecture.
We introduce a fully supervised, FC-layer-based pixel-wise loss at the bottleneck of the encoder branch of U-Net.
The two-layer FC sub-net will train the bottleneck representation to contain more semantic information, which will be used by the decoder layers to predict the final segmentation map.
arXiv Detail & Related papers (2020-05-11T16:10:18Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Learning the Loss Functions in a Discriminative Space for Video Restoration [48.104095018697556]
We propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration task.
Our framework is similar to GANs in that we iteratively train two networks - a generator and a loss network.
Experiments on video superresolution and deblurring show that our method generates visually more pleasing videos.
arXiv Detail & Related papers (2020-03-20T06:58:27Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.