Learning the Loss Functions in a Discriminative Space for Video
Restoration
- URL: http://arxiv.org/abs/2003.09124v1
- Date: Fri, 20 Mar 2020 06:58:27 GMT
- Title: Learning the Loss Functions in a Discriminative Space for Video
Restoration
- Authors: Younghyun Jo, Jaeyeon Kang, Seoung Wug Oh, Seonghyeon Nam, Peter
Vajda, and Seon Joo Kim
- Abstract summary: We propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration task.
Our framework is similar to GANs in that we iteratively train two networks - a generator and a loss network.
Experiments on video super-resolution and deblurring show that our method generates visually more pleasing videos.
- Score: 48.104095018697556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With more advanced deep network architectures and learning schemes such as
GANs, the performance of video restoration algorithms has greatly improved
recently. Meanwhile, the loss functions for optimizing deep neural networks
remain relatively unchanged. To this end, we propose a new framework for
building effective loss functions by learning a discriminative space specific
to a video restoration task. Our framework is similar to GANs in that we
iteratively train two networks - a generator and a loss network. The generator
learns to restore videos in a supervised fashion, by following ground truth
features through the feature matching in the discriminative space learned by
the loss network. In addition, we also introduce a new relation loss in order
to maintain the temporal consistency in output videos. Experiments on video
super-resolution and deblurring show that our method generates visually more
pleasing videos with better quantitative perceptual metric values than other
state-of-the-art methods.
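To make the framework concrete, below is a minimal PyTorch sketch of the two losses the abstract describes: feature matching against a learned loss network, and a temporal relation loss. The names (LossNet, feature_matching_loss, relation_loss), the layer choices, and the exact form of the relation loss are illustrative assumptions based on the abstract, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LossNet(nn.Module):
    """Toy discriminative loss network that exposes intermediate features.
    Stands in for the paper's learned discriminative space (assumption)."""
    def __init__(self, ch=3, width=32):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, width, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(width, width * 2, 3, stride=2, padding=1)

    def forward(self, x):
        f1 = F.leaky_relu(self.conv1(x), 0.2)
        f2 = F.leaky_relu(self.conv2(f1), 0.2)
        return [f1, f2]

def feature_matching_loss(loss_net, restored, ground_truth):
    # Supervised feature matching: follow the ground-truth features in the
    # discriminative space learned by the loss network.
    fake_feats = loss_net(restored)
    real_feats = loss_net(ground_truth)
    return sum(F.l1_loss(f, r.detach()) for f, r in zip(fake_feats, real_feats))

def relation_loss(loss_net, restored_seq, gt_seq):
    # One plausible reading of the temporal relation loss: match
    # frame-to-frame feature differences between output and ground-truth
    # videos so the output stays temporally consistent.
    # Sequences are (batch, time, channels, height, width).
    total = 0.0
    for t in range(restored_seq.size(1) - 1):
        df = loss_net(restored_seq[:, t + 1])[-1] - loss_net(restored_seq[:, t])[-1]
        dr = loss_net(gt_seq[:, t + 1])[-1] - loss_net(gt_seq[:, t])[-1]
        total = total + F.l1_loss(df, dr.detach())
    return total / (restored_seq.size(1) - 1)
```

During training, the generator would minimize these two losses while the loss network is updated adversarially, playing a role analogous to a GAN discriminator.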
Related papers
- Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video
Retrieval [67.52910255064762]
We first design a simple dual-stream structure, including a temporal layer and a hash layer.
With the help of semantic similarity knowledge obtained from self-supervision, the hash layer learns to capture information for semantic retrieval.
In this way, the model naturally preserves the disentangled semantics into binary codes.
arXiv Detail & Related papers (2023-10-12T03:21:12Z)
- Video Event Restoration Based on Keyframes for Video Anomaly Detection [9.18057851239942]
Existing deep neural network based video anomaly detection (VAD) methods mostly follow the route of frame reconstruction or frame prediction.
We introduce a brand-new VAD paradigm to break through these limitations.
We propose a novel U-shaped Swin Transformer Network with Dual Skip Connections (USTN-DSC) for video event restoration.
arXiv Detail & Related papers (2023-04-11T10:13:19Z)
- Unlocking Masked Autoencoders as Loss Function for Image and Video Restoration [19.561055022474786]
We study the potential of the loss function and raise our belief that a "learned loss function empowers the learning capability of neural networks for image and video restoration."
We investigate the efficacy of this belief from three perspectives: 1) from task-customized MAE to native MAE, 2) from image task to video task, and 3) from transformer structure to convolutional neural network structure. (A minimal sketch of the frozen-MAE-as-loss idea appears after this list.)
arXiv Detail & Related papers (2023-03-29T02:41:08Z)
- CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning [7.4010632660248765]
A novel architecture is proposed that uses a heterogeneous loss function, called CRC loss, to learn improved visual features.
The proposed architecture, called CRC-RL, is shown to outperform existing state-of-the-art methods on the challenging DeepMind Control Suite environments.
arXiv Detail & Related papers (2023-01-31T08:41:18Z)
- Structured Sparsity Learning for Efficient Video Super-Resolution [99.1632164448236]
We develop a structured pruning scheme called Structured Sparsity Learning (SSL) according to the properties of video super-resolution (VSR) models.
In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.
arXiv Detail & Related papers (2022-06-15T17:36:04Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time Video Enhancement [132.60976158877608]
We propose an efficient adversarial video enhancement framework that learns directly from unpaired video examples.
In particular, our framework introduces new recurrent cells that consist of interleaved local and global modules for implicit integration of spatial and temporal information.
The proposed design allows our recurrent cells to efficiently propagate temporal information across frames and reduces the need for high-complexity networks.
arXiv Detail & Related papers (2020-12-24T00:03:29Z)
- A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
- HRVGAN: High Resolution Video Generation using Spatio-Temporal GAN [0.0]
We present a novel network for high resolution video generation.
Our network draws on Wasserstein GANs by enforcing a k-Lipschitz constraint on the loss term, and on Conditional GANs by using class labels for training and testing (a minimal sketch of such a Lipschitz-constrained critic loss appears after this list).
arXiv Detail & Related papers (2020-08-17T20:45:59Z)
- iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks [0.0]
We present iSeeBetter, a novel GAN-based spatio-temporal approach to video super-resolution (VSR).
iSeeBetter extracts spatial and temporal information from the current and neighboring frames using the concept of recurrent back-projection networks as its generator.
Our results demonstrate that iSeeBetter offers superior VSR fidelity and surpasses state-of-the-art performance.
arXiv Detail & Related papers (2020-06-13T01:36:30Z)
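As referenced in the "Unlocking Masked Autoencoders as Loss Function" entry above, here is a hedged sketch of using a frozen, pretrained MAE encoder as a feature-space loss; mae_encoder and mae_feature_loss are illustrative names, and the paper's actual recipe may differ.

```python
import torch
import torch.nn.functional as F

def mae_feature_loss(mae_encoder, restored, ground_truth):
    # Compare restored and clean frames in the feature space of a frozen,
    # pretrained masked autoencoder; only the restoration network receives
    # gradients. `mae_encoder` stands in for any pretrained MAE encoder.
    with torch.no_grad():
        target = mae_encoder(ground_truth)
    pred = mae_encoder(restored)
    return F.l1_loss(pred, target)
```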
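And for the HRVGAN entry, a minimal sketch of a WGAN-style critic update that enforces an approximate k-Lipschitz constraint via weight clipping, the mechanism used in the original Wasserstein GAN; whether HRVGAN uses clipping or another enforcement scheme is not specified by the summary above, and `critic_step` and `clip_value` are illustrative.

```python
import torch

def critic_step(critic, optimizer, real, fake, clip_value=0.01):
    # WGAN critic loss: maximize E[D(real)] - E[D(fake)],
    # i.e. minimize its negation below.
    optimizer.zero_grad()
    loss = critic(fake.detach()).mean() - critic(real).mean()
    loss.backward()
    optimizer.step()
    # Weight clipping keeps the critic approximately k-Lipschitz.
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip_value, clip_value)
    return loss.item()
```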
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.