A Loss Function for Generative Neural Networks Based on Watson's Perceptual Model
- URL: http://arxiv.org/abs/2006.15057v3
- Date: Wed, 6 Jan 2021 11:16:21 GMT
- Title: A Loss Function for Generative Neural Networks Based on Watson's Perceptual Model
- Authors: Steffen Czolbe, Oswin Krause, Ingemar Cox, Christian Igel
- Abstract summary: To train Variational Autoencoders (VAEs) to generate realistic imagery requires a loss function that reflects human perception of image similarity.
We propose such a loss function based on Watson's perceptual model, which computes a weighted distance in frequency space and accounts for luminance and contrast masking.
In experiments, VAEs trained with the new loss function generated realistic, high-quality image samples.
- Score: 14.1081872409308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To train Variational Autoencoders (VAEs) to generate realistic imagery
requires a loss function that reflects human perception of image similarity. We
propose such a loss function based on Watson's perceptual model, which computes
a weighted distance in frequency space and accounts for luminance and contrast
masking. We extend the model to color images, increase its robustness to
translation by using the Fourier Transform, remove artifacts due to splitting
the image into blocks, and make it differentiable. In experiments, VAEs trained
with the new loss function generated realistic, high-quality image samples.
Compared to using the Euclidean distance and the Structural Similarity Index,
the images were less blurry; compared to deep neural network based losses, the
new approach required fewer computational resources and generated images with
fewer artifacts.
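The frequency-space idea in the abstract can be illustrated with a minimal sketch. This is not the paper's Watson-model loss (which adds perceptual sensitivity weights plus luminance and contrast masking and extends to color); it only shows why comparing Fourier magnitudes with per-frequency weights yields robustness to translation. The uniform `weights` array below is a hypothetical stand-in for a contrast-sensitivity table.

```python
import numpy as np

def freq_weighted_loss(x, y, weights=None):
    """Weighted distance between two grayscale images in frequency space.

    Simplified sketch of the idea behind a Watson-style loss: compare
    FFT magnitudes (rather than pixels), optionally weighting each
    frequency. `weights` is a hypothetical stand-in for a perceptual
    contrast-sensitivity table; uniform weights are used by default.
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    if weights is None:
        weights = np.ones_like(x, dtype=float)
    # Compare magnitudes: a pure (circular) translation changes the
    # phase of the spectrum, not its magnitude.
    diff = np.abs(X) - np.abs(Y)
    return float(np.sqrt(np.sum(weights * diff ** 2)))
```

A circularly shifted image has the same Fourier magnitudes, so this loss is numerically zero for a shifted copy even though the pixel-wise distance is large.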
Related papers
- Equipping Diffusion Models with Differentiable Spatial Entropy for Low-Light Image Enhancement [7.302792947244082]
In this work, we propose a novel method that shifts the focus from a deterministic pixel-by-pixel comparison to a statistical perspective.
The core idea is to introduce spatial entropy into the loss function to measure the distribution difference between predictions and targets.
Specifically, we equip the entropy with diffusion models and aim for superior accuracy and enhanced perceptual quality over l1 based noise matching loss.
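The statistical idea above can be sketched with plain intensity histograms. The paper's differentiable spatial entropy is more elaborate, and the absolute-difference comparison below is an illustrative assumption, not the paper's formulation; the sketch only shows the shift from pixel-wise to distribution-level comparison.

```python
import numpy as np

def spatial_entropy(img, bins=16):
    """Shannon entropy of an image's intensity histogram.

    Summarizes an image as a distribution instead of a pixel grid,
    in the spirit of the statistical perspective described above.
    """
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                      # 0 * log 0 := 0
    return float(-np.sum(p * np.log2(p)))

def entropy_loss(pred, target, bins=16):
    # Illustrative comparison of the two distribution summaries;
    # the paper's loss compares distributions more directly.
    return abs(spatial_entropy(pred, bins) - spatial_entropy(target, bins))
```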
arXiv Detail & Related papers (2024-04-15T12:35:10Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
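Of the terms combined in this ensemble, the Charbonnier loss is the simplest to state: a smooth, robust variant of the L1 distance. A minimal sketch (the `eps` value is a conventional choice, not taken from the paper):

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: a differentiable, robust variant of L1.

    sqrt(d^2 + eps^2) behaves like |d| for large residuals but is
    smooth at zero, which is why it is a popular reconstruction term
    in restoration and compression models.
    """
    d = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(np.sqrt(d ** 2 + eps ** 2)))
```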
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing transformer features between the recovered image and the target, the pretrained transformer provides high-resolution, blur-sensitive semantic information.
One approach treats the features as vectors and computes the discrepancy between representations extracted from the recovered and target images in Euclidean space.
arXiv Detail & Related papers (2023-03-24T14:14:25Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, while Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- Just Noticeable Difference for Machine Perception and Generation of Regularized Adversarial Images with Minimal Perturbation [8.920717493647121]
We introduce a measure for machine perception inspired by the concept of Just Noticeable Difference (JND) of human perception.
We suggest an adversarial image generation algorithm that iteratively distorts an image with additive noise until the machine learning model detects the change by outputting a different (false) label.
We evaluate the adversarial images generated by our algorithm both qualitatively and quantitatively on CIFAR10, ImageNet, and MS COCO datasets.
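The iterative scheme described above can be sketched generically. The Gaussian step and the plain stopping rule below are simplifying assumptions of this sketch; the paper additionally regularizes the perturbation and evaluates against real models on the listed datasets.

```python
import numpy as np

def minimal_perturbation(x, predict, rng, step=0.01, max_iters=1000):
    """Distort `x` with additive noise until `predict` changes its label.

    Self-contained sketch of the iterative scheme summarized above.
    `predict` stands in for any classifier; the Gaussian step size and
    the stopping rule are illustrative choices, not the paper's.
    """
    original = predict(x)
    adv = x.copy()
    for _ in range(max_iters):
        adv = adv + step * rng.standard_normal(x.shape)
        if predict(adv) != original:
            return adv  # first point where the model notices the change
    return None  # no label flip within the iteration budget
```

Plugging in a real model's prediction function recovers the described procedure; the returned image is the first distorted sample the model labels differently.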
arXiv Detail & Related papers (2021-02-16T11:01:55Z)
- Neural Re-Rendering of Humans from a Single Image [80.53438609047896]
We propose a new method for neural re-rendering of a human under a novel user-defined pose and viewpoint.
Our algorithm represents body pose and shape as a parametric mesh which can be reconstructed from a single image.
arXiv Detail & Related papers (2021-01-11T18:53:47Z)
- Projected Distribution Loss for Image Enhancement [15.297569497776374]
We show that aggregating 1D-Wasserstein distances between CNN activations is more reliable than the existing approaches.
In imaging applications such as denoising, super-resolution, demosaicing, deblurring and JPEG artifact removal, the proposed learning loss outperforms the current state-of-the-art on reference-based perceptual losses.
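The 1D-Wasserstein aggregation is easy to sketch because, in one dimension with equal sample counts, optimal transport pairs the sorted samples, so W1 reduces to a mean absolute difference after sorting. The sketch below applies it to raw value rows as a stand-in for the CNN activation channels the paper uses.

```python
import numpy as np

def wasserstein_1d(a, b):
    """W1 distance between two 1D empirical distributions of equal size.

    In 1D, optimal transport matches sorted samples, so W1 is the
    mean absolute difference after sorting both sample sets.
    """
    a, b = np.sort(np.ravel(a)), np.sort(np.ravel(b))
    if a.size != b.size:
        raise ValueError("equal sample counts required in this sketch")
    return float(np.mean(np.abs(a - b)))

def projected_distribution_loss(x, y):
    # Aggregate W1 over rows (stand-ins for feature channels).
    return float(np.mean([wasserstein_1d(xr, yr) for xr, yr in zip(x, y)]))
```

Note that a permutation of the same values has W1 distance zero, which is exactly the distribution-level (rather than pixel-level) behavior the entry describes.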
arXiv Detail & Related papers (2020-12-16T22:13:03Z)
- Exploring Intensity Invariance in Deep Neural Networks for Brain Image Registration [0.0]
We investigate the effect of intensity distribution among input image pairs for deep learning-based image registration methods.
Deep learning models trained with a structural similarity-based loss seem to perform better on both datasets.
arXiv Detail & Related papers (2020-09-21T17:49:03Z)
- Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate the advantages of both.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
- Neural Sparse Representation for Image Restoration [116.72107034624344]
Inspired by the robustness and efficiency of sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
Our method structurally enforces sparsity constraints upon hidden neurons.
Experiments show that sparse representation is crucial in deep neural networks for multiple image restoration tasks.
arXiv Detail & Related papers (2020-06-08T05:15:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.