Projected Distribution Loss for Image Enhancement
- URL: http://arxiv.org/abs/2012.09289v1
- Date: Wed, 16 Dec 2020 22:13:03 GMT
- Title: Projected Distribution Loss for Image Enhancement
- Authors: Mauricio Delbracio, Hossein Talebi, Peyman Milanfar
- Abstract summary: We show that aggregating 1D-Wasserstein distances between CNN activations is more reliable than the existing approaches.
In imaging applications such as denoising, super-resolution, demosaicing, deblurring and JPEG artifact removal, the proposed learning loss outperforms the current state-of-the-art on reference-based perceptual losses.
- Score: 15.297569497776374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Features obtained from object recognition CNNs have been widely used for measuring perceptual similarities between images. Such differentiable metrics can be used as perceptual learning losses to train image enhancement models. However, the choice of the distance function between input and target features may have a consequential impact on the performance of the trained model. While using the norm of the difference between extracted features leads to limited hallucination of details, measuring the distance between distributions of features may generate more textures; yet also more unrealistic details and artifacts. In this paper, we demonstrate that aggregating 1D-Wasserstein distances between CNN activations is more reliable than the existing approaches, and it can significantly improve the perceptual performance of enhancement models. More explicitly, we show that in imaging applications such as denoising, super-resolution, demosaicing, deblurring and JPEG artifact removal, the proposed learning loss outperforms the current state-of-the-art on reference-based perceptual losses. This means that the proposed learning loss can be plugged into different imaging frameworks and produce perceptually realistic results.
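The core idea, aggregating 1D-Wasserstein (projected) distances between CNN activations, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the truncated VGG16 feature extractor, the number of random projections, and the L1 aggregation over sorted projections are choices made for the example.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class ProjectedDistributionLoss(nn.Module):
    """Sketch of a sliced (1D-projected) Wasserstein loss on CNN activations.

    Assumptions: a single truncated VGG16 feature map as the activation
    source and an L1 mean over sorted random projections; the paper's exact
    layers, projection count, and aggregation may differ.
    """

    def __init__(self, num_projections=32, layer_index=16):
        super().__init__()
        # Frozen pretrained feature extractor (torchvision >= 0.13 weights API).
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = nn.Sequential(*list(vgg.features[:layer_index])).eval()
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.num_projections = num_projections

    def _sliced_w1(self, fx, fy):
        # fx, fy: (B, C, H, W); treat each spatial position as a sample in R^C.
        b, c, h, w = fx.shape
        x = fx.flatten(2).transpose(1, 2)  # (B, HW, C)
        y = fy.flatten(2).transpose(1, 2)
        # Random unit directions defining the 1D projections.
        proj = torch.randn(c, self.num_projections, device=fx.device)
        proj = proj / proj.norm(dim=0, keepdim=True)
        px = torch.sort(x @ proj, dim=1).values  # sorted 1D projections
        py = torch.sort(y @ proj, dim=1).values
        # 1D Wasserstein-1 between empirical distributions = mean |sorted difference|.
        return (px - py).abs().mean()

    def forward(self, output, target):
        return self._sliced_w1(self.features(output), self.features(target))
```

In practice such a term would typically be aggregated over several feature layers and combined with a pixel-wise loss (e.g., L1) when training a denoising or super-resolution network.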
Related papers
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- Training and Predicting Visual Error for Real-Time Applications [6.687091041822445]
We explore the abilities of convolutional neural networks to predict a variety of visual metrics without requiring either reference or rendered images.
Our solution combines image-space information that is readily available in most state-of-the-art deferred shading pipelines with reprojection from previous frames to enable an adequate estimate of visual errors.
arXiv Detail & Related papers (2023-10-13T14:14:00Z)
- ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework can work with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-07-15T04:48:35Z)
- CbwLoss: Constrained Bidirectional Weighted Loss for Self-supervised Learning of Depth and Pose [13.581694284209885]
Photometric differences are used to train neural networks for estimating depth and camera pose from unlabeled monocular videos.
In this paper, we deal with moving objects and occlusions by using the differences between the flow fields and depth structures generated by affine transformation and view synthesis.
We mitigate the effect of textureless regions on model optimization by measuring differences between features with more semantic and contextual information without adding networks.
arXiv Detail & Related papers (2022-12-12T12:18:24Z)
- Deep Semantic Statistics Matching (D2SM) Denoising Network [70.01091467628068]
We introduce the Deep Semantic Statistics Matching (D2SM) Denoising Network.
It exploits semantic features of pretrained classification networks and implicitly matches the probabilistic distribution of clear images in the semantic feature space.
By learning to preserve the semantic distribution of denoised images, we empirically find our method significantly improves the denoising capabilities of networks.
arXiv Detail & Related papers (2022-07-19T14:35:42Z)
- Do Different Deep Metric Learning Losses Lead to Similar Learned Features? [4.043200001974071]
We compare 14 pretrained models from a recent study and find that, even though all models perform similarly, different loss functions can guide the model to learn different features.
Our analysis also shows that some seemingly irrelevant properties can have significant influence on the resulting embedding.
arXiv Detail & Related papers (2022-05-05T15:07:19Z)
- Enhancing Photorealism Enhancement [83.88433283714461]
We present an approach to enhancing the realism of synthetic images using a convolutional network.
We analyze scene layout distributions in commonly used datasets and find that they differ in important ways.
We report substantial gains in stability and realism in comparison to recent image-to-image translation methods.
arXiv Detail & Related papers (2021-05-10T19:00:49Z)
- Generic Perceptual Loss for Modeling Structured Output Dependencies [78.59700528239141]
We show that what matters is the network structure rather than the trained weights.
We demonstrate that a randomly-weighted deep CNN can be used to model the structured dependencies of outputs.
arXiv Detail & Related papers (2021-03-18T23:56:07Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Image Super-Resolution using Explicit Perceptual Loss [17.2448277365841]
We show how to exploit a machine-learning-based model that is directly trained to provide a perceptual score for generated images.
The experimental results show that the explicit approach achieves a higher perceptual score than other approaches.
arXiv Detail & Related papers (2020-09-01T12:22:39Z)
- A Loss Function for Generative Neural Networks Based on Watson's Perceptual Model [14.1081872409308]
Training Variational Autoencoders (VAEs) to generate realistic imagery requires a loss function that reflects human perception of image similarity.
We propose such a loss function based on Watson's perceptual model, which computes a weighted distance in frequency space and accounts for luminance and contrast masking.
In experiments, VAEs trained with the new loss function generated realistic, high-quality image samples.
arXiv Detail & Related papers (2020-06-26T15:36:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.