Improving Image Autoencoder Embeddings with Perceptual Loss
- URL: http://arxiv.org/abs/2001.03444v2
- Date: Fri, 3 Apr 2020 09:39:35 GMT
- Title: Improving Image Autoencoder Embeddings with Perceptual Loss
- Authors: Gustav Grund Pihlgren, Fredrik Sandin, Marcus Liwicki (Luleå University of Technology)
- Abstract summary: This work investigates perceptual loss from the perspective of encoder embeddings themselves.
Autoencoders are trained to embed images from three different computer vision datasets using perceptual loss.
Results show that, on the task of object positioning of a small-scale feature, perceptual loss can improve the results by a factor of 10.
- Score: 0.1529342790344802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoencoders are commonly trained using element-wise loss. However,
element-wise loss disregards high-level structures in the image, which can lead
to embeddings that disregard them as well. A recent improvement to autoencoders
that helps alleviate this problem is the use of perceptual loss. This work
investigates perceptual loss from the perspective of encoder embeddings
themselves. Autoencoders are trained to embed images from three different
computer vision datasets using perceptual loss based on a pretrained model as
well as pixel-wise loss. A host of different predictors are trained to perform
object positioning and classification on the datasets given the embedded images
as input. The two kinds of losses are evaluated by comparing how the predictors
performed with embeddings from the differently trained autoencoders. The
results show that, in the image domain, the embeddings generated by
autoencoders trained with perceptual loss enable more accurate predictions than
those trained with element-wise loss. Furthermore, the results show that, on
the task of object positioning of a small-scale feature, perceptual loss can
improve the results by a factor of 10. The experimental setup is available online:
https://github.com/guspih/Perceptual-Autoencoders
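To make the training objective concrete, here is a minimal PyTorch sketch of perceptual loss for an autoencoder: reconstructions are compared to targets in the feature space of a frozen pretrained network rather than pixel by pixel. The choice of VGG16 and the layer cutoff are illustrative assumptions, not necessarily the paper's exact configuration.

```python
import torch.nn as nn
from torchvision.models import vgg16

# A minimal sketch of perceptual loss: compare reconstruction and
# target in the feature space of a frozen pretrained network.
# VGG16 and the layer cutoff are assumptions for illustration.
class PerceptualLoss(nn.Module):
    def __init__(self, layer_idx=8):
        super().__init__()
        features = vgg16(pretrained=True).features[:layer_idx]
        for p in features.parameters():
            p.requires_grad = False  # the loss network stays frozen
        self.features = features.eval()

    def forward(self, reconstruction, target):
        # Element-wise loss, but applied to high-level features
        return nn.functional.mse_loss(
            self.features(reconstruction), self.features(target)
        )

# Hypothetical usage with an `autoencoder` and a batch of `images`:
#   loss = PerceptualLoss()(autoencoder(images), images)
#   loss.backward()
```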
Related papers
- Exploring Compressed Image Representation as a Perceptual Proxy: A Study [1.0878040851638]
We propose an end-to-end learned image compression wherein the analysis transform is jointly trained with an object classification task (a rough sketch of such joint training follows below).
This study affirms that the compressed latent representation can predict human perceptual distance judgments with an accuracy comparable to a custom-tailored DNN-based quality metric.
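As a rough illustration of what jointly training an analysis transform with a classification task can look like (the function names and loss weighting below are assumptions, not the study's actual setup):

```python
import torch.nn.functional as F

# A rough sketch of a joint objective: the analysis transform
# (encoder) feeds both a synthesis transform (decoder) and a
# classifier head, so the compressed latent is shaped by
# reconstruction quality and semantics at the same time.
def joint_loss(encoder, decoder, classifier, images, labels, weight=0.1):
    latent = encoder(images)                    # compressed representation
    recon_loss = F.mse_loss(decoder(latent), images)
    cls_loss = F.cross_entropy(classifier(latent), labels)
    return recon_loss + weight * cls_loss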
arXiv Detail & Related papers (2024-01-14T04:37:17Z)
- Human-imperceptible, Machine-recognizable Images [76.01951148048603]
This paper exposes a major conflict software engineers face between developing better AI systems and keeping their distance from sensitive training data.
It proposes an efficient privacy-preserving learning paradigm, where images are encrypted to become "human-imperceptible, machine-recognizable".
We show that the proposed paradigm can ensure the encrypted images have become human-imperceptible while preserving machine-recognizable information.
arXiv Detail & Related papers (2023-06-06T13:41:37Z)
- Unlocking Masked Autoencoders as Loss Function for Image and Video Restoration [19.561055022474786]
We study the potential of masked autoencoders as a loss function and raise our belief that a "learned loss function empowers the learning capability of neural networks for image and video restoration".
We investigate the efficacy of our belief from three perspectives: 1) from task-customized MAE to native MAE, 2) from image task to video task, and 3) from transformer structure to convolution neural network structure.
arXiv Detail & Related papers (2023-03-29T02:41:08Z)
- Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods [91.54785981649228]
This paper focuses on non-linear two-layer autoencoders trained in the challenging proportional regime.
Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods.
For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders.
arXiv Detail & Related papers (2022-12-27T12:37:34Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network, which can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation (a sketch of one such penalty follows below).
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
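One standard way to penalize such redundancy is to drive the off-diagonal entries of the bottleneck features' covariance toward zero; the sketch below uses that formulation as an illustrative assumption, not necessarily the paper's exact scheme.

```python
import torch

# Illustrative redundancy penalty (an assumption, not necessarily the
# paper's scheme): decorrelate bottleneck features by penalizing the
# off-diagonal entries of their covariance matrix over a batch.
def redundancy_penalty(z):
    z = z - z.mean(dim=0)                       # center each feature
    cov = (z.T @ z) / (z.shape[0] - 1)          # (dim, dim) covariance
    off_diag = cov - torch.diag(torch.diag(cov))
    return (off_diag ** 2).sum() / z.shape[1]

# total_loss = reconstruction_loss + alpha * redundancy_penalty(bottleneck)
```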
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Masked Autoencoders Are Scalable Vision Learners [60.97703494764904]
Masked autoencoders (MAE) are scalable self-supervised learners for computer vision.
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
Coupling these two designs (an asymmetric encoder-decoder and a high masking ratio) enables us to train large models efficiently and effectively; a minimal sketch of the masking step follows below.
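A minimal sketch of the masking step, with simplified tensor bookkeeping (the 75% mask ratio follows the paper; everything else is illustrative):

```python
import torch

# MAE-style random patch masking, sketched: the encoder only sees a
# small visible subset of patches; the decoder later reconstructs the
# pixels of the masked patches.
def random_masking(patches, mask_ratio=0.75):
    batch, num_patches, dim = patches.shape
    num_keep = int(num_patches * (1 - mask_ratio))
    ids_shuffle = torch.rand(batch, num_patches).argsort(dim=1)
    ids_keep = ids_shuffle[:, :num_keep]        # indices of visible patches
    visible = torch.gather(
        patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, dim)
    )
    return visible, ids_shuffle                 # encoder input + permutation
```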
arXiv Detail & Related papers (2021-11-11T18:46:40Z)
- EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning [27.54202989524394]
We propose EncoderMI, the first membership inference method against image encoders pre-trained by contrastive learning.
We evaluate EncoderMI on image encoders pre-trained on multiple datasets by ourselves as well as the Contrastive Language-Image Pre-training (CLIP) image encoder, which is pre-trained on 400 million (image, text) pairs collected from the Internet and released by OpenAI.
arXiv Detail & Related papers (2021-08-25T03:00:45Z)
- A Variational Auto-Encoder Approach for Image Transmission in Wireless Channel [4.82810058837951]
We investigate the performance of variational auto-encoders and compare the results with standard auto-encoders.
Our experiments demonstrate that training with the SSIM metric visually improves the quality of the reconstructed images at the receiver.
arXiv Detail & Related papers (2020-10-08T13:35:38Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
- Pretraining Image Encoders without Reconstruction via Feature Prediction Loss [0.1529342790344802]
This work investigates three methods for calculating loss for autoencoder-based pretraining of image encoders.
We propose to decode the features of the loss network, hence the name "feature prediction loss" (a rough sketch follows below).
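A rough sketch of the idea, with hypothetical names: instead of reconstructing pixels, the decoder is trained to match the frozen loss network's features of the input, so no pixel-space reconstruction is needed.

```python
import torch
import torch.nn.functional as F

# Feature prediction loss, sketched (illustrative names, not the
# authors' code): the decoder predicts the frozen loss network's
# features of the input instead of reconstructing pixels.
def feature_prediction_loss(encoder, feature_decoder, loss_network, images):
    with torch.no_grad():
        target_features = loss_network(images)  # fixed feature targets
    predicted_features = feature_decoder(encoder(images))
    return F.mse_loss(predicted_features, target_features)
```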
arXiv Detail & Related papers (2020-03-16T21:08:43Z)
- Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single-image deblurring is indeed feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.