Pretraining Image Encoders without Reconstruction via Feature Prediction Loss
- URL: http://arxiv.org/abs/2003.07441v2
- Date: Wed, 15 Jul 2020 15:54:22 GMT
- Title: Pretraining Image Encoders without Reconstruction via Feature Prediction Loss
- Authors: Gustav Grund Pihlgren (1), Fredrik Sandin (1), Marcus Liwicki (1) ((1) Luleå University of Technology)
- Abstract summary: This work investigates three methods for calculating loss for autoencoder-based pretraining of image encoders.
We propose to decode the features of the loss network, hence the name "feature prediction loss".
- Score: 0.1529342790344802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work investigates three methods for calculating loss for autoencoder-based pretraining of image encoders: the commonly used reconstruction loss, the more recently introduced deep perceptual similarity loss, and a feature prediction loss proposed here; the latter turns out to be the most efficient choice. Standard autoencoder pretraining for deep learning tasks is done by comparing the input image with the reconstructed image. Recent work shows that predictions based on embeddings generated by image autoencoders can be improved by training with perceptual loss, i.e., by adding a loss network after the decoding step. So far, autoencoders trained with loss networks have implemented an explicit comparison of the original and reconstructed images using the loss network. However, given such a loss network, we show that there is no need for the time-consuming task of decoding the entire image. Instead, we propose to decode the features of the loss network, hence the name "feature prediction loss". To evaluate this method we perform experiments on three standard publicly available datasets (LunarLander-v2, STL-10, and SVHN) and compare six different procedures for training image encoders (pixel-wise, perceptual similarity, and feature prediction losses, combined with two variations of image and feature encoding/decoding). The embedding-based prediction results show that encoders trained with feature prediction loss are as good as or better than those trained with the other two losses. Additionally, the encoder is significantly faster to train using feature prediction loss than with the other losses. The implementation used in this work is available online: https://github.com/guspih/Perceptual-Autoencoders
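A minimal PyTorch sketch of the three losses follows (an illustration under stated assumptions, not the authors' released code; see the repository above for the actual implementation). The module names encoder, image_decoder, feature_decoder, and loss_network are hypothetical placeholders, and the toy shapes in the usage example are arbitrary:

import torch
import torch.nn.functional as F

def reconstruction_loss(encoder, image_decoder, x):
    # Pixel-wise loss: compare the input image and its reconstruction directly.
    x_hat = image_decoder(encoder(x))
    return F.mse_loss(x_hat, x)

def perceptual_similarity_loss(encoder, image_decoder, loss_network, x):
    # Perceptual loss: decode a full image, then compare the loss network's
    # features of the original and of the reconstruction. The loss network is
    # typically a frozen pretrained classifier.
    x_hat = image_decoder(encoder(x))
    with torch.no_grad():
        target_feats = loss_network(x)
    return F.mse_loss(loss_network(x_hat), target_feats)

def feature_prediction_loss(encoder, feature_decoder, loss_network, x):
    # Feature prediction loss (proposed): skip image decoding entirely and
    # predict the loss network's features directly from the embedding.
    with torch.no_grad():
        target_feats = loss_network(x)
    return F.mse_loss(feature_decoder(encoder(x)), target_feats)

if __name__ == "__main__":
    import torch.nn as nn
    # Toy stand-in networks; in practice the loss network would be, e.g.,
    # a slice of a pretrained CNN, and the other modules convolutional.
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
    image_decoder = nn.Sequential(nn.Linear(64, 3 * 32 * 32), nn.Unflatten(1, (3, 32, 32)))
    loss_network = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
    feature_decoder = nn.Linear(64, 128)  # decodes loss-network features, not pixels
    x = torch.randn(8, 3, 32, 32)
    print(feature_prediction_loss(encoder, feature_decoder, loss_network, x))

The speed advantage reported in the abstract comes from the last variant: the decoder outputs a (typically much smaller) feature vector instead of a full image, and the loss network is only run on the input, not on a reconstruction.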
Related papers
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Unlocking Masked Autoencoders as Loss Function for Image and Video Restoration [19.561055022474786]
We study the potential of loss functions and raise our belief that "a learned loss function empowers the learning capability of neural networks for image and video restoration".
We investigate the efficacy of our belief from three perspectives: 1) from task-customized MAE to native MAE, 2) from image task to video task, and 3) from transformer structure to convolution neural network structure.
arXiv Detail & Related papers (2023-03-29T02:41:08Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z)
- EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning [27.54202989524394]
We propose EncoderMI, the first membership inference method against image encoders pre-trained by contrastive learning.
We evaluate EncoderMI on image encoders pre-trained on multiple datasets by ourselves, as well as on the Contrastive Language-Image Pre-training (CLIP) image encoder, which is pre-trained on 400 million (image, text) pairs collected from the Internet and released by OpenAI.
arXiv Detail & Related papers (2021-08-25T03:00:45Z)
- Generic Perceptual Loss for Modeling Structured Output Dependencies [78.59700528239141]
We show that what matters is the network structure rather than the trained weights.
We demonstrate that a randomly-weighted deep CNN can be used to model the structured dependencies of outputs.
arXiv Detail & Related papers (2021-03-18T23:56:07Z)
- Learning to Learn to Compress [25.23586503813838]
We present an end-to-end meta-learned system for image compression.
We propose a new training paradigm for learned image compression based on meta-learning.
arXiv Detail & Related papers (2020-07-31T13:13:53Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operations on digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
- Improving Image Autoencoder Embeddings with Perceptual Loss [0.1529342790344802]
This work investigates perceptual loss from the perspective of encoder embeddings themselves.
Autoencoders are trained to embed images from three different computer vision datasets using perceptual loss.
Results show that, on the task of positioning a small-scale feature, perceptual loss can improve the results by a factor of 10.
arXiv Detail & Related papers (2020-01-10T13:48:09Z)