Learning GAN-based Foveated Reconstruction to Recover Perceptually
Important Image Features
- URL: http://arxiv.org/abs/2108.03499v3
- Date: Mon, 17 Apr 2023 16:42:28 GMT
- Authors: Luca Surace (Università della Svizzera italiana), Marek Wernikowski
(West Pomeranian University of Technology), Cara Tursun (Università della
Svizzera italiana and University of Groningen), Karol Myszkowski (Max Planck
Institute for Informatics), Radosław Mantiuk (West Pomeranian University
of Technology), Piotr Didyk (Università della Svizzera italiana)
- Abstract summary: We consider the problem of efficiently guiding the training of foveated reconstruction techniques.
Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect.
Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A foveated image can be entirely reconstructed from a sparse set of samples
distributed according to the retinal sensitivity of the human visual system,
which rapidly decreases with increasing eccentricity. The use of Generative
Adversarial Networks has recently been shown to be a promising solution for
such a task, as they can successfully hallucinate missing image information. As
in the case of other supervised learning approaches, the definition of the loss
function and the training strategy heavily influence the quality of the output.
In this work, we consider the problem of efficiently guiding the training of
foveated reconstruction techniques such that they are more aware of the
capabilities and limitations of the human visual system, and thus can
reconstruct visually important image features. Our primary goal is to make the
training procedure less sensitive to distortions that humans cannot detect and
focus on penalizing perceptually important artifacts. Given the nature of
GAN-based solutions, we focus on the sensitivity of human vision to
hallucination in case of input samples with different densities. We propose
psychophysical experiments, a dataset, and a procedure for training foveated
image reconstruction. The proposed strategy renders the generator network
flexible by penalizing only perceptually important deviations in the output. As
a result, the method emphasizes the recovery of perceptually important image
features. We evaluated our strategy and compared it with alternative solutions
by using a newly trained objective metric, a recent foveated video quality
metric, and user experiments. Our evaluations revealed significant improvements
in the perceived image reconstruction quality compared with the standard
GAN-based training approach.
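
The sparse sampling that the abstract describes, with sample density falling off as eccentricity grows, can be illustrated with a minimal sketch. It is not the authors' sampling scheme: the function names, the exponential falloff, and the parameters `fovea_radius`, `falloff`, and `min_density` are all hypothetical choices made for illustration, and pixel distance from the gaze point stands in for visual eccentricity.

```python
import math
import random


def foveated_sample_mask(width, height, gaze_x, gaze_y,
                         fovea_radius=8.0, falloff=0.05,
                         min_density=0.02, seed=0):
    """Return a height x width grid of booleans; True marks a kept sample.

    Assumed density model (illustrative, not from the paper): every pixel
    within fovea_radius of the gaze point is kept, and beyond that the
    keep probability decays exponentially, never below min_density.
    """
    rng = random.Random(seed)  # seeded so the mask is reproducible
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pixel distance from the gaze point, used as an eccentricity proxy.
            ecc = math.hypot(x - gaze_x, y - gaze_y)
            density = max(min_density,
                          math.exp(-falloff * max(0.0, ecc - fovea_radius)))
            row.append(rng.random() < density)
        mask.append(row)
    return mask


def kept_fraction(mask, x0, x1, y0, y1):
    """Fraction of kept samples in the half-open window [x0, x1) x [y0, y1)."""
    cells = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(cells) / len(cells)


mask = foveated_sample_mask(64, 64, gaze_x=32, gaze_y=32)
center = kept_fraction(mask, 28, 36, 28, 36)  # window around the gaze point
corner = kept_fraction(mask, 0, 8, 0, 8)      # far-periphery window
print(center, corner)
```

A reconstruction network of the kind discussed in the abstract would receive only the pixels marked True and would have to hallucinate the rest; the sketch covers the sampling step only.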
Related papers
- Improving Neural Surface Reconstruction with Feature Priors from Multi-View Image [87.00660347447494]
Recent advancements in Neural Surface Reconstruction (NSR) have significantly improved multi-view reconstruction when coupled with volume rendering.
We propose an investigation into feature-level consistent loss, aiming to harness valuable feature priors from diverse pretext visual tasks.
Our results, analyzed on DTU and EPFL, reveal that feature priors from image matching and multi-view stereo datasets outperform other pretext tasks.
arXiv Detail & Related papers (2024-08-04T16:09:46Z)
- Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction [13.277067849874756]
We study how DIP recovers information from undersampled imaging measurements.
We introduce a self-driven reconstruction process that concurrently optimizes both the network weights and the input.
Our method incorporates a novel denoiser regularization term which enables robust and stable joint estimation of both the network input and reconstructed image.
arXiv Detail & Related papers (2024-02-06T15:52:23Z)
- Iterative-in-Iterative Super-Resolution Biomedical Imaging Using One Real Image [8.412910029745762]
We propose an approach to train the deep learning-based super-resolution models using only one real image.
We employ a mixed metric of image screening to automatically select images with a distribution similar to ground truth.
After five training iterations, the proposed deep learning-based super-resolution model improved structural similarity and peak signal-to-noise ratio by 7.5% and 5.49%, respectively.
arXiv Detail & Related papers (2023-06-26T07:57:03Z)
- Domain-Aware Few-Shot Learning for Optical Coherence Tomography Noise Reduction [0.0]
We propose a few-shot supervised learning framework for optical coherence tomography (OCT) noise reduction.
This framework offers a dramatic increase in training speed and requires only a single image, or part of an image, and a corresponding speckle suppressed ground truth.
Our results demonstrate significant potential for improving sample complexity, generalization, and time efficiency.
arXiv Detail & Related papers (2023-06-13T19:46:40Z)
- Generalizable Denoising of Microscopy Images using Generative Adversarial Networks and Contrastive Learning [0.0]
We propose a novel framework for few-shot microscopy image denoising.
Our approach combines a generative adversarial network (GAN) trained via contrastive learning (CL) with two structure preserving loss terms.
We demonstrate the effectiveness of our method on three well-known microscopy imaging datasets.
arXiv Detail & Related papers (2023-03-27T13:55:07Z)
- Real-World Image Super-Resolution by Exclusionary Dual-Learning [98.36096041099906]
Real-world image super-resolution is a practical image restoration problem that aims to obtain high-quality images from in-the-wild input.
Deep learning-based methods have achieved promising restoration quality on real-world image super-resolution datasets.
We propose Real-World image Super-Resolution by Exclusionary Dual-Learning (RWSR-EDL) to address the feature diversity in perceptual- and L1-based cooperative learning.
arXiv Detail & Related papers (2022-06-06T13:28:15Z)
- Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- Enhancing Perceptual Loss with Adversarial Feature Matching for Super-Resolution [5.258555266148511]
Single image super-resolution (SISR) is an ill-posed problem with an indeterminate number of valid solutions.
We show that the root cause of these pattern artifacts can be traced back to a mismatch between the pre-training objective of perceptual loss and the super-resolved objective.
arXiv Detail & Related papers (2020-05-15T12:36:54Z)
- Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN).
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.