Trustworthy SR: Resolving Ambiguity in Image Super-resolution via
Diffusion Models and Human Feedback
- URL: http://arxiv.org/abs/2402.07597v1
- Date: Mon, 12 Feb 2024 11:55:02 GMT
- Title: Trustworthy SR: Resolving Ambiguity in Image Super-resolution via
Diffusion Models and Human Feedback
- Authors: Cansu Korkmaz, Ege Cirakman, A. Murat Tekalp, Zafer Dogan
- Abstract summary: Super-resolution (SR) is an ill-posed inverse problem with a large set of feasible solutions that are consistent with a given low-resolution image.
We propose employing human feedback, where we ask human subjects to select a small number of likely samples and we ensemble the averages of selected samples.
Our proposed strategy provides more trustworthy solutions when compared to state-of-the-art SR methods.
- Score: 5.665865832321032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Super-resolution (SR) is an ill-posed inverse problem with a large set of
feasible solutions that are consistent with a given low-resolution image.
Various deterministic algorithms aim to find a single solution that balances
fidelity and perceptual quality; however, this trade-off often causes visual
artifacts that bring ambiguity in information-centric applications. On the
other hand, diffusion models (DMs) excel in generating a diverse set of
feasible SR images that span the solution space. The challenge is then how to
determine the most likely solution among this set in a trustworthy manner. We
observe that quantitative measures, such as PSNR, LPIPS, and DISTS, are not
reliable indicators for resolving ambiguous cases. To this end, we propose
employing human feedback, where we ask human subjects to select a small number
of likely samples and we ensemble the averages of selected samples. This
strategy leverages the high-quality image generation capabilities of DMs, while
recognizing the importance of obtaining a single trustworthy solution,
especially in use cases, such as identification of specific digits or letters,
where generating multiple feasible solutions may not lead to a reliable
outcome. Experimental results demonstrate that our proposed strategy provides
more trustworthy solutions when compared to state-of-the-art SR methods.
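The ensembling step described in the abstract can be illustrated with a minimal sketch. It assumes one possible reading of that step: the diffusion model has already produced a stack of candidate SR images, a human has picked a few of them, and the final image is the pixel-wise mean of the picks. The function name `ensemble_selected`, the array shapes, and the selected indices are hypothetical and are not taken from the paper.

```python
import numpy as np


def ensemble_selected(samples, selected_idx):
    """Return the pixel-wise mean of the human-selected SR samples.

    samples      : array-like of shape (num_samples, H, W) or (num_samples, H, W, C)
    selected_idx : indices of the samples a human judged likely (hypothetical input)
    """
    samples = np.asarray(samples, dtype=np.float64)
    selected_idx = list(selected_idx)
    if not selected_idx:
        raise ValueError("at least one sample must be selected")
    # Fancy indexing keeps only the chosen candidates; the mean over axis 0
    # collapses them into a single image.
    return samples[selected_idx].mean(axis=0)


# Toy stand-in for diffusion outputs: four constant 2x2 "images".
candidates = np.stack([np.full((2, 2), v) for v in (0.0, 0.2, 0.4, 1.0)])

# Assume the human kept the first three candidates and rejected the outlier.
fused = ensemble_selected(candidates, [0, 1, 2])
```

Averaging only the selected candidates keeps a rejected outlier (the last image here) from influencing the result, which is the point of putting a human in the loop before fusing.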
Related papers
- Learning from Multi-Perception Features for Real-Word Image
Super-resolution [87.71135803794519]
We propose a novel SR method called MPF-Net that leverages multiple perceptual features of input images.
Our method incorporates a Multi-Perception Feature Extraction (MPFE) module to extract diverse perceptual information.
We also introduce a contrastive regularization term (CR) that improves the model's learning capability.
arXiv Detail & Related papers (2023-05-26T07:35:49Z)
- Perception-Distortion Trade-off in the SR Space Spanned by Flow Models [21.597478894658263]
Flow-based generative super-resolution (SR) models learn to produce a diverse set of feasible SR solutions, called the SR space.
We present a simple but effective image ensembling/fusion approach to obtain a single SR image eliminating random artifacts and improving fidelity without significantly compromising perceptual quality.
arXiv Detail & Related papers (2022-09-18T13:12:21Z)
- Learning Resolution-Adaptive Representations for Cross-Resolution Person
Re-Identification [49.57112924976762]
The cross-resolution person re-identification problem aims to match low-resolution (LR) query identity images against high-resolution (HR) gallery images.
It is a challenging and practical problem since the query images often suffer from resolution degradation due to the different capturing conditions from real-world cameras.
This paper explores an alternative SR-free paradigm to directly compare HR and LR images via a dynamic metric, which is adaptive to the resolution of a query image.
arXiv Detail & Related papers (2022-07-09T03:49:51Z)
- An Analysis of Generative Methods for Multiple Image Inpainting [4.234843176066354]
Inpainting refers to the restoration of an image with missing regions in a way that is not detectable by the observer.
We focus on learning-based image completion methods for multiple and diverse inpainting.
arXiv Detail & Related papers (2022-05-04T15:54:08Z)
- Single Image Internal Distribution Measurement Using Non-Local
Variational Autoencoder [11.985083962982909]
This paper proposes a novel image-specific solution, namely non-local variational autoencoder (NLVAE).
NLVAE is introduced as a self-supervised strategy that reconstructs high-resolution images using disentangled information from the non-local neighbourhood.
Experimental results from seven benchmark datasets demonstrate the effectiveness of the NLVAE model.
arXiv Detail & Related papers (2022-04-02T18:43:55Z)
- HIME: Efficient Headshot Image Super-Resolution with Multiple Exemplars [11.81364643562714]
We propose an efficient Headshot Image Super-Resolution with Multiple Exemplars network (HIME) method.
Compared with previous methods, our network can effectively handle the misalignment between the input and the reference.
We also propose a correlation loss that provides a rich representation of the local texture in a controllable spatial range.
arXiv Detail & Related papers (2022-03-28T16:13:28Z)
- Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
- Invertible Image Rescaling [118.2653765756915]
We develop an Invertible Rescaling Net (IRN) to produce visually-pleasing low-resolution images.
We capture the distribution of the lost information using a latent variable following a specified distribution in the downscaling process.
arXiv Detail & Related papers (2020-05-12T09:55:53Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that achieves better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
- Cross-Resolution Adversarial Dual Network for Person Re-Identification
and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views.
Due to varying distances between cameras and persons of interest, resolution mismatch can be expected.
We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z)