Variational Bayesian Framework for Advanced Image Generation with
Domain-Related Variables
- URL: http://arxiv.org/abs/2305.13872v1
- Date: Tue, 23 May 2023 09:47:23 GMT
- Title: Variational Bayesian Framework for Advanced Image Generation with
Domain-Related Variables
- Authors: Yuxiao Li, Santiago Mazuelas, Yuan Shen
- Abstract summary: We present a unified Bayesian framework for advanced conditional generative problems.
We propose a variational Bayesian image translation network (VBITN) that enables multiple image translation and editing tasks.
- Score: 29.827191184889898
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative models (DGMs) and their conditional counterparts provide a
powerful means of general-purpose generative modeling of data distributions.
However, existing methods still struggle to address advanced conditional
generative problems without annotations, a capability that would enable
applications such as image-to-image translation and image editing. We present a
unified Bayesian framework for such problems, which introduces an inference
stage on latent variables within the learning process. In particular, we
propose a variational Bayesian image translation network (VBITN) that enables
multiple image translation and editing tasks. Comprehensive experiments show
the effectiveness of our method on unsupervised image-to-image translation and
demonstrate its novel capabilities for semantic editing and mixed-domain
translation.
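To make the variational idea concrete, here is a minimal VAE-style sketch in PyTorch with an explicit domain-related latent alongside the content latent. The class name `VBITNSketch`, the fully connected architecture, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VBITNSketch(nn.Module):
    """Minimal sketch: the encoder infers a content latent z and a
    domain-related latent d (concatenated here); the decoder
    reconstructs x from (z, d). Shapes are illustrative only."""
    def __init__(self, x_dim=784, z_dim=16, d_dim=4):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * (z_dim + d_dim))  # means + log-variances
        self.dec = nn.Linear(z_dim + d_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        latent = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = F.mse_loss(self.dec(latent), x, reduction="sum")
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()  # KL to N(0, I)
        return recon + kl  # negative ELBO
```

Training amounts to minimizing this negative ELBO over a dataset; the paper's inference stage on latent variables is more involved than this unconditional sketch.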
Related papers
- Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing [60.730661748555214]
We introduce Task-Oriented Diffusion Inversion (TODInv), a novel framework that inverts and edits real images tailored to specific editing tasks.
TODInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability.
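The reciprocal optimization itself is specific to TODInv, but the inversion half it builds on can be illustrated generically: fit a latent so that a frozen generator reproduces the real image, then edit by manipulating that latent. Everything below (the `generator` callable and all hyperparameters) is a hypothetical sketch, not the paper's method.

```python
import torch
import torch.nn.functional as F

def invert(generator, x_real, z_dim=512, steps=200, lr=0.05):
    """Gradient-based inversion sketch: optimize a latent z so that a
    frozen generator reconstructs x_real; edits then perturb z."""
    z = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(generator(z), x_real)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```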
arXiv Detail & Related papers (2024-08-23T22:16:34Z)
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle consistency objectives.
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
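To see what StegoGAN guards against, recall the CycleGAN-style cycle-consistency objective sketched below: generators can drive it to zero by steganographically hiding unmatched content in their outputs instead of translating faithfully. `G_ab` and `G_ba` are hypothetical generator callables.

```python
import torch.nn.functional as F

def cycle_loss(G_ab, G_ba, x_a, x_b):
    """Standard cycle-consistency terms. A low value does not imply a
    faithful translation: G_ab can hide whatever G_ba needs to
    reconstruct x_a inside imperceptible patterns of G_ab(x_a)."""
    loss_a = F.l1_loss(G_ba(G_ab(x_a)), x_a)  # A -> B -> A
    loss_b = F.l1_loss(G_ab(G_ba(x_b)), x_b)  # B -> A -> B
    return loss_a + loss_b
```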
arXiv Detail & Related papers (2024-03-29T12:23:58Z)
- Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance.
Our model's adaptability allows it to be implemented with both image-fusion and latent-diffusion models.
Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
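As a generic picture of gradient guidance in diffusion sampling (not the paper's exact asymmetric scheme), each reverse step can be nudged by the gradient of a guidance loss. The `denoise_step` and `guidance_loss` callables below are hypothetical placeholders.

```python
import torch

def guided_reverse_step(x_t, t, denoise_step, guidance_loss, scale=1.0):
    """One guided reverse-diffusion update (sketch): denoise_step(x_t, t)
    is the unguided sampler update, guidance_loss scores how well x_t
    matches the translation target."""
    x_t = x_t.detach().requires_grad_(True)
    grad = torch.autograd.grad(guidance_loss(x_t), x_t)[0]
    return (denoise_step(x_t, t) - scale * grad).detach()
```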
arXiv Detail & Related papers (2023-06-07T12:56:56Z)
- Vector Quantized Image-to-Image Translation [31.65282783830092]
We propose introducing the vector quantization technique into the image-to-image translation framework.
Our framework achieves comparable performance to the state-of-the-art image-to-image translation and image extension methods.
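For reference, the core vector-quantization step is small: snap each continuous feature to its nearest codebook entry and pass gradients through with a straight-through estimator, as in VQ-VAE. The shapes below are arbitrary assumptions.

```python
import torch

def vector_quantize(z, codebook):
    """Quantize z (N, D) to the nearest rows of codebook (K, D); the
    straight-through trick lets gradients flow past the discrete lookup."""
    idx = torch.cdist(z, codebook).argmin(dim=1)  # nearest code per vector
    z_q = codebook[idx]
    return z + (z_q - z).detach(), idx
```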
arXiv Detail & Related papers (2022-07-27T04:22:29Z)
- End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain.
We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
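A hedged sketch of the general self-supervised recipe (not the paper's exact pipeline): augment a region of an ordinary image to simulate a "driver", and train the editor to restore the original, so no annotated edits are needed. The `augment` callable and the square-region choice are illustrative assumptions.

```python
import torch

def make_training_pair(image, augment, size=64):
    """Simulate an edit on an off-the-shelf image (C, H, W): corrupt a
    random square region with `augment`; the untouched image is the
    reconstruction target for the editor."""
    _, h, w = image.shape
    top = torch.randint(0, h - size, (1,)).item()
    left = torch.randint(0, w - size, (1,)).item()
    source = image.clone()
    patch = source[:, top:top + size, left:left + size]
    source[:, top:top + size, left:left + size] = augment(patch)
    return source, image  # (edited input, target) pair
```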
arXiv Detail & Related papers (2022-05-03T17:59:30Z)
- Unsupervised Image-to-Image Translation with Generative Prior [103.54337984566877]
Unsupervised image-to-image translation aims to learn the translation between two visual domains without paired data.
We present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
arXiv Detail & Related papers (2022-04-07T17:59:23Z)
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884]
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit decomposed content features and appearance-adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
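The "appearance adaptive convolution" idea can be illustrated minimally: predict convolution kernels from an appearance code and apply them to the content features, so the same content takes on different appearances. The shapes and the linear weight predictor below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv(nn.Module):
    """Sketch: conv kernels are predicted per sample from an appearance
    code (batch size 1 assumed, since conv2d shares weights per batch)."""
    def __init__(self, channels=64, app_dim=128, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        self.to_weight = nn.Linear(app_dim, channels * channels * k * k)

    def forward(self, content, appearance):
        # content: (1, C, H, W); appearance: (1, app_dim)
        w = self.to_weight(appearance).view(
            self.channels, self.channels, self.k, self.k)
        return F.conv2d(content, w, padding=self.k // 2)
```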
arXiv Detail & Related papers (2022-02-06T14:12:34Z)
- Learning by Planning: Language-Guided Global Image Editing [53.72807421111136]
We develop a text-to-operation model to map the vague editing language request into a series of editing operations.
The only supervision in the task is the target image, which is insufficient for a stable training of sequential decisions.
We propose a novel operation planning algorithm to generate possible editing sequences from the target image as pseudo ground truth.
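One naive way to realize such a planner, sketched under the assumption that the editing operations are cheap callables: enumerate short operation sequences and keep the one whose result best matches the target, using it as the pseudo ground truth. The paper's actual algorithm is certainly more refined than this brute-force version.

```python
import itertools
import torch.nn.functional as F

def plan_operations(source, target, ops, max_len=2):
    """Brute-force planning sketch: try every op sequence up to max_len
    and return the one whose output is closest to the target image."""
    best, best_err = [], float("inf")
    for n in range(1, max_len + 1):
        for seq in itertools.product(ops, repeat=n):
            out = source
            for op in seq:
                out = op(out)
            err = F.mse_loss(out, target).item()
            if err < best_err:
                best, best_err = list(seq), err
    return best  # pseudo ground-truth editing sequence
```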
arXiv Detail & Related papers (2021-06-24T16:30:03Z)
- Network-to-Network Translation with Conditional Invertible Neural Networks [19.398202091883366]
Recent work suggests that the power of massive machine learning models is captured by the representations they learn.
We seek a model that can relate different existing representations, and propose to solve this task with a conditionally invertible network.
Our domain transfer network can translate between fixed representations without having to learn or finetune them.
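Invertibility in such models typically comes from coupling layers; the minimal (unconditional) RealNVP-style sketch below shows the mechanism, while the actual model additionally conditions the transformation on the target representation. All dimensions are assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Sketch of an affine coupling layer: half of the features
    parameterize an exactly invertible affine map of the other half
    (dim must be even)."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Linear(dim // 2, dim)  # produces log-scale and shift

    def forward(self, x):
        a, b = x.chunk(2, dim=-1)
        log_s, t = self.net(a).chunk(2, dim=-1)
        return torch.cat([a, b * log_s.exp() + t], dim=-1)

    def inverse(self, y):
        a, z = y.chunk(2, dim=-1)
        log_s, t = self.net(a).chunk(2, dim=-1)
        return torch.cat([a, (z - t) * (-log_s).exp()], dim=-1)
```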
arXiv Detail & Related papers (2020-05-27T18:14:22Z)