Fine-grained Semantic Constraint in Image Synthesis
- URL: http://arxiv.org/abs/2101.04558v1
- Date: Tue, 12 Jan 2021 15:51:49 GMT
- Title: Fine-grained Semantic Constraint in Image Synthesis
- Authors: Pengyang Li and Donghui Wang
- Abstract summary: We propose a multi-stage and high-resolution model for image synthesis that uses fine-grained attributes and masks as input.
With mask as prior, the model in this paper is constrained so that the generated images conform to visual senses.
This paper also proposes a scheme to improve the discriminator of the generative adversarial network by simultaneously discriminating the total image and sub-regions of the image.
- Score: 8.22379888383833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a multi-stage and high-resolution model for image
synthesis that uses fine-grained attributes and masks as input. With a
fine-grained attribute, the proposed model can detailedly constrain the
features of the generated image through rich and fine-grained semantic
information in the attribute. With mask as prior, the model in this paper is
constrained so that the generated images conform to visual senses, which will
reduce the unexpected diversity of samples generated from the generative
adversarial network. This paper also proposes a scheme to improve the
discriminator of the generative adversarial network by simultaneously
discriminating the total image and sub-regions of the image. In addition, we
propose a method for optimizing the labeled attribute in datasets, which
reduces the manual labeling noise. Extensive quantitative results show that our
image synthesis model generates more realistic images.
Related papers
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z) - Diverse and Tailored Image Generation for Zero-shot Multi-label Classification [3.354528906571718]
zero-shot multi-label classification has garnered considerable attention for its capacity to operate predictions on unseen labels without human annotations.
prevailing approaches often use seen classes as imperfect proxies for unseen ones, resulting in suboptimal performance.
We propose an innovative solution: generating synthetic data to construct a training set explicitly tailored for proxyless training on unseen labels.
arXiv Detail & Related papers (2024-04-04T01:34:36Z) - Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z) - Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale
Fine-Grained Image Retrieval [65.43522019468976]
We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder structure network of a reconstruction task to unsupervisedly distill high-level attribute-specific vectors.
Our models are equipped with a feature decorrelation constraint upon these attribute vectors to strengthen their representative abilities.
arXiv Detail & Related papers (2023-11-21T08:20:38Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the emphde facto Generative Adversarial Nets (GANs)
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - Hierarchical Text-Conditional Image Generation with CLIP Latents [20.476720970770128]
We show that explicitly generating image representations improves image diversity with minimal loss in photorealism and caption similarity.
Our decoders conditioned on image representations can also produce variations of an image that preserve both its semantics and style.
arXiv Detail & Related papers (2022-04-13T01:10:33Z) - Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z) - Multi-Image Summarization: Textual Summary from a Set of Cohesive Images [17.688344968462275]
This paper proposes the new task of multi-image summarization.
It aims to generate a concise and descriptive textual summary given a coherent set of input images.
A dense average image feature aggregation network allows the model to focus on a coherent subset of attributes.
arXiv Detail & Related papers (2020-06-15T18:45:35Z) - Example-Guided Image Synthesis across Arbitrary Scenes using Masked
Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z) - Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.