Image Shape Manipulation from a Single Augmented Training Sample
- URL: http://arxiv.org/abs/2109.06151v1
- Date: Mon, 13 Sep 2021 17:44:04 GMT
- Title: Image Shape Manipulation from a Single Augmented Training Sample
- Authors: Yael Vinker, Eliahu Horwitz, Nir Zabari, Yedid Hoshen
- Abstract summary: DeepSIM is a generative model for conditional image manipulation based on a single image.
Our network learns to map from a primitive representation of the image to the image itself.
- Score: 26.342929563689218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present DeepSIM, a generative model for conditional image
manipulation based on a single image. We find that extensive augmentation is
key for enabling single image training, and incorporate the use of
thin-plate-spline (TPS) as an effective augmentation. Our network learns to map
from a primitive representation of the image to the image itself. The choice
of a primitive representation has an impact on the ease and expressiveness of
the manipulations and can be automatic (e.g. edges), manual (e.g. segmentation)
or hybrid such as edges on top of segmentations. At manipulation time, our
generator allows for making complex image changes by modifying the primitive
input representation and mapping it through the network. Our method is shown to
achieve remarkable performance on image manipulation tasks.
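To make the training scheme concrete, below is a minimal sketch of single-image training with non-rigid augmentation in the spirit of DeepSIM. It is not the authors' implementation: the function names, the reconstruction-only loss, and the smooth random warp built with grid_sample (standing in for a true thin-plate-spline warp) are all assumptions made for illustration.

```python
# Sketch: train a conditional generator on one image pair (primitive, image),
# applying the same random non-rigid warp to both at every step.
# A coarse random flow field upsampled with F.interpolate approximates the
# effect of TPS augmentation described in the paper (assumption, not the
# authors' exact augmentation).
import torch
import torch.nn.functional as F

def random_smooth_warp(primitive, image, control=4, strength=0.05):
    """Apply one shared smooth random warp to the primitive and the image."""
    n, _, h, w = image.shape
    device = image.device
    # Coarse random offsets at a few control points, upsampled to a smooth flow.
    offsets = strength * torch.randn(n, 2, control, control, device=device)
    flow = F.interpolate(offsets, size=(h, w), mode="bilinear", align_corners=True)
    # Identity sampling grid in [-1, 1], perturbed by the smooth flow.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=device),
        torch.linspace(-1, 1, w, device=device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    grid = grid + flow.permute(0, 2, 3, 1)
    warp = lambda t: F.grid_sample(t, grid, align_corners=True)
    return warp(primitive), warp(image)

def train_step(generator, optimizer, primitive, image):
    """One optimization step: map the augmented primitive back to the
    identically augmented image. `generator` is any image-to-image network;
    only an L1 reconstruction term is shown here, whereas the paper also
    uses adversarial and perceptual losses."""
    aug_primitive, aug_image = random_smooth_warp(primitive, image)
    prediction = generator(aug_primitive)
    loss = F.l1_loss(prediction, aug_image)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At manipulation time the trained generator is simply applied to an edited primitive (e.g. an edge or segmentation map with objects moved or reshaped), producing the correspondingly edited image.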
Related papers
- CIMGEN: Controlled Image Manipulation by Finetuning Pretrained
Generative Models on Limited Data [14.469539513542584]
A semantic map has information of objects present in the image.
One can easily modify the map to selectively insert, remove, or replace objects in the map.
The method proposed in this paper takes in the modified semantic map and alters the original image in accordance with the modified map.
arXiv Detail & Related papers (2024-01-23T06:30:47Z) - Zero-shot spatial layout conditioning for text-to-image diffusion models [52.24744018240424]
Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling.
We consider image generation from text associated with segments on the image canvas, which combines an intuitive natural language interface with precise spatial control over the generated content.
We propose ZestGuide, a zero-shot segmentation guidance approach that can be plugged into pre-trained text-to-image diffusion models.
arXiv Detail & Related papers (2023-06-23T19:24:48Z) - Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator such that it achieves a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
arXiv Detail & Related papers (2023-02-22T14:47:57Z) - DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit, dedicated to covering the variation in the optimal number of tokens each position should focus on.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
arXiv Detail & Related papers (2022-07-13T11:12:03Z) - MaskGIT: Masked Generative Image Transformer [49.074967597485475]
MaskGIT learns to predict randomly masked tokens by attending to tokens in all directions.
Experiments demonstrate that MaskGIT significantly outperforms the state-of-the-art transformer model on the ImageNet dataset.
arXiv Detail & Related papers (2022-02-08T23:54:06Z) - Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z) - S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation [11.724779328025589]
This paper proposes a sketch-to-image generation framework called S2FGAN.
We employ two latent spaces to control the face appearance and adjust the desired attributes of the generated face.
Our method successfully outperforms state-of-the-art methods on attribute manipulation by exploiting greater control of attribute intensity.
arXiv Detail & Related papers (2020-11-30T13:42:39Z) - Image Shape Manipulation from a Single Augmented Training Sample [24.373900721120286]
DeepSIM is a generative model for conditional image manipulation based on a single image.
Our network learns to map from a primitive representation of the image to the image itself.
arXiv Detail & Related papers (2020-07-02T17:55:27Z) - Training End-to-end Single Image Generators without GANs [27.393821783237186]
AugurOne is a novel approach for training single image generative models.
Our approach trains an upscaling neural network using non-affine augmentations of the (single) input image.
A compact latent space is jointly learned allowing for controlled image synthesis.
arXiv Detail & Related papers (2020-04-07T17:58:03Z) - Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.