Translate the Facial Regions You Like Using Region-Wise Normalization
- URL: http://arxiv.org/abs/2007.14615v1
- Date: Wed, 29 Jul 2020 05:55:49 GMT
- Title: Translate the Facial Regions You Like Using Region-Wise Normalization
- Authors: Wenshuang Liu, Wenting Chen, Linlin Shen
- Abstract summary: We propose a region-wise normalization framework for region-level face translation.
Both the shape and texture of different regions can thus be translated to various target styles.
Our approach further offers precise control over the regions to be translated.
- Score: 27.288255234645472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though GAN (Generative Adversarial Network) based techniques have
greatly advanced the performance of image synthesis and face translation, only
a few works in the literature provide region-based style encoding and
translation. In this paper we propose a region-wise normalization framework
for region-level face translation. While per-region styles are encoded using
an available approach, we build a so-called RIN (region-wise normalization)
block to individually inject the styles into per-region feature maps and then
fuse them for the following convolution and upsampling. Both the shape and
texture of different regions can thus be translated to various target styles.
A region matching loss has also been proposed to significantly reduce the
interference between regions during the translation process. Extensive
experiments on three publicly available datasets, i.e. Morph, RaFD and
CelebAMask-HQ, suggest that our approach demonstrates a large improvement over
state-of-the-art methods such as StarGAN, SEAN and FUNIT. Our approach has the
further advantage of precise control over the regions to be translated; as a
result, region-level expression changes and step-by-step makeup can be
achieved. The video demo is available at
https://youtu.be/ceRqsbzXAfk.
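The abstract describes the RIN block only at a high level, so the following PyTorch snippet is a minimal illustrative sketch rather than the authors' implementation: it assumes SPADE/SEAN-style affine modulation, hypothetical per-region (gamma, beta) predictors, and segmentation masks that partition the image. The class name RegionWiseNorm and all layer choices are assumptions.

```python
import torch
import torch.nn as nn


class RegionWiseNorm(nn.Module):
    """Normalize a feature map once, modulate it with each region's style
    code, and fuse the modulated copies back together via region masks."""

    def __init__(self, num_regions: int, feat_channels: int, style_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_channels, affine=False)
        # One small affine predictor per region, so each region's style is
        # injected individually (a hypothetical choice; a shared predictor
        # conditioned on the region would also fit the abstract).
        self.to_gamma = nn.ModuleList(
            [nn.Linear(style_dim, feat_channels) for _ in range(num_regions)])
        self.to_beta = nn.ModuleList(
            [nn.Linear(style_dim, feat_channels) for _ in range(num_regions)])

    def forward(self, feat, styles, masks):
        # feat:   (B, C, H, W) decoder feature map
        # styles: (B, R, style_dim), one style code per facial region
        # masks:  (B, R, H, W) region masks, assumed to sum to 1 over R
        normed = self.norm(feat)
        fused = torch.zeros_like(feat)
        for r in range(len(self.to_gamma)):
            gamma = self.to_gamma[r](styles[:, r])[:, :, None, None]
            beta = self.to_beta[r](styles[:, r])[:, :, None, None]
            modulated = normed * (1 + gamma) + beta        # style injection
            fused = fused + modulated * masks[:, r:r + 1]  # keep region r only
        return fused  # passed on to the following convolution and upsampling
```

For example, with the 19 CelebAMask-HQ face-parsing classes one might instantiate `RegionWiseNorm(num_regions=19, feat_channels=256, style_dim=128)`; the channel and style dimensions here are placeholders. The region matching loss mentioned in the abstract is not specified in enough detail to reproduce, so it is omitted from the sketch.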
Related papers
- RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection [20.630629383286262]
Open-vocabulary object detection requires solid modeling of the region-semantic relationship.
We propose RTGen to generate scalable open-vocabulary region-text pairs.
arXiv Detail & Related papers (2024-05-30T09:03:23Z)
- Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning [39.85737063875394]
This study develops a novel end-to-end CMFD framework that integrates the strengths of conventional and deep learning methods.
Unlike existing deep models, our approach utilizes features extracted from high-resolution scales to seek explicit and reliable point-to-point matching.
By leveraging the strong prior of point-to-point matches, the framework can identify subtle differences and effectively discriminate between source and target regions.
arXiv Detail & Related papers (2024-04-26T10:38:17Z)
- CLIM: Contrastive Language-Image Mosaic for Region Representation [58.05870131126816]
Contrastive Language-Image Mosaic (CLIM) is a novel approach for aligning region and text representations.
CLIM consistently improves different open-vocabulary object detection methods.
It can effectively enhance the region representation of vision-language models.
arXiv Detail & Related papers (2023-12-18T17:39:47Z)
- SARA: Controllable Makeup Transfer with Spatial Alignment and Region-Adaptive Normalization [67.90315365909244]
We propose a novel Spatial Alignment and Region-Adaptive normalization method (SARA) in this paper.
Our method generates detailed makeup transfer results while handling large spatial misalignments, and it achieves part-specific, shade-controllable makeup transfer.
Experimental results show that our SARA method outperforms existing methods and achieves state-of-the-art performance on two public datasets.
arXiv Detail & Related papers (2023-11-28T14:46:51Z)
- Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement [59.66539728681453]
Scene text image super-resolution (STISR) aims to improve image quality while boosting downstream scene text recognition accuracy.
Most existing methods treat the foreground (character regions) and background (non-character regions) equally in the forward process.
We propose a novel method LEMMA that explicitly models character regions to produce high-level text-specific guidance for super-resolution.
arXiv Detail & Related papers (2023-07-19T05:08:47Z)
- Region-Aware Diffusion for Zero-shot Text-driven Image Editing [78.58917623854079]
We propose a novel region-aware diffusion model (RDM) for entity-level image editing.
To strike a balance between image fidelity and inference speed, we design the intensive diffusion pipeline.
The results show that RDM outperforms the previous approaches in terms of visual quality, overall harmonization, non-editing region content preservation, and text-image semantic consistency.
arXiv Detail & Related papers (2023-02-23T06:20:29Z)
- Semantic Segmentation by Early Region Proxy [53.594035639400616]
We present a novel and efficient modeling approach that starts from interpreting the image as a tessellation of learnable regions.
To model region-wise context, we exploit a Transformer to encode regions in a sequence-to-sequence manner.
Semantic segmentation is now carried out as per-region prediction on top of the encoded region embeddings.
arXiv Detail & Related papers (2022-03-26T10:48:32Z)
- RegionCLIP: Region-based Language-Image Pretraining [94.29924084715316]
Contrastive language-image pretraining (CLIP) using image-text pairs has achieved impressive results on image classification.
We propose a new method called RegionCLIP that significantly extends CLIP to learn region-level visual representations.
Our method significantly outperforms the state of the art by 3.8 AP50 and 2.2 AP for novel categories on the COCO and LVIS datasets.
arXiv Detail & Related papers (2021-12-16T18:39:36Z)
- Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans [14.098628848491147]
We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup.
Given a segmentation mask defining the layout of the semantic regions in the texture map, our network generates high-resolution textures with a variety of styles that are then used for rendering.
arXiv Detail & Related papers (2021-03-31T17:58:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all of its information) and is not responsible for any consequences arising from its use.