RoI Tanh-polar Transformer Network for Face Parsing in the Wild
- URL: http://arxiv.org/abs/2102.02717v1
- Date: Thu, 4 Feb 2021 16:25:26 GMT
- Title: RoI Tanh-polar Transformer Network for Face Parsing in the Wild
- Authors: Yiming Lin, Jie Shen, Yujiang Wang, Maja Pantic
- Abstract summary: Face parsing aims to predict pixel-wise labels for facial components of a target face in an image.
Existing approaches usually crop the target face from the input image with respect to a bounding box calculated during pre-processing.
We propose RoI Tanh-polar transform that warps the whole image to a Tanh-polar representation with a fixed ratio between the face area and the context.
We also propose a hybrid residual representation learning block, coined HybridBlock, that contains convolutional layers in both the Tanh-polar space and the Tanh-Cartesian space.
- Score: 50.8865921538953
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face parsing aims to predict pixel-wise labels for facial components of a
target face in an image. Existing approaches usually crop the target face from
the input image with respect to a bounding box calculated during
pre-processing, and thus can only parse inner facial Regions of Interest
(RoIs). Peripheral regions like hair are ignored and nearby faces that are
partially included in the bounding box can cause distractions. Moreover, these
methods are only trained and evaluated on near-frontal portrait images and thus
their performance for in-the-wild cases was unexplored. To address these
issues, this paper makes three contributions. First, we introduce iBugMask
dataset for face parsing in the wild containing 1,000 manually annotated images
with large variations in sizes, poses, expressions and background, and
Helen-LP, a large-pose training set containing 21,866 images generated using
head pose augmentation. Second, we propose RoI Tanh-polar transform that warps
the whole image to a Tanh-polar representation with a fixed ratio between the
face area and the context, guided by the target bounding box. The new
representation contains all information in the original image, and allows for
rotation equivariance in the convolutional neural networks (CNNs). Third, we
propose a hybrid residual representation learning block, coined HybridBlock,
that contains convolutional layers in both the Tanh-polar space and the
Tanh-Cartesian space, allowing for receptive fields of different shapes in
CNNs. Through extensive experiments, we show that the proposed method
significantly improves the state-of-the-art for face parsing in the wild.
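The RoI Tanh-polar transform described above can be sketched as an inverse warp: each output pixel at angle theta and normalized radius t is sampled from the Cartesian point at distance rho = radius * artanh(t) from the face center, so the face disc occupies a fixed fraction of the output width regardless of face size and the whole image remains represented. This is a hypothetical re-implementation, not the paper's code; the function name, parameters, and nearest-neighbour sampling are assumptions for illustration.

```python
import numpy as np

def roi_tanh_polar_warp(image, center, radius, out_h=64, out_w=64):
    """Sketch of a Tanh-polar inverse warp (hypothetical re-implementation;
    names and sampling scheme are assumptions, not the paper's API).

    Rows index the angle theta in [0, 2*pi); columns index a tanh-radial
    coordinate t in [0, 1). Inverting t = tanh(rho / radius) gives
    rho = radius * artanh(t), so the face disc (rho <= radius) always maps
    to the same fraction (tanh(1) ~ 0.76) of the output width.
    """
    h, w = image.shape[:2]
    thetas = np.linspace(0.0, 2.0 * np.pi, out_h, endpoint=False)
    ts = np.linspace(0.0, 1.0, out_w, endpoint=False)
    tt, th = np.meshgrid(ts, thetas)                    # both (out_h, out_w)
    rho = radius * np.arctanh(np.clip(tt, 0.0, 1.0 - 1e-6))
    xs = center[0] + rho * np.cos(th)                   # Cartesian sample points
    ys = center[1] + rho * np.sin(th)
    xi = np.clip(np.round(xs).astype(int), 0, w - 1)    # nearest-neighbour
    yi = np.clip(np.round(ys).astype(int), 0, h - 1)    # sampling, clamped
    return image[yi, xi]
```

Because an in-plane rotation of the face only shifts rows of this representation, ordinary convolutions on it behave equivariantly with respect to rotation, which is the property the abstract highlights.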
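The HybridBlock idea of convolving in both spaces can likewise be sketched as a residual unit with two branches: one convolves directly in the Tanh-polar map, the other re-projects to the Tanh-Cartesian space, convolves there, and maps back, giving the two branches differently shaped receptive fields in the original image. The sketch below is a heavily simplified, single-channel illustration (no batch norm, identity shortcut, warps passed in as callables); it is not the paper's architecture.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive single-channel 'same' cross-correlation with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def hybrid_block(x_polar, to_cartesian, to_polar, k_polar, k_cart):
    """HybridBlock-style residual unit (hypothetical simplification).

    One branch convolves in the Tanh-polar space; the other re-projects to
    the Tanh-Cartesian space, convolves there, and warps back, so the two
    branches see differently shaped receptive fields in the source image.
    """
    branch_polar = conv2d_same(x_polar, k_polar)
    branch_cart = to_polar(conv2d_same(to_cartesian(x_polar), k_cart))
    # residual connection with a ReLU on the fused branches
    return x_polar + np.maximum(branch_polar + branch_cart, 0.0)
```

In the real network the warps between the two spaces would be differentiable resampling operations and each branch would carry many channels; the point of the sketch is only the two-space residual structure.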
Related papers
- Occlusion-Aware Deep Convolutional Neural Network via Homogeneous Tanh-transforms for Face Parsing [2.062767930320204]
Face parsing infers a pixel-wise label map for each semantic facial component.
We propose a novel homogeneous tanh-transform for image preprocessing, which is made up of four tanh-transforms.
Based on homogeneous tanh-transforms, we propose an occlusion-aware convolutional neural network for occluded face parsing.
arXiv Detail & Related papers (2023-08-29T14:20:13Z)
- SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations from three perspectives.
We exploited a symmetric encoder-decoder network to attend facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z)
- StyO: Stylize Your Face in Only One-Shot [8.253458555695767]
This paper focuses on face stylization with a single artistic target.
Existing works for this task often fail to retain the source content while achieving geometry variation.
We present a novel StyO model, i.e., Stylize the face in only One-shot, to solve this problem.
arXiv Detail & Related papers (2023-03-06T15:48:33Z)
- Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN [88.62422914645066]
We present an algorithm for re-rendering a person from a single image under arbitrary poses.
Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image.
We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
arXiv Detail & Related papers (2021-09-13T17:59:33Z)
- FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting [77.78305705925376]
Blind face inpainting refers to the task of reconstructing visual contents without explicitly indicating the corrupted regions in a face image.
We propose a novel two-stage blind face inpainting method named Frequency-guided Transformer and Top-Down Refinement Network (FT-TDR) to tackle these challenges.
arXiv Detail & Related papers (2021-08-10T03:12:01Z)
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
- Facial Manipulation Detection Based on the Color Distribution Analysis in Edge Region [0.5735035463793008]
We present a generalized and robust facial manipulation detection method based on color distribution analysis of edge regions in a manipulated image.
Our extensive experiments show that our method outperforms existing face manipulation detection methods at detecting synthesized face images across various datasets, regardless of whether a dataset was seen during training.
arXiv Detail & Related papers (2021-02-02T08:19:35Z) - CapsField: Light Field-based Face and Expression Recognition in the Wild
using Capsule Routing [81.21490913108835]
This paper proposes a new deep face and expression recognition solution, called CapsField, based on a convolutional neural network.
The proposed solution achieves superior performance for both face and expression recognition tasks when compared to the state-of-the-art.
arXiv Detail & Related papers (2021-01-10T09:06:02Z) - Edge-aware Graph Representation Learning and Reasoning for Face Parsing [61.5045850197694]
Face parsing infers a pixel-wise label for each facial component, a task which has drawn much attention recently.
Previous methods have shown their efficiency in face parsing but overlook the correlations among different face regions.
We propose to model and reason the region-wise relations by learning graph representations.
arXiv Detail & Related papers (2020-07-22T07:46:34Z)
- Domain Embedded Multi-model Generative Adversarial Networks for Image-based Face Inpainting [44.598234654270584]
We present a domain embedded multi-model generative adversarial model for inpainting of face images with large cropped regions.
Experiments on both the CelebA and CelebA-HQ face datasets demonstrate that our proposed approach achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-02-05T17:36:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.