Parameter Efficient Local Implicit Image Function Network for Face Segmentation
- URL: http://arxiv.org/abs/2303.15122v1
- Date: Mon, 27 Mar 2023 11:50:27 GMT
- Title: Parameter Efficient Local Implicit Image Function Network for Face
Segmentation
- Authors: Mausoom Sarkar, Nikitha SR, Mayur Hemani, Rishabh Jain, Balaji
Krishnamurthy
- Abstract summary: Face parsing is defined as the per-pixel labeling of images containing human faces.
We make use of the structural consistency of the human face to propose a lightweight face-parsing method.
- Score: 13.124513975412254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Face parsing is defined as the per-pixel labeling of images containing human
faces. The labels are defined to identify key facial regions like eyes, lips,
nose, hair, etc. In this work, we make use of the structural consistency of the
human face to propose a lightweight face-parsing method using a Local Implicit
Function network, FP-LIIF. We propose a simple architecture consisting of a
convolutional encoder and a pixel MLP decoder that uses 1/26th the number of
parameters of state-of-the-art models, yet matches or outperforms those models
on multiple datasets, like CelebAMask-HQ and LaPa. We use no pretraining, and
unlike other works, our network can also generate segmentation at different
resolutions without any change to the input resolution. This work enables the use of facial
segmentation on low-compute or low-bandwidth devices because of its higher FPS
and smaller model size.
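The abstract's architecture can be illustrated with a minimal sketch: a convolutional encoder produces a low-resolution feature map, and a shared per-pixel MLP decodes class logits at continuous coordinates, so the output resolution is decoupled from the input resolution. Everything below is a hypothetical illustration, not the authors' FP-LIIF implementation: the pooling encoder, random untrained weights, nearest-neighbor feature lookup, and the class count are all stand-ins.

```python
# Hypothetical sketch of a LIIF-style parser: conv-like encoder + pixel MLP
# decoder queried on a coordinate grid of arbitrary size (untrained weights).
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 11   # illustrative label count (eyes, lips, nose, hair, ...)
FEAT_DIM = 16

def encode(image: np.ndarray) -> np.ndarray:
    """Stand-in for a conv encoder: average-pool 4x4 patches, then project."""
    h, w, c = image.shape
    fh, fw = h // 4, w // 4
    pooled = image[: fh * 4, : fw * 4].reshape(fh, 4, fw, 4, c).mean(axis=(1, 3))
    proj = rng.standard_normal((c, FEAT_DIM))
    return pooled @ proj                                  # (fh, fw, FEAT_DIM)

def mlp_decode(feat_and_coord: np.ndarray) -> np.ndarray:
    """Stand-in pixel MLP: one hidden ReLU layer, per-class logits out."""
    w1 = rng.standard_normal((feat_and_coord.shape[-1], 32))
    w2 = rng.standard_normal((32, NUM_CLASSES))
    return np.maximum(feat_and_coord @ w1, 0.0) @ w2

def parse_at_resolution(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Query the decoder at any output size without changing the input."""
    feats = encode(image)
    fh, fw, _ = feats.shape
    ys = np.linspace(0, fh - 1, out_h).round().astype(int)  # nearest feature
    xs = np.linspace(0, fw - 1, out_w).round().astype(int)
    grid_feats = feats[ys][:, xs]                     # (out_h, out_w, FEAT_DIM)
    # Append normalized continuous coordinates, as in LIIF-style decoders.
    yy, xx = np.meshgrid(np.linspace(-1, 1, out_h), np.linspace(-1, 1, out_w),
                         indexing="ij")
    inp = np.concatenate([grid_feats, yy[..., None], xx[..., None]], axis=-1)
    return mlp_decode(inp).argmax(axis=-1)            # (out_h, out_w) label map

img = rng.random((64, 64, 3))
seg_small = parse_at_resolution(img, 32, 32)
seg_large = parse_at_resolution(img, 128, 128)  # same input, higher-res output
```

Because the decoder is queried per coordinate rather than per input pixel, the same encoded features can be rendered into label maps of any size, which is the property the abstract highlights.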
Related papers
- Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z)
- KeyPoint Relative Position Encoding for Face Recognition [15.65725865703615]
Keypoint RPE (KP-RPE) is an extension of the principle where significance of pixels is not solely dictated by their proximity.
Code and pre-trained models are available.
arXiv Detail & Related papers (2024-03-21T21:56:09Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [63.54342601757723]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown significant performance in natural language processing.
In this paper, we design a novel attention mechanism whose complexity is linear in the resolution, derived via Taylor expansion, and based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Adaptive Local Implicit Image Function for Arbitrary-scale Super-resolution [61.95533972380704]
Local implicit image function (LIIF) represents an image as a continuous function in which pixel values are predicted using the corresponding coordinates as inputs.
LIIF can be adopted for arbitrary-scale image super-resolution tasks, resulting in a single effective and efficient model for various up-scaling factors.
We propose a novel adaptive local implicit image function (A-LIIF) to alleviate this problem.
arXiv Detail & Related papers (2022-08-07T11:23:23Z)
- Evidential fully convolutional network for semantic segmentation [6.230751621285322]
We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation.
Experiments show that the proposed combination improves the accuracy and calibration of semantic segmentation by assigning confusing pixels to multi-class sets.
arXiv Detail & Related papers (2021-03-25T01:21:22Z)
- Learning Spatial Attention for Face Super-Resolution [28.60619685892613]
General image super-resolution techniques have difficulty recovering detailed face structures when applied to low-resolution face images.
Recent deep learning methods tailored for face images have achieved improved performance through joint training with additional tasks such as face parsing and landmark prediction.
We introduce a novel SPatial Attention Residual Network (SPARNet) built on our newly proposed Face Attention Units (FAUs) for face super-resolution.
arXiv Detail & Related papers (2020-12-02T13:54:25Z)
- Progressive Semantic-Aware Style Transformation for Blind Face Restoration [26.66332852514812]
We propose a new progressive semantic-aware style transformation framework, named PSFR-GAN, for face restoration.
The proposed PSFR-GAN makes full use of the semantic (parsing maps) and pixel (LQ images) space information from different scales of input pairs.
Experiment results show that our model trained with synthetic data not only produces more realistic high-resolution results for synthetic LQ inputs but also generalizes better to natural LQ face images.
arXiv Detail & Related papers (2020-09-18T09:27:33Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
- DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition [94.96686189033869]
We propose a 3D model-assisted domain-transferred face augmentation network (DotFAN).
DotFAN can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains.
Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity.
arXiv Detail & Related papers (2020-02-23T08:16:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.