Learning Pixel-Adaptive Weights for Portrait Photo Retouching
- URL: http://arxiv.org/abs/2112.03536v1
- Date: Tue, 7 Dec 2021 07:23:42 GMT
- Title: Learning Pixel-Adaptive Weights for Portrait Photo Retouching
- Authors: Binglu Wang, Chengzhe Lu, Dawei Yan, Yongqiang Zhao
- Abstract summary: Portrait photo retouching is a retouching task that emphasizes human-region priority and group-level consistency.
In this paper, we explicitly model local context cues to improve retouching quality.
Experiments on the PPR10K dataset verify the effectiveness of our method.
- Score: 1.9843222704723809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Portrait photo retouching is a retouching task that emphasizes
human-region priority and group-level consistency. The lookup table-based
method achieves promising retouching performance by learning image-adaptive
weights to combine 3-dimensional lookup tables (3D LUTs) and conducting
pixel-to-pixel color transformation. However, this paradigm ignores local
context cues and applies the same transformation to portrait pixels and
background pixels when they exhibit the same raw RGB values. In contrast, an
expert usually performs different operations to adjust the color temperatures
and tones of portrait regions and background regions. This inspires us to
explicitly model local context cues to improve retouching quality. Firstly, we
consider an image patch and predict pixel-adaptive lookup table weights to
precisely retouch the center pixel. Secondly, as neighboring pixels exhibit
different affinities to the center pixel, we estimate a local attention mask to
modulate the influence of neighboring pixels. Thirdly, the quality of the local
attention mask can be further improved by applying supervision based on the
affinity map calculated from the groundtruth portrait mask. As for group-level
consistency, we propose to directly constrain the variance of the mean color
components in the Lab space. Extensive experiments on the PPR10K dataset
verify the effectiveness of our method: on high-resolution photos, for example,
PSNR improves by over 0.5 dB while the group-level consistency metric
decreases by at least 2.1.
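Below is a minimal sketch in PyTorch of the pixel-adaptive 3D LUT fusion idea described in the abstract. It is not the authors' implementation: the tensor shapes, the LUT indexing convention, and the softmax normalization of the predicted weights are illustrative assumptions. The only change relative to an image-adaptive baseline is that a (hypothetical) network head predicts one weight vector per pixel rather than one per image; the LUT sampling itself is unchanged.

```python
# A minimal sketch (not the authors' code) of fusing N learnable 3D LUTs
# with pixel-adaptive weights. Shapes and the LUT indexing convention
# below are illustrative assumptions.
import torch
import torch.nn.functional as F

def apply_luts(img: torch.Tensor, luts: torch.Tensor) -> torch.Tensor:
    """Trilinearly sample every LUT at each pixel's RGB value.

    img:  (B, 3, H, W), raw RGB in [0, 1]
    luts: (N, 3, D, D, D), assumed indexed as [out_channel, b, g, r]
          so that grid_sample's (x, y, z) axes map to (r, g, b)
    returns: (B, N, 3, H, W), one retouched image per LUT
    """
    B, _, H, W = img.shape
    N, _, D, _, _ = luts.shape
    # grid_sample expects sampling coordinates in [-1, 1]
    grid = img.permute(0, 2, 3, 1) * 2 - 1                 # (B, H, W, 3)
    grid = grid[:, None].expand(B, N, H, W, 3).reshape(B * N, 1, H, W, 3)
    luts = luts[None].expand(B, N, 3, D, D, D).reshape(B * N, 3, D, D, D)
    # mode="bilinear" on 5D inputs performs trilinear interpolation
    out = F.grid_sample(luts, grid, mode="bilinear", align_corners=True)
    return out.reshape(B, N, 3, H, W)                      # drop depth dim

def pixel_adaptive_retouch(img, luts, pixel_weights):
    """Fuse the per-LUT outputs with pixel-adaptive weights.

    pixel_weights: (B, N, H, W) logits predicted per pixel; an
    image-adaptive baseline would use a single (B, N) vector instead.
    """
    per_lut = apply_luts(img, luts)                        # (B, N, 3, H, W)
    w = pixel_weights.softmax(dim=1).unsqueeze(2)          # normalize over N LUTs
    return (per_lut * w).sum(dim=1)                        # (B, 3, H, W)
```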
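The local attention mask and its affinity supervision can be sketched in the same spirit. Assuming the attention head predicts one logit per neighbor in a k x k window (the window size, the binary form of the affinity target, and the BCE loss are assumptions, not the paper's exact formulation), the supervision derived from the groundtruth portrait mask might look like:

```python
# A minimal sketch (assumptions flagged inline) of supervising a local
# attention mask with an affinity map derived from the groundtruth
# portrait mask; the paper's exact target and loss may differ.
import torch
import torch.nn.functional as F

def affinity_targets(portrait_mask: torch.Tensor, k: int = 7) -> torch.Tensor:
    """Binary affinity between each pixel and its k x k neighborhood.

    portrait_mask: (B, 1, H, W) float, 1 = portrait, 0 = background
    returns: (B, k*k, H, W); 1 where a neighbor shares the center label
    """
    B, _, H, W = portrait_mask.shape
    # zero padding treats out-of-image neighbors as background
    neigh = F.unfold(portrait_mask, kernel_size=k, padding=k // 2)  # (B, k*k, H*W)
    neigh = neigh.view(B, k * k, H, W)
    return (neigh == portrait_mask).float()   # broadcast center over k*k

def attention_mask_loss(attn_logits, portrait_mask, k: int = 7):
    """BCE between per-neighbor attention logits and the affinity targets.

    attn_logits: (B, k*k, H, W) from a hypothetical attention head.
    """
    target = affinity_targets(portrait_mask, k)
    return F.binary_cross_entropy_with_logits(attn_logits, target)
```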
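Finally, the group-level consistency constraint penalizes how much the mean Lab color components vary across retouched photos of the same group. A hedged sketch follows, delegating the RGB-to-Lab conversion to kornia; the paper's exact normalization and loss weighting are not reproduced here. In training, this term would be added with some weight to the per-image reconstruction loss over each sampled group.

```python
# A hedged sketch of the group-level consistency idea: penalize the
# variance of the mean Lab components across retouched photos of one
# group. RGB->Lab is delegated to kornia.
import torch
from kornia.color import rgb_to_lab

def group_consistency_loss(retouched: torch.Tensor) -> torch.Tensor:
    """retouched: (G, 3, H, W) -- all photos of one group, RGB in [0, 1]."""
    lab = rgb_to_lab(retouched)               # (G, 3, H, W) in Lab space
    means = lab.mean(dim=(2, 3))              # (G, 3): mean L, a, b per photo
    # variance of each mean component over the group, summed across L, a, b
    return means.var(dim=0, unbiased=False).sum()
```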
Related papers
- Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization [4.8454936010479335]
We propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization.
Specifically, we first pre-train the backbone network with the supervised contrastive loss.
Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer.
arXiv Detail & Related papers (2024-06-19T13:51:52Z)
- KeyPoint Relative Position Encoding for Face Recognition [15.65725865703615]
KeyPoint RPE (KP-RPE) extends relative position encoding so that the significance of pixels is not dictated solely by their proximity.
Code and pre-trained models are available.
arXiv Detail & Related papers (2024-03-21T21:56:09Z)
- Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching [58.10418136917358]
Cross-modality registration between 2D images from cameras and 3D point clouds from LiDARs is a crucial task in computer vision and robotics.
Previous methods estimate 2D-3D correspondences by matching point and pixel patterns learned by neural networks.
We learn a structured cross-modality matching solver to represent 3D features via a different latent pixel space.
arXiv Detail & Related papers (2023-12-07T05:46:10Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- Quantity-Aware Coarse-to-Fine Correspondence for Image-to-Point Cloud Registration [4.954184310509112]
Image-to-point cloud registration aims to determine the relative camera pose between an RGB image and a reference point cloud.
Matching individual points with pixels can be inherently ambiguous due to modality gaps.
We propose a framework to capture quantity-aware correspondences between local point sets and pixel patches.
arXiv Detail & Related papers (2023-07-14T03:55:54Z)
- Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information [7.561849435043042]
Self-supervised representation learning based on Contrastive Learning (CL) has been the subject of much attention in recent years.
In this paper, we focus on depth information, which can be estimated by a depth network or measured from available data.
We show that using this depth information in the contrastive loss leads to improved results and that the learned representations better follow the shapes of objects.
arXiv Detail & Related papers (2022-11-18T11:45:39Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- RoI Tanh-polar Transformer Network for Face Parsing in the Wild [50.8865921538953]
Face parsing aims to predict pixel-wise labels for facial components of a target face in an image.
Existing approaches usually crop the target face from the input image with respect to a bounding box calculated during pre-processing.
We propose RoI Tanh-polar transform that warps the whole image to a Tanh-polar representation with a fixed ratio between the face area and the context.
We also propose a hybrid residual representation learning block, coined HybridBlock, that contains convolutional layers in both the Tanh-polar space and the Tanh-Cartesian space.
arXiv Detail & Related papers (2021-02-04T16:25:26Z)
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild [96.09941542587865]
We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild.
In this way, we precisely align 3D models to objects in RGB images which results in significantly improved 3D pose estimates.
We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
arXiv Detail & Related papers (2020-07-17T12:34:38Z)
- JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image [28.753759115780515]
State-of-the-art single depth image-based 3D hand pose estimation methods are based on dense predictions.
A novel pixel-wise prediction-based method is proposed to address the above issues.
The proposed model is implemented with an efficient 2D fully convolutional network backbone and has only about 1.4M parameters.
arXiv Detail & Related papers (2020-07-09T08:57:19Z)
- Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics [60.92229707497999]
We introduce a novel principle for self-supervised feature learning based on the discrimination of specific transformations of an image.
We demonstrate experimentally that learning to discriminate transformations such as LCI, image warping, and rotations yields features with state-of-the-art generalization capabilities.
arXiv Detail & Related papers (2020-04-05T22:09:08Z)