Cross-modal Deep Face Normals with Deactivable Skip Connections
- URL: http://arxiv.org/abs/2003.09691v2
- Date: Mon, 30 Mar 2020 13:54:14 GMT
- Title: Cross-modal Deep Face Normals with Deactivable Skip Connections
- Authors: Victoria Fernandez Abrevaya, Adnane Boukhayma, Philip H. S. Torr,
Edmond Boyer
- Abstract summary: We present an approach for estimating surface normals from in-the-wild color images of faces.
We propose a method that can leverage all available image and normal data, whether paired or not.
We show that our approach can achieve significant improvements, both quantitative and qualitative, with natural face images.
- Score: 77.83961745760216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an approach for estimating surface normals from in-the-wild color
images of faces. While data-driven strategies have been proposed for single
face images, limited available ground truth data makes this problem difficult.
To alleviate this issue, we propose a method that can leverage all available
image and normal data, whether paired or not, thanks to a novel cross-modal
learning architecture. In particular, we enable additional training with single
modality data, either color or normal, by using two encoder-decoder networks
with a shared latent space. The proposed architecture also enables face details
to be transferred between the image and normal domains, given paired data,
through skip connections between the image encoder and normal decoder. Core to
our approach is a novel module that we call deactivable skip connections, which
allows integrating both the auto-encoded and image-to-normal branches within
the same architecture that can be trained end-to-end. This allows learning of a
rich latent space that can accurately capture the normal information. We
compare against state-of-the-art methods and show that our approach can achieve
significant improvements, both quantitative and qualitative, with natural face
images.
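The core mechanism described in the abstract, a skip connection between the image encoder and the normal decoder that can be switched off, can be sketched as follows. This is a minimal NumPy toy, not the authors' implementation: all layer shapes, weight initializations, and function names are illustrative assumptions. The point is only that a single decoder with a gated skip branch can serve both the auto-encoding path (skip deactivated, for unpaired single-modality data) and the image-to-normal path (skip active, transferring image details), so the whole model trains end-to-end with shared weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    """Toy encoder: returns an intermediate skip feature and the
    shared latent code."""
    skip = np.tanh(x @ w["enc1"])       # skip-level feature
    latent = np.tanh(skip @ w["enc2"])  # shared latent code
    return skip, latent

def decoder(latent, skip, w, use_skip):
    """Toy normal decoder with a deactivable skip connection:
    when use_skip is False the skip branch is zeroed, so the same
    decoder weights serve both the auto-encoding branch (no image
    skips available) and the image-to-normal branch (skips active)."""
    h = np.tanh(latent @ w["dec1"])
    if use_skip:
        h = h + skip @ w["skip"]        # inject image details
    return np.tanh(h @ w["dec2"])

# Illustrative dimensions and random weights.
dims = dict(x=8, skip=16, z=4, h=16, out=8)
w = {
    "enc1": rng.normal(size=(dims["x"], dims["skip"])) * 0.1,
    "enc2": rng.normal(size=(dims["skip"], dims["z"])) * 0.1,
    "dec1": rng.normal(size=(dims["z"], dims["h"])) * 0.1,
    "skip": rng.normal(size=(dims["skip"], dims["h"])) * 0.1,
    "dec2": rng.normal(size=(dims["h"], dims["out"])) * 0.1,
}

image = rng.normal(size=(1, dims["x"]))
skip, z = encoder(image, w)

# Paired image-to-normal branch: skip connection active.
normals_paired = decoder(z, skip, w, use_skip=True)
# Unpaired / auto-encoding branch: skip deactivated, same weights.
normals_unpaired = decoder(z, skip, w, use_skip=False)

print(normals_paired.shape, normals_unpaired.shape)
```

In the paper's setting the two branches would be trained jointly, with the skip gate set according to whether the current batch is paired or single-modality; the sketch above only shows the shared-weight, gated-skip forward pass.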
Related papers
- PHNet: Patch-based Normalization for Portrait Harmonization [41.94295877935867]
A common problem for composite images is the incompatibility of their foreground and background components.
We present a patch-based harmonization network consisting of novel Patch-based normalization blocks and a feature extractor.
Our network achieves state-of-the-art results on the iHarmony4 dataset.
arXiv Detail & Related papers (2024-02-27T14:59:48Z)
- Content-aware Warping for View Synthesis [110.54435867693203]
We propose content-aware warping, which adaptively learns the weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network.
Based on this learnable warping module, we propose a new end-to-end learning-based framework for novel view synthesis from two source views.
Experimental results on structured light field datasets with wide baselines and unstructured multi-view datasets show that the proposed method significantly outperforms state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-01-22T11:35:05Z)
- Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization.
The proposed method reduces more than 90% parameters compared with previous methods.
Our method works smoothly on higher-resolution images in real time, more than 10$\times$ faster than existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z)
- Collaboration among Image and Object Level Features for Image Colourisation [25.60139324272782]
Image colourisation is an ill-posed problem with multiple correct solutions that depend on the context and the object instances present in the input image.
Previous approaches tackled the problem either by requiring intensive user interaction or by exploiting the ability of convolutional neural networks (CNNs) to learn image-level (context) features.
We propose a single network, named UCapsNet, that separates image-level features obtained through convolutions from object-level features captured by means of capsules.
Then, through skip connections across different layers, we enforce collaboration between these disentangled factors to produce high-quality, plausible image colourisations.
arXiv Detail & Related papers (2021-01-19T11:48:12Z)
- Deep Image Compositing [93.75358242750752]
We propose a new method which can automatically generate high-quality image composites without any user input.
Inspired by Laplacian pyramid blending, a dense-connected multi-stream fusion network is proposed to effectively fuse the information from the foreground and background images.
Experiments show that the proposed method can automatically generate high-quality composites and outperforms existing methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-11-04T06:12:24Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z)
- Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to capture powerful representations in such complex situations.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
arXiv Detail & Related papers (2020-10-29T14:46:05Z)
- Realistic Image Normalization for Multi-Domain Segmentation [7.856339385917824]
This paper revisits the conventional image normalization approach by instead learning a common normalizing function across multiple datasets.
Jointly normalizing multiple datasets is shown to yield consistent normalized images as well as improved image segmentation.
Our method can also enhance data availability by increasing the number of samples available when learning from multiple imaging domains.
arXiv Detail & Related papers (2020-09-29T13:57:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.