Neural Photometry-guided Visual Attribute Transfer
- URL: http://arxiv.org/abs/2112.02520v1
- Date: Sun, 5 Dec 2021 09:22:28 GMT
- Title: Neural Photometry-guided Visual Attribute Transfer
- Authors: Carlos Rodriguez-Pardo and Elena Garces
- Abstract summary: We present a deep learning-based method for propagating visual material attributes to larger samples of the same or similar materials.
For training, we leverage images of the material taken under multiple illuminations and a dedicated data augmentation policy.
Our model relies on a supervised image-to-image translation framework and is agnostic to the transferred domain.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a deep learning-based method for propagating spatially-varying
visual material attributes (e.g. texture maps or image stylizations) to larger
samples of the same or similar materials. For training, we leverage images of
the material taken under multiple illuminations and a dedicated data
augmentation policy, making the transfer robust to novel illumination
conditions and affine deformations. Our model relies on a supervised
image-to-image translation framework and is agnostic to the transferred domain;
we showcase a semantic segmentation, a normal map, and a stylization. Following
an image analogies approach, the method only requires the training data to
contain the same visual structures as the input guidance. Our approach works at
interactive rates, making it suitable for material edit applications. We
thoroughly evaluate our learning methodology in a controlled setup providing
quantitative measures of performance. Last, we demonstrate that training the
model on a single material is enough to generalize to materials of the same
type without the need for massive datasets.
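The abstract's core training trick is pairing multi-illumination captures with an augmentation policy that applies the same affine deformation to the input photo and its attribute map, keeping supervision aligned. The following is a minimal NumPy sketch of that pairing step; the rotation/scale ranges and the nearest-neighbour warp are assumptions, not details from the paper.

```python
import numpy as np

def random_affine(rng, max_rot=np.pi / 8, max_scale=0.2):
    # Random rotation + isotropic scale; the exact ranges are assumptions,
    # standing in for the paper's dedicated augmentation policy.
    theta = rng.uniform(-max_rot, max_rot)
    s = 1.0 + rng.uniform(-max_scale, max_scale)
    c, si = np.cos(theta) * s, np.sin(theta) * s
    return np.array([[c, -si], [si, c]])

def warp(img, A):
    # Nearest-neighbour affine warp about the image centre (edge-clamped).
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    inv = np.linalg.inv(A)
    sy = inv[0, 0] * (ys - cy) + inv[0, 1] * (xs - cx) + cy
    sx = inv[1, 0] * (ys - cy) + inv[1, 1] * (xs - cx) + cx
    sy = np.clip(np.rint(sy), 0, h - 1).astype(int)
    sx = np.clip(np.rint(sx), 0, w - 1).astype(int)
    return img[sy, sx]

def augment_pair(photos, attribute_map, rng):
    # Pick one of the multi-illumination captures at random, then apply
    # the SAME affine to photo and attribute map so supervision stays aligned.
    photo = photos[rng.integers(len(photos))]
    A = random_affine(rng)
    return warp(photo, A), warp(attribute_map, A)
```

Sampling a fresh illumination and deformation per training pair is what exposes the image-to-image translation network to appearance variation without breaking the pixel-wise correspondence it is trained on.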
Related papers
- Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by 1.5 dB in PSNR and by a 45% better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z)
- Dense Text-to-Image Generation with Attention Modulation [49.287458275920514]
Existing text-to-image diffusion models struggle to synthesize realistic images given dense captions.
We propose DenseDiffusion, a training-free method that adapts a pre-trained text-to-image model to handle such dense captions.
We achieve visual results of similar quality to models specifically trained with layout conditions.
arXiv Detail & Related papers (2023-08-24T17:59:01Z)
- Materialistic: Selecting Similar Materials in Images [30.85562156542794]
We present a method capable of selecting the regions of a photograph exhibiting the same material as an artist-chosen area.
Our proposed approach is robust to shading, specular highlights, and cast shadows, enabling selection in real images.
We demonstrate our model on a set of applications, including material editing, in-video selection, and retrieval of object photographs with similar materials.
arXiv Detail & Related papers (2023-05-22T17:50:48Z)
- Few-shot Semantic Image Synthesis with Class Affinity Transfer [23.471210664024067]
We propose a transfer method that leverages a model trained on a large source dataset to improve the learning ability on small target datasets.
The class affinity matrix is introduced as a first layer to the source model to make it compatible with the target label maps.
We apply our approach to GAN-based and diffusion-based architectures for semantic synthesis.
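The class affinity matrix described above has a simple reading: a linear "first layer" that maps one-hot target label maps into the source model's label space. A minimal sketch, with hypothetical label-space sizes (5 target classes, 12 source classes) not taken from the paper:

```python
import numpy as np

# Hypothetical label-space sizes (not from the paper): 5 target classes
# mapped onto 12 source classes.
rng = np.random.default_rng(0)
affinity = rng.random((5, 12))
affinity /= affinity.sum(axis=1, keepdims=True)  # each row: a distribution over source classes

def to_source_labels(onehot_target, affinity):
    # A 1x1 linear "first layer": (H, W, n_tgt) one-hot target maps
    # become soft (H, W, n_src) maps the frozen source model can consume.
    return onehot_target @ affinity

onehot = np.eye(5)[rng.integers(5, size=(4, 4))]  # a toy 4x4 target label map
soft = to_source_labels(onehot, affinity)         # shape (4, 4, 12)
```

Because each affinity row is a distribution, a one-hot pixel maps to a valid soft assignment over source classes, which is what lets the pre-trained source model be reused on new label maps.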
arXiv Detail & Related papers (2023-04-05T09:24:45Z)
- Neural Congealing: Aligning Images to a Joint Semantic Atlas [14.348512536556413]
We present a zero-shot self-supervised framework for aligning semantically-common content across a set of images.
Our approach harnesses the power of pre-trained DINO-ViT features.
We show that our method performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
arXiv Detail & Related papers (2023-02-08T09:26:22Z)
- Photo-to-Shape Material Transfer for Diverse Structures [15.816608726698986]
We introduce a method for automatically assigning photorealistic relightable materials to 3D shapes.
Our method combines an image translation neural network with a material assignment neural network.
We demonstrate that our method allows us to assign materials to shapes so that their appearances better resemble the input exemplars.
arXiv Detail & Related papers (2022-05-09T03:37:01Z)
- Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis [68.1281982092765]
We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel level fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that RESAIL performs favorably against state-of-the-art methods in terms of quantitative metrics, visual quality, and subjective evaluation.
arXiv Detail & Related papers (2022-04-06T14:21:39Z)
- Multimodal Contrastive Training for Visual Representation Learning [45.94662252627284]
We develop an approach to learning visual representations that embraces multimodal data.
Our method exploits intrinsic data properties within each modality and semantic information from cross-modal correlation simultaneously.
By including multimodal training in a unified framework, our method can learn more powerful and generic visual features.
arXiv Detail & Related papers (2021-04-26T19:23:36Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
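Composing a hypercolumn from a subset of CNN layers amounts to upsampling each selected layer's feature map to a common resolution and concatenating along channels. A minimal sketch with toy shapes; the fixed selection mask is a placeholder, since the paper learns the layer selection per image pair:

```python
import numpy as np

def compose_hypercolumn(layer_feats, selected):
    # Upsample each selected layer's features to the finest resolution by
    # nearest-neighbour repetition, then stack them channel-wise.
    # The fixed boolean mask stands in for the learned per-image selection.
    h, w = layer_feats[0].shape[:2]
    cols = []
    for feat, keep in zip(layer_feats, selected):
        if not keep:
            continue
        ry, rx = h // feat.shape[0], w // feat.shape[1]
        cols.append(np.repeat(np.repeat(feat, ry, axis=0), rx, axis=1))
    return np.concatenate(cols, axis=-1)

rng = np.random.default_rng(0)
feats = [rng.random((8, 8, 4)), rng.random((4, 4, 8)), rng.random((2, 2, 16))]
hypercolumn = compose_hypercolumn(feats, [True, False, True])  # shape (8, 8, 20)
```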
arXiv Detail & Related papers (2020-07-21T04:03:22Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.