Materialistic: Selecting Similar Materials in Images
- URL: http://arxiv.org/abs/2305.13291v1
- Date: Mon, 22 May 2023 17:50:48 GMT
- Title: Materialistic: Selecting Similar Materials in Images
- Authors: Prafull Sharma, Julien Philip, Michaël Gharbi, William T. Freeman,
Fredo Durand, Valentin Deschaintre
- Abstract summary: We present a method capable of selecting the regions of a photograph exhibiting the same material as an artist-chosen area.
Our proposed approach is robust to shading, specular highlights, and cast shadows, enabling selection in real images.
We demonstrate our model on a set of applications, including material editing, in-video selection, and retrieval of object photographs with similar materials.
- Score: 30.85562156542794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Separating an image into meaningful underlying components is a crucial first
step for both editing and understanding images. We present a method capable of
selecting the regions of a photograph exhibiting the same material as an
artist-chosen area. Our proposed approach is robust to shading, specular
highlights, and cast shadows, enabling selection in real images. As we do not
rely on semantic segmentation (different woods or metal should not be selected
together), we formulate the problem as a similarity-based grouping problem
based on a user-provided image location. In particular, we propose to leverage
the unsupervised DINO features coupled with a proposed Cross-Similarity module
and an MLP head to extract material similarities in an image. We train our
model on a new synthetic image dataset, which we release. We show that our
method generalizes well to real-world images. We carefully analyze our model's
behavior on varying material properties and lighting. Additionally, we evaluate
it against a hand-annotated benchmark of 50 real photographs. We further
demonstrate our model on a set of applications, including material editing,
in-video selection, and retrieval of object photographs with similar materials.
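
The abstract names three moving parts: frozen unsupervised DINO features, a Cross-Similarity module that compares the user-chosen location against the rest of the image, and an MLP head that scores the comparison. Below is a minimal sketch of how such a pipeline could be wired up; the ViT-S/8 backbone, feature dimensions, and the concatenation-based comparison are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of the described pipeline: frozen DINO features, a
# cross-similarity comparison against the user-clicked patch, and an MLP
# head scoring every patch. Module details are assumptions, not the paper's code.
import torch
import torch.nn as nn

class MaterialSelector(nn.Module):
    def __init__(self, feat_dim=384, hidden=256):
        super().__init__()
        # Frozen, self-supervised DINO backbone (ViT-S/8 here is an assumption).
        self.backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits8")
        self.backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad = False
        # MLP head mapping (patch feature, query feature) pairs to a score.
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, image, query_idx):
        # Per-patch DINO tokens: (B, 1 + N, D); drop the CLS token.
        tokens = self.backbone.get_intermediate_layers(image, n=1)[0][:, 1:, :]
        # Broadcast the clicked patch's feature against every patch.
        query = tokens[:, query_idx, :].unsqueeze(1).expand_as(tokens)
        pair = torch.cat([tokens, query], dim=-1)          # (B, N, 2D)
        return torch.sigmoid(self.head(pair)).squeeze(-1)  # (B, N) similarity scores
```

Thresholding the per-patch scores and upsampling them to pixel resolution would yield the kind of selection mask the abstract describes.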
Related papers
- Zero-shot Image Editing with Reference Imitation [50.75310094611476]
We present a new form of editing, termed imitative editing, to help users exercise their creativity more conveniently.
We propose a generative training framework, dubbed MimicBrush, which randomly selects two frames from a video clip, masks some regions of one frame, and learns to recover the masked regions using information from the other frame (a training-step sketch appears after this list).
We experimentally show the effectiveness of our method under various test cases as well as its superiority over existing alternatives.
arXiv Detail & Related papers (2024-06-11T17:59:51Z)
- Diffusion Model-Based Image Editing: A Survey [46.244266782108234]
Denoising diffusion models have emerged as a powerful tool for various image generation and editing tasks.
We provide an exhaustive overview of existing methods using diffusion models for image editing.
To further evaluate the performance of text-guided image editing algorithms, we propose a systematic benchmark, EditEval.
arXiv Detail & Related papers (2024-02-27T14:07:09Z)
- Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by 1.5 dB in PSNR and a 45% better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z)
- Decoupled Textual Embeddings for Customized Image Generation [62.98933630971543]
Customized text-to-image generation aims to learn user-specified concepts with a few images.
Existing methods usually suffer from overfitting and entangle subject-unrelated information with the learned concept.
We propose DETEX, a novel approach that learns disentangled concept embeddings for flexible customized text-to-image generation.
arXiv Detail & Related papers (2023-12-19T03:32:10Z)
- Neural Congealing: Aligning Images to a Joint Semantic Atlas [14.348512536556413]
We present a zero-shot self-supervised framework for aligning semantically-common content across a set of images.
Our approach harnesses the power of pre-trained DINO-ViT features to learn a joint semantic atlas and dense mappings from each image to it.
We show that our method performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
arXiv Detail & Related papers (2023-02-08T09:26:22Z)
- Photo-to-Shape Material Transfer for Diverse Structures [15.816608726698986]
We introduce a method for assigning photorealistic relightable materials to 3D shapes in an automatic manner.
Our method combines an image translation neural network with a material assignment neural network.
We demonstrate that our method allows us to assign materials to shapes so that their appearances better resemble the input exemplars.
arXiv Detail & Related papers (2022-05-09T03:37:01Z)
- Neural Photometry-guided Visual Attribute Transfer [4.630419389180576]
We present a deep learning-based method for propagating visual material attributes to larger samples of the same or similar materials.
For training, we leverage images of the material taken under multiple illuminations and a dedicated data augmentation policy.
Our model relies on a supervised image-to-image translation framework and is agnostic to the transferred domain.
arXiv Detail & Related papers (2021-12-05T09:22:28Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders (a sketch of this design appears after this list).
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z)
- RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval [76.87013602243053]
We propose a differentiable retrieval module to synthesize images from a scene description with retrieved patches as reference (a sketch of the underlying technique appears after this list).
We conduct extensive quantitative and qualitative experiments to demonstrate that the proposed method can generate realistic and diverse images.
arXiv Detail & Related papers (2020-07-16T17:59:04Z)
- Region-adaptive Texture Enhancement for Detailed Person Image Synthesis [86.69934638569815]
RATE-Net is a novel framework for synthesizing person images with sharp texture details.
The proposed framework leverages an additional texture enhancing module to extract appearance information from the source image.
Experiments conducted on the DeepFashion benchmark dataset demonstrate the superiority of our framework over existing networks.
arXiv Detail & Related papers (2020-05-26T02:33:21Z)
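
The MimicBrush entry above describes a self-supervised objective: mask part of one video frame and recover it from another frame of the same clip. A minimal sketch of one training step under that description follows; `model` is a hypothetical inpainting network taking (masked target, reference, mask), and the single rectangular mask is a simplification of whatever masking the paper actually uses.

```python
# Hypothetical sketch of a MimicBrush-style training step, per the summary above.
import torch

def mimic_training_step(model, clip, optimizer, mask_frac=0.25):
    # Pick two distinct frames from a clip tensor of shape (T, C, H, W).
    idx = torch.randperm(clip.shape[0])[:2]
    target, reference = clip[idx[0]], clip[idx[1]]
    # Carve a single rectangular mask out of the target frame
    # (assumption: the paper's masking strategy is richer than one box).
    _, H, W = target.shape
    mh, mw = int(H * mask_frac), int(W * mask_frac)
    top = int(torch.randint(0, H - mh + 1, (1,)))
    left = int(torch.randint(0, W - mw + 1, (1,)))
    mask = torch.zeros(1, 1, H, W)
    mask[..., top:top + mh, left:left + mw] = 1.0
    masked = target.unsqueeze(0) * (1 - mask)
    # The network sees the masked frame, the reference frame, and the mask,
    # and must recover the hidden region from the reference's appearance.
    pred = model(masked, reference.unsqueeze(0), mask)
    loss = ((pred - target.unsqueeze(0)) ** 2 * mask).sum() / mask.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```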
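The GFM entry names a shared encoder feeding two separate decoders, one glancing at global semantics and one focusing on boundary detail. The sketch below shows one plausible way to merge the two outputs into an alpha matte; the layer sizes and merge rule are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch of a shared-encoder, two-decoder matting design.
import torch
import torch.nn as nn

class GlanceFocusMatting(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                 # shared encoder
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Glance decoder: coarse trimap-like classes (fg / bg / transition).
        self.glance = nn.Conv2d(64, 3, 3, padding=1)
        # Focus decoder: fine alpha values for the transition region.
        self.focus = nn.Conv2d(64, 1, 3, padding=1)

    def forward(self, image):
        feats = self.encoder(image)
        trimap = self.glance(feats).softmax(dim=1)    # (B, 3, H, W)
        alpha_detail = self.focus(feats).sigmoid()    # (B, 1, H, W)
        # Merge: foreground is 1, background 0, transition uses the detail alpha.
        fg, bg, transition = trimap[:, 0:1], trimap[:, 1:2], trimap[:, 2:3]
        return fg + transition * alpha_detail
```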
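The RetrieveGAN entry hinges on making patch retrieval differentiable so it can be trained end-to-end with the synthesis network. A standard way to relax a discrete top-1 choice is the Gumbel-softmax straight-through estimator, sketched here; this illustrates the general technique rather than RetrieveGAN's exact module.

```python
# Hypothetical sketch of differentiable patch retrieval via Gumbel-softmax.
import torch
import torch.nn.functional as F

def retrieve_patch(query, patch_embeddings, patch_bank, tau=0.5):
    # query: (D,), patch_embeddings: (K, D), patch_bank: (K, C, h, w)
    logits = patch_embeddings @ query                        # similarity scores, (K,)
    # Soft one-hot over the bank; `hard=True` gives a discrete pick in the
    # forward pass while gradients flow through the soft relaxation.
    weights = F.gumbel_softmax(logits, tau=tau, hard=True)   # (K,)
    # Weighted sum selects (approximately) one patch, differentiably.
    return torch.einsum("k,kchw->chw", weights, patch_bank)
```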