SAIL: Self-supervised Albedo Estimation from Real Images with a Latent Diffusion Model
- URL: http://arxiv.org/abs/2505.19751v2
- Date: Tue, 27 May 2025 09:27:25 GMT
- Title: SAIL: Self-supervised Albedo Estimation from Real Images with a Latent Diffusion Model
- Authors: Hala Djeghim, Nathan Piasco, Luis Roldão, Moussab Bennehar, Dzmitry Tsishkou, Céline Loscos, Désiré Sidibé,
- Abstract summary: Intrinsic image decomposition aims at separating an image into its underlying albedo and shading components. We propose SAIL, an approach designed to estimate albedo-like representations from single-view real-world images.
- Score: 4.015354837450373
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Intrinsic image decomposition aims at separating an image into its underlying albedo and shading components, isolating the base color from lighting effects to enable downstream applications such as virtual relighting and scene editing. Despite the rise and success of learning-based approaches, intrinsic image decomposition from real-world images remains significantly challenging due to the scarcity of labeled ground-truth data. Most existing solutions rely on synthetic data in supervised setups, limiting their ability to generalize to real-world scenes. Self-supervised methods, on the other hand, often produce albedo maps that contain reflections and lack consistency under different lighting conditions. To address this, we propose SAIL, an approach designed to estimate albedo-like representations from single-view real-world images. We repurpose the prior knowledge of a latent diffusion model for unconditioned scene relighting as a surrogate objective for albedo estimation. To extract the albedo, we introduce a novel intrinsic image decomposition fully formulated in the latent space. To guide the training of our latent diffusion model, we introduce regularization terms that constrain both the lighting-dependent and lighting-independent components of our latent image decomposition. SAIL predicts stable albedo under varying lighting conditions and generalizes to multiple scenes, using only unlabeled multi-illumination data available online.
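The decomposition described in the abstract follows the classic multiplicative intrinsic image model, in which an image is the elementwise product of albedo and shading. SAIL formulates the analogous decomposition in a latent diffusion model's latent space, so the pixel-space toy below (with made-up values) is only a conceptual sketch of why a correct albedo estimate is stable across lighting changes, not the paper's method.

```python
# Conceptual sketch of the multiplicative intrinsic image model I = A * S.
# A correct albedo is lighting-invariant: dividing out the shading from two
# differently lit images of the same scene yields the same base color.

def recover_albedo(image, shading, eps=1e-9):
    """Given I and the shading S with I = A * S per pixel, recover A = I / S."""
    return [i / max(s, eps) for i, s in zip(image, shading)]

def consistency(albedo_a, albedo_b):
    """Mean absolute difference between two albedo estimates of the same
    scene; SAIL's regularization terms encourage this to be small across
    different illuminations."""
    return sum(abs(a - b) for a, b in zip(albedo_a, albedo_b)) / len(albedo_a)

albedo = [0.2, 0.5, 0.8, 0.4]            # ground-truth base color (1-D toy image)
img_bright = [a * 1.0 for a in albedo]   # scene under bright illumination
img_dim    = [a * 0.3 for a in albedo]   # same scene under dim illumination

est_bright = recover_albedo(img_bright, [1.0] * 4)
est_dim    = recover_albedo(img_dim,    [0.3] * 4)
print(consistency(est_bright, est_dim))  # ~0: estimates agree across lighting
```

The multi-illumination training data mentioned in the abstract plays exactly this role: the same scene seen under different lighting constrains which component of the decomposition may change and which must not.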
Related papers
- Visual-Instructed Degradation Diffusion for All-in-One Image Restoration [29.910376294021052]
We propose Defusion, a novel all-in-one image restoration framework that utilizes visual instruction-guided degradation diffusion.
We show that Defusion outperforms state-of-the-art methods across diverse image restoration tasks, including complex and real-world degradations.
arXiv Detail & Related papers (2025-06-20T12:50:42Z)
- Colorful Diffuse Intrinsic Image Decomposition in the Wild [0.0]
Intrinsic image decomposition aims to separate the surface reflectance and the effects from the illumination given a single photograph.
In this work, we separate an input image into its diffuse albedo, colorful diffuse shading, and specular residual components.
Our extended intrinsic model enables illumination-aware analysis of photographs and can be used for image editing applications.
arXiv Detail & Related papers (2024-09-20T17:59:40Z)
- A General Albedo Recovery Approach for Aerial Photogrammetric Images through Inverse Rendering [7.874736360019618]
This paper presents a general image formation model for albedo recovery from typical aerial photogrammetric images under natural illuminations.
Our approach builds on the fact that both the sun illumination and scene geometry are estimable in aerial photogrammetry.
arXiv Detail & Related papers (2024-09-04T18:58:32Z)
- Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering [56.68286440268329]
Correct insertion of virtual objects in images of real-world scenes requires a deep understanding of the scene's lighting, geometry, and materials.
We propose using a personalized large diffusion model as guidance to a physically based inverse rendering process.
Our method recovers scene lighting and tone-mapping parameters, allowing the photorealistic composition of arbitrary virtual objects in single frames or videos of indoor or outdoor scenes.
arXiv Detail & Related papers (2024-08-19T05:15:45Z)
- LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models [39.28266945709169]
We propose a diffusion-based unsupervised framework that incorporates physically explainable Retinex theory with diffusion models for low-light image enhancement.
Experiments on publicly available real-world benchmarks show that the proposed LightenDiffusion outperforms state-of-the-art unsupervised competitors.
arXiv Detail & Related papers (2024-07-12T02:54:43Z)
- Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by 1.5 dB in PSNR and by a 45% better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z)
- Diffusion Posterior Illumination for Ambiguity-aware Inverse Rendering [63.24476194987721]
Inverse rendering, the process of inferring scene properties from images, is a challenging inverse problem.
Most existing solutions incorporate priors into the inverse-rendering pipeline to encourage plausible solutions.
We propose a novel scheme that integrates a denoising probabilistic diffusion model pre-trained on natural illumination maps into an optimization framework.
arXiv Detail & Related papers (2023-09-30T12:39:28Z)
- A Novel Intrinsic Image Decomposition Method to Recover Albedo for Aerial Images in Photogrammetry Processing [3.556015072520384]
Surface albedos from photogrammetric images can facilitate its downstream applications in VR/AR/MR and digital twins.
Standard photogrammetric pipelines are suboptimal to these applications because these textures are directly derived from images.
We propose an image formation model with respect to outdoor aerial imagery under natural illumination conditions.
We then derive the inverse model to estimate the albedo, using typical photogrammetric products as an initial approximation of the geometry.
arXiv Detail & Related papers (2022-04-08T15:50:52Z)
- DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer [78.91753256634453]
We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable renderers.
In this work, we propose DIB-R++, a hybrid differentiable renderer that supports these effects by combining rasterization and ray-tracing.
Compared to more advanced physics-based differentiable renderers, DIB-R++ is highly performant due to its compact and expressive model.
arXiv Detail & Related papers (2021-10-30T01:59:39Z)
- Single-image Full-body Human Relighting [42.06323641073984]
We present a single-image data-driven method to automatically relight images with full-body humans in them.
Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting.
We propose a new deep learning architecture, tailored to the decomposition performed in PRT, that is trained using a combination of L1, logarithmic, and rendering losses.
arXiv Detail & Related papers (2021-07-15T11:34:03Z)
- De-rendering the World's Revolutionary Artefacts [65.60220069214591]
We propose a method that can recover environment illumination and surface materials from real single-image collections.
We focus on rotationally symmetric artefacts that exhibit challenging surface properties including reflections, such as vases.
We present an end-to-end learning framework that is able to de-render the world's revolutionary artefacts.
arXiv Detail & Related papers (2021-04-08T17:56:16Z)
- Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning [52.07698484363237]
We propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning.
Our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks.
arXiv Detail & Related papers (2021-04-05T13:04:44Z)
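Several entries above, most directly LightenDiffusion, build on Retinex theory, which models an image as reflectance times illumination, I = R * L. A minimal classical enhancement baseline, independent of any of these papers and with illustrative values only, keeps the reflectance fixed and brightens the illumination with a gamma curve:

```python
# Retinex-style low-light enhancement sketch: decompose I = R * L, keep the
# reflectance R, and recompose with a brightened illumination L**gamma
# (gamma < 1 lifts dark regions more than bright ones).

def retinex_enhance(image, illumination, gamma=0.4, eps=1e-9):
    enhanced = []
    for i, l in zip(image, illumination):
        reflectance = i / max(l, eps)   # R = I / L
        new_l = max(l, eps) ** gamma    # brightened illumination
        enhanced.append(min(reflectance * new_l, 1.0))
    return enhanced

dark_pixels  = [0.02, 0.05, 0.10]   # low-light pixel intensities
illumination = [0.10, 0.10, 0.10]   # estimated (dim) illumination map
bright = retinex_enhance(dark_pixels, illumination)
print(bright)  # every value is brighter than the corresponding input
```

Diffusion-based methods such as LightenDiffusion replace this hand-crafted brightening step with a learned model, but the underlying reflectance/illumination split is the same.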
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.