Related papers: Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization

Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization

URL: http://arxiv.org/abs/2504.03011v1
Date: Thu, 03 Apr 2025 20:10:50 GMT
Title: Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
Authors: Junying Wang, Jingyuan Liu, Xin Sun, Krishna Kumar Singh, Zhixin Shu, He Zhang, Jimei Yang, Nanxuan Zhao, Tuanfeng Y. Wang, Simon S. Chen, Ulrich Neumann, Jae Shin Yoon,
Abstract summary: Comprehensive Relighting is the first all-in-one approach that can both control and harmonize the lighting from an image or video of humans with arbitrary body parts from any scene.<n>In the experiments, Comprehensive Relighting shows a strong generalizability and lighting temporal coherence, outperforming existing image-based human relighting and harmonization methods.
Score: 43.02033340663918
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces Comprehensive Relighting, the first all-in-one approach that can both control and harmonize the lighting from an image or video of humans with arbitrary body parts from any scene. Building such a generalizable model is extremely challenging due to the lack of dataset, restricting existing image-based relighting models to a specific scenario (e.g., face or static human). To address this challenge, we repurpose a pre-trained diffusion model as a general image prior and jointly model the human relighting and background harmonization in the coarse-to-fine framework. To further enhance the temporal coherence of the relighting, we introduce an unsupervised temporal lighting model that learns the lighting cycle consistency from many real-world videos without any ground truth. In inference time, our temporal lighting module is combined with the diffusion models through the spatio-temporal feature blending algorithms without extra training; and we apply a new guided refinement as a post-processing to preserve the high-frequency details from the input image. In the experiments, Comprehensive Relighting shows a strong generalizability and lighting temporal coherence, outperforming existing image-based human relighting and harmonization methods.

Related papers

UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting [85.27994475113056]
We introduce a general-purpose approach that jointly estimates albedo and synthesizes relit outputs in a single pass.<n>Our model demonstrates strong generalization across diverse domains and surpasses previous methods in both visual fidelity and temporal consistency.
arXiv Detail & Related papers (2025-06-18T17:56:45Z)
GroomLight: Hybrid Inverse Rendering for Relightable Human Hair Appearance Modeling [56.94251484447597]
GroomLight is a novel method for relightable hair appearance modeling from multi-view images.<n>We propose a hybrid inverse rendering pipeline to optimize both components, enabling high-fidelity relighting, view synthesis, and material editing.
arXiv Detail & Related papers (2025-03-13T17:43:12Z)
DifFRelight: Diffusion-Based Facial Performance Relighting [12.909429637057343]
We present a novel framework for free-viewpoint facial performance relighting using diffusion-based image-to-image translation. We train a diffusion model for precise lighting control, enabling high-fidelity relit facial images from flat-lit inputs. The model accurately reproduces complex lighting effects like eye reflections, subsurface scattering, self-shadowing, and translucency.
arXiv Detail & Related papers (2024-10-10T17:56:44Z)
URHand: Universal Relightable Hands [64.25893653236912]
We present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and is ready to be photorealistically rendered under novel illuminations.
arXiv Detail & Related papers (2024-01-10T18:59:51Z)
Relightful Harmonization: Lighting-aware Portrait Background Replacement [23.19641174787912]
We introduce Relightful Harmonization, a lighting-aware diffusion model designed to seamlessly harmonize sophisticated lighting effect for the foreground portrait using any background image. Our approach unfolds in three stages. First, we introduce a lighting representation module that allows our diffusion model to encode lighting information from target image background. Second, we introduce an alignment network that aligns lighting features learned from image background with lighting features learned from panorama environment maps.
arXiv Detail & Related papers (2023-12-11T23:20:31Z)
TensoIR: Tensorial Inverse Rendering [51.57268311847087]
TensoIR is a novel inverse rendering approach based on tensor factorization and neural fields. TensoRF is a state-of-the-art approach for radiance field modeling.
arXiv Detail & Related papers (2023-04-24T21:39:13Z)
Single-image Full-body Human Relighting [42.06323641073984]
We present a single-image data-driven method to automatically relight images with full-body humans in them. Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting. We propose a new deep learning architecture, tailored to the decomposition performed in PRT, that is trained using a combination of L1, logarithmic, and rendering losses.
arXiv Detail & Related papers (2021-07-15T11:34:03Z)
Neural Video Portrait Relighting in Real-time via Consistency Modeling [41.04622998356025]
We propose a neural approach for real-time, high-quality and coherent video portrait relighting. We propose a hybrid structure and lighting disentanglement in an encoder-decoder architecture. We also propose a lighting sampling strategy to model the illumination consistency and mutation for natural portrait light manipulation in real-world.
arXiv Detail & Related papers (2021-04-01T14:13:28Z)
Light Stage Super-Resolution: Continuous High-Frequency Relighting [58.09243542908402]
We propose a learning-based solution for the "super-resolution" of scans of human faces taken from a light stage. Our method aggregates the captured images corresponding to neighboring lights in the stage, and uses a neural network to synthesize a rendering of the face. Our learned model is able to produce renderings for arbitrary light directions that exhibit realistic shadows and specular highlights.
arXiv Detail & Related papers (2020-10-17T23:40:43Z)
Crowdsampling the Plenoptic Function [56.10020793913216]
We present a new approach to novel view synthesis under time-varying illumination from such data. We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function. Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time.
arXiv Detail & Related papers (2020-07-30T02:52:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.