Light of Normals: Unified Feature Representation for Universal Photometric Stereo
- URL: http://arxiv.org/abs/2506.18882v4
- Date: Sat, 04 Oct 2025 15:23:33 GMT
- Title: Light of Normals: Unified Feature Representation for Universal Photometric Stereo
- Authors: Hong Li, Houyuan Chen, Chongjie Ye, Zhaoxi Chen, Bohan Li, Shaocong Xu, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, Satoshi Ikehata, Boxin Shi, Anyi Rao, Hao Zhao
- Abstract summary: Current encoders cannot guarantee that illumination and normal information are decoupled. We introduce LINO UniPS with two key components: (i) Light Register Tokens with light alignment supervision to aggregate point, direction, and environment lights, and (ii) an Interleaved Attention Block with global cross-image attention. We also introduce PS-Verse, a large-scale synthetic dataset graded by geometric complexity and lighting diversity.
- Score: 69.95514862547174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Universal photometric stereo (PS) is defined by two factors: it must (i) operate under arbitrary, unknown lighting conditions and (ii) avoid reliance on specific illumination models. Despite progress (e.g., SDM UniPS), two challenges remain. First, current encoders cannot guarantee that illumination and normal information are decoupled. To enforce decoupling, we introduce LINO UniPS with two key components: (i) Light Register Tokens with light alignment supervision to aggregate point, direction, and environment lights; (ii) an Interleaved Attention Block featuring global cross-image attention that takes all lighting conditions together so the encoder can factor out lighting while retaining normal-related evidence. Second, high-frequency geometric details are easily lost. We address this with (i) a Wavelet-based Dual-branch Architecture and (ii) a Normal-gradient Perception Loss. These techniques yield a unified feature space in which lighting is explicitly represented by register tokens, while normal details are preserved via the wavelet branch. We further introduce PS-Verse, a large-scale synthetic dataset graded by geometric complexity and lighting diversity, and adopt curriculum training from simple to complex scenes. Extensive experiments show new state-of-the-art results on public benchmarks (e.g., DiLiGenT, Luces), stronger generalization to real materials, and improved efficiency; ablations confirm that Light Register Tokens + Interleaved Attention Block drive better feature decoupling, while Wavelet-based Dual-branch Architecture + Normal-gradient Perception Loss recover finer details.
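The abstract does not give the exact form of the Normal-gradient Perception Loss, but the idea of penalizing mismatched normal-map gradients can be sketched. The finite-difference L1 formulation below is an assumption for illustration, not the paper's definition:

```python
import numpy as np

def normal_gradient_loss(n_pred, n_gt):
    """Hypothetical sketch of a normal-gradient perception loss.

    Penalizes differences between spatial gradients of predicted and
    ground-truth normal maps of shape (H, W, 3), encouraging
    high-frequency geometric detail to be preserved.
    """
    # Finite-difference gradients along the two image axes.
    gx_p, gy_p = np.gradient(n_pred, axis=(0, 1))
    gx_g, gy_g = np.gradient(n_gt, axis=(0, 1))
    # L1 distance between gradient fields, averaged over pixels and channels.
    return np.mean(np.abs(gx_p - gx_g)) + np.mean(np.abs(gy_p - gy_g))
```

A per-pixel L1 or cosine loss on the normals themselves would typically be combined with a term like this; the gradient term specifically targets fine detail.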
Related papers
- Wavelet-based Decoupling Framework for low-light Stereo Image Enhancement [4.645709745803925]
This paper proposes a wavelet-based low-light stereo image enhancement method with feature space decoupling. Using a wavelet transform, the feature space is decomposed into a low-frequency branch for illumination adjustment and multiple high-frequency branches for texture enhancement. To enhance the high-frequency information, a detail and texture enhancement module (DTEM) is proposed based on a cross-attention mechanism.
arXiv Detail & Related papers (2025-07-16T12:42:27Z)
- UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting [85.27994475113056]
We introduce a general-purpose approach that jointly estimates albedo and synthesizes relit outputs in a single pass. Our model demonstrates strong generalization across diverse domains and surpasses previous methods in both visual fidelity and temporal consistency.
arXiv Detail & Related papers (2025-06-18T17:56:45Z)
- Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design [13.587511215001115]
Current Low-light Image Enhancement (LLIE) techniques rely on either direct Low-Light (LL) to Normal-Light (NL) mappings or guidance from semantic features or illumination maps. We present SG-LLIE, a new multi-scale CNN-Transformer hybrid framework guided by structure priors. Our solution ranks second in the NTIRE 2025 Low-Light Enhancement Challenge.
arXiv Detail & Related papers (2025-04-18T20:57:16Z)
- Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment [46.60106452798745]
We introduce Luminance-GS, a novel approach to achieving high-quality novel view synthesis results under challenging lighting conditions using 3DGS. By adopting per-view color matrix mapping and view-adaptive curve adjustments, Luminance-GS achieves state-of-the-art (SOTA) results across various lighting conditions. Compared to previous NeRF- and 3DGS-based baselines, Luminance-GS provides real-time rendering speed with improved reconstruction quality.
arXiv Detail & Related papers (2025-04-02T08:54:57Z)
- FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [92.4205087439928]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose the Self-supervised Transfer (PST) and the Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models, effectively mitigating data scarcity. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches. This combined approach enables FUSE to construct a universal image-event representation that requires only lightweight decoder adaptation for target datasets.
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
- DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains [0.0]
Low-light image enhancement (LLE) aims to improve the visual quality of images captured in poorly lit conditions. These issues hinder the performance of computer vision tasks such as object detection, facial recognition, and autonomous driving. We propose the Dual Light Enhance Network (DLEN), a novel architecture that incorporates two distinct attention mechanisms.
arXiv Detail & Related papers (2025-01-21T15:58:16Z)
- Image Gradient-Aided Photometric Stereo Network [37.71540892622098]
Photometric stereo endeavors to ascertain surface normals using shading cues from photometric images under various illuminations. Recent deep learning-based PS methods often overlook the complexity of object surfaces. We propose the Image Gradient-Aided Photometric Stereo Network (IGA-PSN), a dual-branch framework extracting features from both photometric images and their gradients.
arXiv Detail & Related papers (2024-12-16T10:50:52Z)
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- Generalizable Non-Line-of-Sight Imaging with Learnable Physical Priors [52.195637608631955]
Non-line-of-sight (NLOS) imaging has attracted increasing attention due to its potential applications.
Existing NLOS reconstruction approaches are constrained by the reliance on empirical physical priors.
We introduce a novel learning-based solution, comprising two key designs: Learnable Path Compensation (LPC) and Adaptive Phasor Field (APF).
arXiv Detail & Related papers (2024-09-21T04:39:45Z)
- Diffusion-based Light Field Synthesis [50.24624071354433]
LFdiff is a diffusion-based generative framework tailored for LF synthesis.
We propose DistgUnet, a disentanglement-based noise estimation network, to harness comprehensive LF representations.
Extensive experiments demonstrate that LFdiff excels in synthesizing visually pleasing and disparity-controllable light fields.
arXiv Detail & Related papers (2024-02-01T13:13:16Z)
- VQCNIR: Clearer Night Image Restoration with Vector-Quantized Codebook [16.20461368096512]
Night photography often struggles with challenges like low light and blurring, stemming from dark environments and prolonged exposures.
We believe in the strength of data-driven high-quality priors and strive to offer a reliable and consistent prior, circumventing the restrictions of manual priors.
We propose Clearer Night Image Restoration with Vector-Quantized Codebook (VQCNIR) to achieve remarkable and consistent restoration outcomes on real-world and synthetic benchmarks.
arXiv Detail & Related papers (2023-12-14T02:16:27Z)
- Diving into Darkness: A Dual-Modulated Framework for High-Fidelity Super-Resolution in Ultra-Dark Environments [51.58771256128329]
This paper proposes a specialized dual-modulated learning framework that attempts to deeply dissect the nature of the low-light super-resolution task.
We develop Illuminance-Semantic Dual Modulation (ISDM) components to enhance feature-level preservation of illumination and color details.
Comprehensive experiments showcase the applicability and generalizability of our approach to diverse and challenging ultra-low-light conditions.
arXiv Detail & Related papers (2023-09-11T06:55:32Z)
- Self-calibrating Photometric Stereo by Neural Inverse Rendering [88.67603644930466]
This paper tackles the task of uncalibrated photometric stereo for 3D object reconstruction.
We propose a new method that jointly optimizes object shape, light directions, and light intensities.
Our method demonstrates state-of-the-art accuracy in light estimation and shape recovery on real-world datasets.
arXiv Detail & Related papers (2022-07-16T02:46:15Z)
- Structured Light with Redundancy Codes [14.828194821588456]
Structured light (SL) systems acquire high-fidelity 3D geometry with active illumination projection.
This paper proposes a technique to improve the robustness of SL by projecting redundant optical signals in addition to the native SL patterns.
arXiv Detail & Related papers (2022-06-18T16:52:30Z)
- Leveraging Spatial and Photometric Context for Calibrated Non-Lambertian Photometric Stereo [61.6260594326246]
We introduce an efficient fully-convolutional architecture that can leverage both spatial and photometric context simultaneously.
Using separable 4D convolutions and 2D heat-maps reduces the model size and makes it more efficient.
arXiv Detail & Related papers (2021-03-22T18:06:58Z)
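For context on the photometric-stereo entries above: the classical *calibrated* Lambertian baseline, which the self-calibrating and non-Lambertian methods generalize, reduces to per-pixel least squares. A minimal sketch, with shapes and variable names chosen for illustration (not taken from any of the papers listed):

```python
import numpy as np

def lambertian_ps(intensities, light_dirs):
    """Textbook calibrated Lambertian photometric stereo.

    intensities: (K, P) observed intensities under K lights at P pixels.
    light_dirs:  (K, 3) known unit light directions.
    Solves I = L @ g per pixel, where g = albedo * normal.
    """
    # Least-squares recovery of the scaled normals g (shape (3, P)).
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)
    # Albedo is the magnitude of g; the unit normal is its direction.
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)
    return normals.T, albedo  # (P, 3) normals, (P,) albedo
```

This requires known light directions and ignores shadows and specularities; the uncalibrated, non-Lambertian, and universal methods above exist precisely because these assumptions fail on real scenes.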
This list is automatically generated from the titles and abstracts of the papers in this site.