See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
- URL: http://arxiv.org/abs/2505.20641v2
- Date: Wed, 28 May 2025 08:56:02 GMT
- Title: See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
- Authors: Yuan Wu, Zhiqiang Yan, Yigong Zhang, Xiang Li, Jian Yang
- Abstract summary: Occupancy prediction aims to estimate the 3D spatial distribution of occupied regions along with their corresponding semantic labels. We propose LIAR, a novel framework that learns illumination-affined representations. Experiments on both real and synthetic datasets demonstrate the superior performance of LIAR under challenging nighttime scenarios.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Occupancy prediction aims to estimate the 3D spatial distribution of occupied regions along with their corresponding semantic labels. Existing vision-based methods perform well on daytime benchmarks but struggle in nighttime scenarios due to limited visibility and challenging lighting conditions. To address these challenges, we propose LIAR, a novel framework that learns illumination-affined representations. LIAR first introduces Selective Low-light Image Enhancement (SLLIE), which leverages the illumination priors from daytime scenes to adaptively determine whether a nighttime image is genuinely dark or sufficiently well-lit, enabling more targeted global enhancement. Building on the illumination maps generated by SLLIE, LIAR further incorporates two illumination-aware components: 2D Illumination-guided Sampling (2D-IGS) and 3D Illumination-driven Projection (3D-IDP), to respectively tackle local underexposure and overexposure. Specifically, 2D-IGS modulates feature sampling positions according to illumination maps, assigning larger offsets to darker regions and smaller ones to brighter regions, thereby alleviating feature degradation in underexposed areas. Subsequently, 3D-IDP enhances semantic understanding in overexposed regions by constructing illumination intensity fields and supplying refined residual queries to the BEV context refinement process. Extensive experiments on both real and synthetic datasets demonstrate the superior performance of LIAR under challenging nighttime scenarios. The source code and pretrained models are available at https://github.com/yanzq95/LIAR.
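The core 2D-IGS idea described in the abstract (larger sampling offsets in darker regions, smaller ones in brighter regions) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation; the module name, the convolutional offset head, and the inverse-illumination scaling below are illustrative assumptions, and the illumination map is assumed to lie in [0, 1].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IlluminationGuidedSampling(nn.Module):
    """Illustrative sketch: scale learned sampling offsets by (1 - illumination),
    so darker pixels sample from a wider neighborhood than brighter ones."""

    def __init__(self, channels: int, max_offset: float = 3.0):
        super().__init__()
        # Predict a 2-channel (dx, dy) offset field from image features.
        self.offset_head = nn.Conv2d(channels, 2, kernel_size=3, padding=1)
        self.max_offset = max_offset

    def forward(self, feats: torch.Tensor, illum: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) image features; illum: (B, 1, H, W) in [0, 1],
        # where 0 is fully dark and 1 is well lit.
        B, _, H, W = feats.shape
        offsets = torch.tanh(self.offset_head(feats)) * self.max_offset  # in pixels
        offsets = offsets * (1.0 - illum)  # darker regions get larger offsets

        # Base pixel grid, shifted by the illumination-scaled offsets.
        ys, xs = torch.meshgrid(
            torch.arange(H, device=feats.device, dtype=feats.dtype),
            torch.arange(W, device=feats.device, dtype=feats.dtype),
            indexing="ij",
        )
        base = torch.stack((xs, ys), dim=0).unsqueeze(0).expand(B, -1, -1, -1)
        coords = base + offsets  # (B, 2, H, W), channel order (x, y)

        # Normalize to [-1, 1] and resample the features.
        grid_x = coords[:, 0] / max(W - 1, 1) * 2.0 - 1.0
        grid_y = coords[:, 1] / max(H - 1, 1) * 2.0 - 1.0
        grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)
        return F.grid_sample(feats, grid, align_corners=True)

# Example usage with random tensors:
# model = IlluminationGuidedSampling(channels=64)
# out = model(torch.randn(2, 64, 32, 32), torch.rand(2, 1, 32, 32))
```

The sketch captures only the illumination-dependent offset scaling stated in the abstract; how LIAR couples this sampling with its feature extractor, the SLLIE illumination maps, and the downstream BEV refinement is detailed in the paper and repository.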
Related papers
- SAIGFormer: A Spatially-Adaptive Illumination-Guided Network for Low-Light Image Enhancement [58.79901582809091]
Recent Transformer-based low-light enhancement methods have made promising progress in recovering global illumination. We present a Spatially-Adaptive Illumination-Guided Transformer framework that enables accurate illumination restoration.
arXiv Detail & Related papers (2025-07-21T11:38:56Z)
- Dark-EvGS: Event Camera as an Eye for Radiance Field in the Dark [51.68144172958247]
We propose Dark-EvGS, the first event-assisted 3D GS framework that enables the reconstruction of bright frames from arbitrary viewpoints. Our method achieves better results than existing methods, conquering radiance field reconstruction under challenging low-light conditions.
arXiv Detail & Related papers (2025-07-16T05:54:33Z)
- MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation [19.46962637673285]
MV-CoLight is a framework for illumination-consistent object compositing in 2D and 3D scenes. We employ a Hilbert curve-based mapping to align 2D image inputs with 3D Gaussian scene representations seamlessly. Experiments demonstrate state-of-the-art harmonized results across standard benchmarks and our dataset.
arXiv Detail & Related papers (2025-05-27T17:53:02Z)
- IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics. Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs. We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z)
- Sun Off, Lights On: Photorealistic Monocular Nighttime Simulation for Robust Semantic Perception [53.631644875171595]
Nighttime scenes are hard for learned models to perceive semantically and hard for humans to annotate.
Our method, named Sun Off, Lights On (SOLO), is the first to perform nighttime simulation on single images in a photorealistic fashion by operating in 3D.
Not only is the visual quality and photorealism of our nighttime images superior to competing approaches, including diffusion models, but they also prove more beneficial for semantic nighttime segmentation in day-to-night adaptation.
arXiv Detail & Related papers (2024-07-29T18:00:09Z)
- MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting [63.5925701087252]
Out-of-distribution (OOD) 3D relighting requires novel view synthesis under unseen lighting conditions. We introduce MetaGS to tackle this challenge from two perspectives.
arXiv Detail & Related papers (2024-05-31T13:48:54Z)
- Spatiotemporally Consistent HDR Indoor Lighting Estimation [66.26786775252592]
We propose a physically-motivated deep learning framework to solve the indoor lighting estimation problem.
Given a single LDR image with a depth map, our method predicts spatially consistent lighting at any given image position.
Our framework achieves photorealistic lighting prediction with higher quality compared to state-of-the-art single-image or video-based methods.
arXiv Detail & Related papers (2023-05-07T20:36:29Z)
- STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation [12.392842482031558]
We propose a method that jointly learns a nighttime image enhancer and a depth estimator, without using ground truth for either task.
Our method tightly entangles two self-supervised tasks using a newly proposed uncertain pixel masking strategy.
We benchmark the method on two established datasets: nuScenes and RobotCar.
arXiv Detail & Related papers (2023-02-02T18:59:47Z)
- Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
We propose to use internet 3D indoor scenes and manually tune their illuminations to render photo-realistic RGB photos and their corresponding depth and BRDF maps.
We perform cross attention on these dilated features to retain the consistency of depth prediction under different illuminations.
Our method is evaluated against current state-of-the-art methods on the Vari dataset, where a significant improvement is observed.
arXiv Detail & Related papers (2021-12-15T10:02:46Z)
- Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark [20.66405067066299]
We introduce Priors-Based Regularization to learn distribution knowledge from unpaired depth maps.
We also leverage a Mapping-Consistent Image Enhancement module to enhance image visibility and contrast.
Our framework achieves remarkable improvements and state-of-the-art results on two nighttime datasets.
arXiv Detail & Related papers (2021-08-09T06:24:35Z)
- Unsupervised Low-light Image Enhancement with Decoupled Networks [103.74355338972123]
We learn a two-stage GAN-based framework to enhance the real-world low-light images in a fully unsupervised fashion.
Our proposed method outperforms the state-of-the-art unsupervised image enhancement methods in terms of both illumination enhancement and noise reduction.
arXiv Detail & Related papers (2020-05-06T13:37:08Z)
- Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination [84.00096195633793]
We present a deep learning solution for estimating the incident illumination at any 3D location within a scene from an input narrow-baseline stereo image pair.
Our model is trained without any ground truth 3D data and only requires a held-out perspective view near the input stereo pair and a spherical panorama taken within each scene as supervision.
We demonstrate that our method can predict consistent spatially-varying lighting that is convincing enough to plausibly relight and insert highly specular virtual objects into real images.
arXiv Detail & Related papers (2020-03-18T17:46:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.