Dynamic Weight-based Temporal Aggregation for Low-light Video Enhancement
- URL: http://arxiv.org/abs/2510.09450v1
- Date: Fri, 10 Oct 2025 15:00:31 GMT
- Title: Dynamic Weight-based Temporal Aggregation for Low-light Video Enhancement
- Authors: Ruirui Lin, Guoxi Huang, Nantheera Anantrasirichai,
- Abstract summary: Low-light video enhancement is challenging due to noise, low contrast, and color degradations. We present DWTA-Net, a novel framework that exploits short- and long-term temporal cues. We show that DWTA-Net effectively suppresses noise and artifacts, delivering superior visual quality compared with state-of-the-art methods.
- Score: 6.87
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-light video enhancement (LLVE) is challenging due to noise, low contrast, and color degradations. Learning-based approaches offer fast inference but still struggle with heavy noise in real low-light scenes, primarily due to limitations in effectively leveraging temporal information. In this paper, we address this issue with DWTA-Net, a novel two-stage framework that jointly exploits short- and long-term temporal cues. Stage I employs Visual State-Space blocks for multi-frame alignment, recovering brightness, color, and structure with local consistency. Stage II introduces a recurrent refinement module with dynamic weight-based temporal aggregation guided by optical flow, adaptively balancing static and dynamic regions. A texture-adaptive loss further preserves fine details while promoting smoothness in flat areas. Experiments on real-world low-light videos show that DWTA-Net effectively suppresses noise and artifacts, delivering superior visual quality compared with state-of-the-art methods.
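To make the core mechanism concrete, below is a minimal PyTorch sketch of flow-guided, dynamic weight-based temporal aggregation: a small network predicts a per-pixel weight that blends the current estimate with the flow-warped previous output. The layer sizes, module names, and weighting scheme are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of flow-guided, weight-based temporal aggregation
# (illustrative assumptions throughout; not the DWTA-Net implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

def backward_warp(prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """prev: (B, C, H, W) previous output; flow: (B, 2, H, W) backward flow in pixels."""
    _, _, h, w = prev.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=prev.device, dtype=prev.dtype),
        torch.arange(w, device=prev.device, dtype=prev.dtype),
        indexing="ij",
    )
    x = xs.unsqueeze(0) + flow[:, 0]            # sample positions, pixel units
    y = ys.unsqueeze(0) + flow[:, 1]
    grid = torch.stack((2 * x / (w - 1) - 1,    # normalize to [-1, 1]
                        2 * y / (h - 1) - 1), dim=-1)
    return F.grid_sample(prev, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)

class DynamicTemporalAggregation(nn.Module):
    """Blend the current estimate with the flow-warped previous output using
    a learned per-pixel weight (hypothetical layer sizes)."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Conv2d(2 * channels, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, current, prev_out, flow):
        warped = backward_warp(prev_out, flow)
        w = self.weight_net(torch.cat((current, warped), dim=1))  # (B, 1, H, W)
        return w * warped + (1 - w) * current
```

The predicted weight map behaves like an occlusion-aware temporal filter strength: values near 1 favor the warped history (static, well-aligned regions), while values near 0 fall back to the current frame where motion makes the history unreliable.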
Related papers
- LuxDiT: Lighting Estimation with Video Diffusion Transformer [66.60]
Estimating scene lighting from a single image or video remains a longstanding challenge in computer vision and graphics. We propose LuxDiT, a novel data-driven approach that fine-tunes a video diffusion transformer to generate HDR environment maps conditioned on visual input.
arXiv Detail & Related papers (2025-09-03T19:59:20Z)
- Robust Low-light Scene Restoration via Illumination Transition [40.41]
Existing low-light enhancement methods often struggle to effectively preprocess such low-light inputs. We propose a novel Robust Low-light Scene Restoration framework (RoSe). Experiments demonstrate that RoSe significantly outperforms state-of-the-art models in both rendering quality and multiview consistency.
arXiv Detail & Related papers (2025-07-05T10:02:30Z)
- Zero-TIG: Temporal Consistency-Aware Zero-Shot Illumination-Guided Low-light Video Enhancement [2.97]
Low-light and underwater videos suffer from poor visibility, low contrast, and high noise. Existing approaches typically rely on paired ground truth, which limits their practicality and often fails to maintain temporal consistency. This paper introduces a novel zero-shot learning approach named Zero-TIG, leveraging Retinex theory and optical flow techniques.
arXiv Detail & Related papers (2025-03-14T08:22:26Z)
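Since the Zero-TIG entry above builds on Retinex theory, a minimal sketch of the classical Retinex split may help. The smoothing operator, gamma curve, and parameter values here are assumptions for illustration, not the paper's pipeline.

```python
# Minimal classical Retinex split (illustrative; not the Zero-TIG pipeline).
import numpy as np
from scipy.ndimage import gaussian_filter

def retinex_split(frame: np.ndarray, sigma: float = 15.0, eps: float = 1e-4):
    """frame: (H, W, 3) float image in [0, 1]. Returns (illumination, reflectance)."""
    illum = frame.max(axis=2)                    # rough illumination: max over channels
    illum = gaussian_filter(illum, sigma)        # enforce spatial smoothness
    illum = np.clip(illum, eps, 1.0)[..., None]  # guard against division by zero
    reflectance = np.clip(frame / illum, 0.0, 1.0)
    return illum, reflectance

def enhance(frame: np.ndarray, gamma: float = 0.4) -> np.ndarray:
    """Brighten the illumination (gamma correction) and recombine."""
    illum, refl = retinex_split(frame)
    return np.clip(refl * (illum ** gamma), 0.0, 1.0)
```

In a zero-shot video setting, temporal consistency can then be encouraged by warping the previous frame's result with optical flow and penalizing disagreement with the current result.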
- Rethinking High-speed Image Reconstruction Framework with Spike Camera [48.63]
Spike cameras generate continuous spike streams to capture high-speed scenes with lower bandwidth and higher dynamic range than traditional RGB cameras. We introduce a novel spike-to-image reconstruction framework, SpikeCLIP, that goes beyond traditional training paradigms. Our experiments on real-world low-light datasets demonstrate that SpikeCLIP significantly enhances texture details and the luminance balance of recovered images.
arXiv Detail & Related papers (2025-01-08T13:00:17Z)
- Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement [48.77]
This paper explores learning-based low-light video enhancement without paired ground truth.
Compared to low-light image enhancement, enhancing low-light videos is more difficult due to the intertwined effects of noise, exposure, and contrast in the spatial domain, together with the need for temporal coherence.
We propose the Unrolled Decomposed Unpaired Network (UDU-Net), which enhances low-light videos by unrolling the optimization into a deep network that decomposes the signal into spatial- and temporal-related factors, updated iteratively.
arXiv Detail & Related papers (2024-08-22T11:45:11Z)
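The UDU-Net entry above unrolls an optimization into network stages that alternate spatial and temporal updates. The sketch below shows that general unrolling pattern under assumed operators; the paper's actual factorization, stage count, and update rules differ.

```python
# Schematic unrolled network: each stage mimics one iteration of an
# alternating solver (illustrative; operators and stage count are assumptions).
import torch
import torch.nn as nn

def conv_block(cin: int, cout: int) -> nn.Module:
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1))

class UnrolledEnhancer(nn.Module):
    def __init__(self, stages: int = 4, ch: int = 3):
        super().__init__()
        self.spatial_steps = nn.ModuleList(conv_block(2 * ch, ch) for _ in range(stages))
        self.temporal_steps = nn.ModuleList(conv_block(3 * ch, ch) for _ in range(stages))

    def forward(self, frame, prev_frame, next_frame):
        x = frame
        for s_step, t_step in zip(self.spatial_steps, self.temporal_steps):
            # Spatial update: refine exposure/contrast within the frame.
            x = x + s_step(torch.cat((x, frame), dim=1))
            # Temporal update: refine the estimate using neighboring frames.
            x = x + t_step(torch.cat((x, prev_frame, next_frame), dim=1))
        return x
```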
- Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition [52.89]
Low-Light Video Enhancement (LLVE) seeks to restore dynamic or static scenes plagued by severe invisibility and noise. We present an innovative video decomposition strategy that incorporates view-independent and view-dependent components. Our framework consistently outperforms existing methods, establishing new state-of-the-art performance.
arXiv Detail & Related papers (2024-05-24T15:56:40Z)
- LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models [54.93]
We propose to leverage a pre-trained latent diffusion model to perform neural ISP for enhancing extremely low-light images. Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules. We observe different roles of UNet denoising and decoder reconstruction in the latent diffusion model, which inspires us to decompose the low-light image enhancement task into latent-space low-frequency content generation and decoding-phase high-frequency detail maintenance.
arXiv Detail & Related papers (2023-12-02T04:31:51Z)
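The LDM-ISP entry above rests on splitting enhancement into low-frequency content and high-frequency detail. The toy pixel-space split below (Gaussian low-pass plus residual) illustrates the decomposition only by analogy; the paper generates the low-frequency content in the diffusion model's latent space.

```python
# Toy low/high-frequency split in pixel space (an analogy, not the method).
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_split(img: np.ndarray, sigma: float = 3.0):
    """img: (H, W) or (H, W, C) float array. Returns (low, high), low + high == img."""
    sigmas = (sigma, sigma) + (0,) * (img.ndim - 2)  # do not blur across channels
    low = gaussian_filter(img, sigma=sigmas)
    return low, img - low  # low: global brightness/color; high: edges and texture
```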
- Advancing Unsupervised Low-light Image Enhancement: Noise Estimation, Illumination Interpolation, and Self-Regulation [55.07]
Low-Light Image Enhancement (LLIE) techniques have made notable advancements in preserving image details and enhancing contrast.
These approaches encounter persistent challenges in efficiently mitigating dynamic noise and accommodating diverse low-light scenarios.
We first propose a fast and accurate method for estimating the noise level in low-light images.
We then devise a Learnable Illumination Interpolator (LII) to satisfy general constraints between illumination and input.
arXiv Detail & Related papers (2023-05-17T13:56:48Z)
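For the quick noise-level estimation mentioned in the entry above, a standard classical baseline is Donoho's median-absolute-deviation rule on the finest diagonal wavelet band. The sketch below shows that baseline, not the estimator proposed in the paper.

```python
# Classical MAD-based noise estimate (a baseline, not the paper's method).
import numpy as np

def estimate_noise_sigma(img: np.ndarray) -> float:
    """img: (H, W) grayscale float array in [0, 1]."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even size for 2x2 blocks
    # Finest-scale Haar diagonal (HH) coefficients respond mostly to noise.
    hh = (img[0::2, 0::2] - img[0::2, 1::2]
          - img[1::2, 0::2] + img[1::2, 1::2]) / 2.0
    return float(np.median(np.abs(hh)) / 0.6745)  # MAD -> Gaussian sigma
```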
- Physics Informed Neural Fields for Smoke Reconstruction with Sparse Data [73.90]
High-fidelity reconstruction of fluids from sparse multiview RGB videos remains a formidable challenge.
Existing solutions either assume knowledge of obstacles and lighting, or only focus on simple fluid scenes without obstacles or complex lighting.
We present the first method to reconstruct dynamic fluids by leveraging the governing physics (i.e., the Navier-Stokes equations) in an end-to-end optimization.
arXiv Detail & Related papers (2022-06-14T03:38:08Z)
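The physics-informed entry above couples reconstruction to the Navier-Stokes equations. As a small illustration of how such constraints enter an end-to-end optimization, the fragment below penalizes violations of incompressibility (div u = 0) via autograd; the paper's full physics residuals are considerably richer.

```python
# Toy physics residual: mean squared divergence of a velocity field,
# differentiable for training (illustrative fragment; `velocity_net` is
# a hypothetical network mapping space-time points to velocities).
import torch

def divergence_residual(velocity_net, xyzt: torch.Tensor) -> torch.Tensor:
    """xyzt: (N, 4) space-time sample points; velocity_net: (N, 4) -> (N, 3)."""
    pts = xyzt.clone().requires_grad_(True)
    u = velocity_net(pts)                        # (N, 3) velocities
    div = 0.0
    for i in range(3):                           # accumulate du_i / dx_i
        grad_i = torch.autograd.grad(u[:, i].sum(), pts,
                                     create_graph=True)[0][:, i]
        div = div + grad_i
    return (div ** 2).mean()
```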
- Adaptive Unfolding Total Variation Network for Low-Light Image Enhancement [6.53]
Most existing enhancement algorithms in sRGB space focus only on the low-visibility problem or suppress noise at an assumed noise level.
We propose an adaptive unfolding total variation network (UTVNet) to approximate the noise level from the real sRGB low-light image.
Experiments on real-world low-light images clearly demonstrate the superior performance of UTVNet over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-03T11:22:17Z)
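The UTVNet entry above unfolds a total-variation model with a learned noise estimate. Below is one hand-rolled version of the classical TV-regularized iteration that such networks unfold, with the regularization weight tied to an estimated noise sigma; the coupling constant, step size, and boundary handling are simplifying assumptions.

```python
# Classical TV denoising by gradient descent on ||x - y||^2 / 2 + lam * TV(x)
# (the hand-crafted iteration that unfolding networks replace with learned,
# noise-adaptive modules). Boundary handling is deliberately simple.
import numpy as np

def tv_gradient(x: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Gradient of smoothed isotropic TV for a 2D image."""
    dx = np.diff(x, axis=1, append=x[:, -1:])   # forward differences
    dy = np.diff(x, axis=0, append=x[-1:, :])
    mag = np.sqrt(dx**2 + dy**2 + eps**2)
    px, py = dx / mag, dy / mag
    # Negative divergence of the normalized gradient field (wraps at borders).
    div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
    return -div

def tv_denoise(y: np.ndarray, noise_sigma: float,
               iters: int = 50, step: float = 0.2) -> np.ndarray:
    lam = 2.0 * noise_sigma  # heuristic: stronger smoothing for noisier input
    x = y.copy()
    for _ in range(iters):
        x -= step * ((x - y) + lam * tv_gradient(x))
    return x
```

Unfolding replaces the fixed `lam` and hand-crafted gradient operators with stage-wise learned, noise-adaptive modules.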