Low-Light Video Enhancement via Spatial-Temporal Consistent Illumination and Reflection Decomposition
- URL: http://arxiv.org/abs/2405.15660v1
- Date: Fri, 24 May 2024 15:56:40 GMT
- Title: Low-Light Video Enhancement via Spatial-Temporal Consistent Illumination and Reflection Decomposition
- Authors: Xiaogang Xu, Kun Zhou, Tao Hu, Ruixing Wang, Hujun Bao
- Abstract summary: Low-Light Video Enhancement (LLVE) seeks to restore dynamic and static scenes plagued by severe invisibility and noise.
One critical aspect is formulating a consistency constraint specifically for the temporal-spatial illumination and appearance of the enhanced results.
We present an innovative video Retinex-based decomposition strategy that operates without the need for explicit supervision.
- Score: 68.6707284662443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-Light Video Enhancement (LLVE) seeks to restore dynamic and static scenes plagued by severe invisibility and noise. One critical aspect is formulating a consistency constraint specifically for the temporal-spatial illumination and appearance of the enhanced results, a dimension overlooked in existing methods. In this paper, we present an innovative video Retinex-based decomposition strategy that operates without the need for explicit supervision to delineate illumination and reflectance components. We leverage dynamic cross-frame correspondences for intrinsic appearance and enforce a scene-level continuity constraint on the illumination field to yield satisfactorily consistent decomposition results. To further ensure consistent decomposition, we introduce a dual-structure enhancement network featuring a novel cross-frame interaction mechanism. This mechanism can seamlessly integrate with encoder-decoder single-frame networks, incurring minimal additional parameter costs. By supervising different frames simultaneously, this network encourages them to exhibit matching decomposition features, thus achieving the desired temporal propagation. Extensive experiments are conducted on widely recognized LLVE benchmarks, covering diverse scenarios. Our framework consistently outperforms existing methods, establishing a new state-of-the-art (SOTA) performance.
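The abstract describes a Retinex-style split of each frame into illumination and reflectance, plus temporal-spatial consistency terms. The toy numpy sketch below only illustrates that general idea; the max-channel illumination proxy, the L1 penalties, and the externally supplied optical flow are assumptions for illustration, not the authors' method or code.

```python
# Toy illustration (not the authors' implementation) of a Retinex-style
# split I = L * R and the two kinds of consistency sketched in the abstract:
# cross-frame agreement of reflectance and spatial smoothness of illumination.
import numpy as np

def retinex_split(frame):
    """Split an HxWx3 frame in [0,1] into illumination L and reflectance R."""
    L = frame.max(axis=2, keepdims=True) + 1e-6   # crude illumination proxy
    R = frame / L                                  # reflectance, so frame ~= L * R
    return L, R

def warp(image, flow):
    """Backward-warp an image with integer-rounded optical flow (toy version)."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return image[src_y, src_x]

def consistency_losses(frame_t, frame_t1, flow_t_to_t1):
    L_t, R_t = retinex_split(frame_t)
    _, R_t1 = retinex_split(frame_t1)
    # Reflectance (intrinsic appearance) should agree across corresponding pixels.
    appearance = np.abs(warp(R_t1, flow_t_to_t1) - R_t).mean()
    # Illumination should vary smoothly over the scene (spatial continuity).
    smooth = np.abs(np.diff(L_t, axis=0)).mean() + np.abs(np.diff(L_t, axis=1)).mean()
    return appearance, smooth

if __name__ == "__main__":
    f0 = np.random.rand(64, 64, 3) * 0.2           # synthetic dark frame
    f1 = np.roll(f0, 1, axis=1)                    # fake 1-pixel rightward motion
    flow = np.zeros((64, 64, 2)); flow[..., 0] = 1
    print(consistency_losses(f0, f1, flow))
```

In the paper itself, the decomposition and both constraints are learned by the dual-structure enhancement network rather than computed in closed form as above.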
Related papers
- Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution [65.91317390645163]
Upscale-A-Video is a text-guided latent diffusion framework for video upscaling.
It ensures temporal coherence within short sequences by integrating temporal layers into the U-Net and VAE-Decoder.
It also offers greater flexibility by allowing text prompts to guide texture creation and adjustable noise levels to balance restoration and generation.
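The "temporal layers" above are not detailed in this summary; as a generic illustration only (the attention form, tensor layout, and hyperparameters are assumptions, not the Upscale-A-Video architecture), such a layer is often self-attention applied along the frame axis of a latent feature map:

```python
# Generic sketch of a temporal layer: self-attention over the frame axis of a
# (batch, frames, channels, height, width) feature tensor. Illustrative only.
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape
        # Treat each spatial location as its own length-T sequence of features.
        seq = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)
        q = self.norm(seq)
        attn_out, _ = self.attn(q, q, q)
        out = seq + attn_out                       # residual connection
        return out.reshape(b, h, w, t, c).permute(0, 3, 4, 1, 2)

if __name__ == "__main__":
    layer = TemporalAttention(channels=32)
    feats = torch.randn(1, 5, 32, 16, 16)          # 5 consecutive frames
    print(layer(feats).shape)                      # torch.Size([1, 5, 32, 16, 16])
```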
arXiv Detail & Related papers (2023-12-11T18:54:52Z)
- Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model [59.08821399652483]
Illumination degradation image restoration (IDIR) techniques aim to improve the visibility of degraded images and mitigate the adverse effects of deteriorated illumination.
Among these algorithms, diffusion model (DM)-based methods have shown promising performance but are often burdened by heavy computational demands and pixel misalignment issues when predicting the image-level distribution.
We propose to leverage DM within a compact latent space to generate concise guidance priors and introduce a novel solution called Reti-Diff for the IDIR task.
Reti-Diff comprises two key components: the Retinex-based latent DM (RLDM) and the Retinex-guided transformer.
arXiv Detail & Related papers (2023-11-20T09:55:06Z)
- Cross-Consistent Deep Unfolding Network for Adaptive All-In-One Video Restoration [78.14941737723501]
We propose a Cross-consistent Deep Unfolding Network (CDUN) for All-In-One video restoration (VR).
By orchestrating two cascading procedures, CDUN achieves adaptive processing for diverse degradations.
In addition, we introduce a window-based inter-frame fusion strategy to utilize information from more adjacent frames.
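As a rough, generic sketch of what a window-based inter-frame fusion step can look like (the window size, the similarity weighting, and the assumption that features are already aligned are illustrative choices, not CDUN's actual design):

```python
# Illustrative sketch: each output frame's features are a softmax-weighted
# average over a temporal window of neighboring frames. Not the CDUN design.
import torch
import torch.nn.functional as F

def window_fusion(feats: torch.Tensor, radius: int = 2) -> torch.Tensor:
    """feats: (T, C, H, W) per-frame features, assumed already aligned."""
    t = feats.shape[0]
    fused = []
    for i in range(t):
        lo, hi = max(0, i - radius), min(t, i + radius + 1)
        window = feats[lo:hi]                                  # (k, C, H, W)
        # Weight neighbors by feature similarity to the center frame.
        sim = (window * feats[i]).flatten(1).sum(dim=1)        # (k,)
        w = F.softmax(sim / window[0].numel(), dim=0).view(-1, 1, 1, 1)
        fused.append((w * window).sum(dim=0))
    return torch.stack(fused)

if __name__ == "__main__":
    frames = torch.randn(7, 16, 32, 32)
    print(window_fusion(frames).shape)             # torch.Size([7, 16, 32, 32])
```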
arXiv Detail & Related papers (2023-09-04T14:18:00Z)
- Temporal Consistency Learning of inter-frames for Video Super-Resolution [38.26035126565062]
Video super-resolution (VSR) is a task that aims to reconstruct high-resolution (HR) frames from the low-resolution (LR) reference frame and multiple neighboring frames.
Existing methods generally explore information propagation and frame alignment to improve the performance of VSR.
We propose a Temporal Consistency learning Network (TCNet) for VSR in an end-to-end manner, to enhance the consistency of the reconstructed videos.
arXiv Detail & Related papers (2022-11-03T08:23:57Z)
- IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis [90.03590032170169]
We present intrinsic neural radiance fields, dubbed IntrinsicNeRF, which introduce intrinsic decomposition into the NeRF-based neural rendering method.
Our experiments and editing samples on both object-specific/room-scale scenes and synthetic/real-world data demonstrate that we can obtain consistent intrinsic decomposition results.
arXiv Detail & Related papers (2022-10-02T22:45:11Z)
- TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z)
- Coarse to Fine Multi-Resolution Temporal Convolutional Network [25.08516972520265]
We propose a novel temporal encoder-decoder to tackle the problem of sequence fragmentation.
The decoder follows a coarse-to-fine structure with an implicit ensemble of multiple temporal resolutions.
Experiments show that our stand-alone architecture, together with our novel feature-augmentation strategy and new loss, outperforms the state-of-the-art on three temporal video segmentation benchmarks.
arXiv Detail & Related papers (2021-05-23T06:07:40Z)
- Neural Video Portrait Relighting in Real-time via Consistency Modeling [41.04622998356025]
We propose a neural approach for real-time, high-quality and coherent video portrait relighting.
We propose a hybrid structure and lighting disentanglement in an encoder-decoder architecture.
We also propose a lighting sampling strategy to model the illumination consistency and mutation for natural portrait light manipulation in real-world settings.
arXiv Detail & Related papers (2021-04-01T14:13:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.