CIVQLLIE: Causal Intervention with Vector Quantization for Low-Light Image Enhancement
- URL: http://arxiv.org/abs/2508.03338v1
- Date: Tue, 05 Aug 2025 11:36:39 GMT
- Title: CIVQLLIE: Causal Intervention with Vector Quantization for Low-Light Image Enhancement
- Authors: Tongshun Zhang, Pingping Liu, Zhe Zhang, Qiuzhan Zhou
- Abstract summary: Current low-light image enhancement methods face significant challenges. We propose CIVQLLIE, a novel framework that leverages the power of discrete representation learning through causal reasoning.
- Score: 5.948286668586509
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Images captured in nighttime scenes suffer from severely reduced visibility, hindering effective content perception. Current low-light image enhancement (LLIE) methods face significant challenges: data-driven end-to-end mapping networks lack interpretability or rely on unreliable prior guidance, struggling under extremely dark conditions, while physics-based methods depend on simplified assumptions that often fail in complex real-world scenarios. To address these limitations, we propose CIVQLLIE, a novel framework that leverages the power of discrete representation learning through causal reasoning. We achieve this through Vector Quantization (VQ), which maps continuous image features to a discrete codebook of visual tokens learned from large-scale high-quality images. This codebook serves as a reliable prior, encoding standardized brightness and color patterns that are independent of degradation. However, direct application of VQ to low-light images fails due to distribution shifts between degraded inputs and the learned codebook. Therefore, we propose a multi-level causal intervention approach to systematically correct these shifts. First, during encoding, our Pixel-level Causal Intervention (PCI) module intervenes to align low-level features with the brightness and color distributions expected by the codebook. Second, a Feature-aware Causal Intervention (FCI) mechanism with Low-frequency Selective Attention Gating (LSAG) identifies and enhances channels most affected by illumination degradation, facilitating accurate codebook token matching while enhancing the encoder's generalization performance through flexible feature-level intervention. Finally, during decoding, the High-frequency Detail Reconstruction Module (HDRM) leverages structural information preserved in the matched codebook representations to reconstruct fine details using deformable convolution techniques.
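The codebook lookup described in the abstract, and the reason it breaks on dark inputs, can be illustrated with a toy sketch. This is not the CIVQLLIE implementation: the two-dimensional features, the three-entry codebook, the helper names `squared_distance` and `quantize`, and the use of a global 0.2 scaling as a stand-in for low-light degradation are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous feature to its nearest codebook entry.

    Returns the chosen token indices and the corresponding codebook vectors.
    """
    if not codebook:
        raise ValueError("codebook must contain at least one entry")
    indices: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        # Nearest-neighbour search over the (hypothetical) codebook.
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Hypothetical codebook, imagined as learned from well-lit images.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.9]]

# Features of a well-lit image, and the same features darkened by a
# global factor (an assumed, highly simplified degradation model).
well_lit = [[0.9, 1.1], [0.45, 0.8]]
low_light = [[0.2 * value for value in feature] for feature in well_lit]

idx_clean, _ = quantize(well_lit, codebook)
idx_dark, _ = quantize(low_light, codebook)

print("well-lit tokens: ", idx_clean)
print("low-light tokens:", idx_dark)
```

In this toy setting the darkened features all collapse onto the same dark code instead of the tokens selected for the well-lit features, which is the kind of mismatch the abstract attributes to distribution shift and that the PCI and FCI interventions are said to correct before token matching.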
Related papers
- From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning [65.94580484237737]
Low-light enhancement improves image quality for downstream tasks, but existing methods rely on physical or geometric priors. We build a generalized bridge between low-light enhancement and low-light understanding, which we term Generalized Enhancement For Understanding (GEFU). To address the diverse causes of low-light degradation, we leverage pretrained generative diffusion models to optimize images, achieving zero-shot generalization performance.
arXiv Detail & Related papers (2025-07-11T07:51:26Z)
- GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval [80.96706764868898]
We present a new Low-light Image Enhancement (LLIE) network via Generative LAtent feature based codebook REtrieval (GLARE).
We develop a generative Invertible Latent Normalizing Flow (I-LNF) module to align the LL feature distribution to NL latent representations, guaranteeing the correct code retrieval in the codebook.
Experiments confirm the superior performance of GLARE on various benchmark datasets and real-world data.
arXiv Detail & Related papers (2024-07-17T09:40:15Z)
- Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation [52.82508784748278]
This paper proposes a Controllable Generative Image Compression framework, termed Control-GIC. Control-GIC is capable of fine-grained adaption across a broad spectrum while ensuring high-fidelity and generality compression. Our experiments show that Control-GIC allows highly flexible and controllable adaption where the results demonstrate its superior performance over recent state-of-the-art methods.
arXiv Detail & Related papers (2024-06-02T14:22:09Z)
- HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression [51.04820313355164]
HybridFlow combines the continuous-feature-based and codebook-based streams to achieve both high perceptual quality and high fidelity under extremely low bitrates.
Experimental results demonstrate superior performance across several datasets under extremely low bitrates.
arXiv Detail & Related papers (2024-04-20T13:19:08Z)
- CodeEnhance: A Codebook-Driven Approach for Low-Light Image Enhancement [97.95330185793358]
Low-light image enhancement (LLIE) aims to improve low-illumination images. Existing methods face two challenges: uncertainty in restoration from diverse brightness degradations and loss of texture and color information. We propose a novel enhancement approach, CodeEnhance, by leveraging quantized priors and image refinement.
arXiv Detail & Related papers (2024-04-08T07:34:39Z)
- VQCNIR: Clearer Night Image Restoration with Vector-Quantized Codebook [16.20461368096512]
Night photography often struggles with challenges like low light and blurring, stemming from dark environments and prolonged exposures.
We believe in the strength of data-driven high-quality priors and strive to offer a reliable and consistent prior, circumventing the restrictions of manual priors.
We propose Clearer Night Image Restoration with Vector-Quantized Codebook (VQCNIR) to achieve remarkable and consistent restoration outcomes on real-world and synthetic benchmarks.
arXiv Detail & Related papers (2023-12-14T02:16:27Z)
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer [94.48731935629066]
Blind face restoration is a highly ill-posed problem that often requires auxiliary guidance.
We show that a learned discrete codebook prior in a small proxy space casts blind face restoration as a code prediction task.
We propose a Transformer-based prediction network, named CodeFormer, to model global composition and context of the low-quality faces.
arXiv Detail & Related papers (2022-06-22T17:58:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.