ISALux: Illumination and Segmentation Aware Transformer Employing Mixture of Experts for Low Light Image Enhancement
- URL: http://arxiv.org/abs/2508.17885v1
- Date: Mon, 25 Aug 2025 10:47:18 GMT
- Title: ISALux: Illumination and Segmentation Aware Transformer Employing Mixture of Experts for Low Light Image Enhancement
- Authors: Raul Balmez, Alexandru Brateanu, Ciprian Orhei, Codruta Ancuti, Cosmin Ancuti
- Abstract summary: ISALux is a transformer-based approach for Low-Light Image Enhancement (LLIE). HISA-MSA integrates illumination and semantic segmentation maps for feature extraction. An MoE-based Feed-Forward Network (FFN) enhances contextual learning.
- Score: 39.24835095169737
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce ISALux, a novel transformer-based approach for Low-Light Image Enhancement (LLIE) that seamlessly integrates illumination and semantic priors. Our architecture includes an original self-attention block, Hybrid Illumination and Semantics-Aware Multi-Headed Self-Attention (HISA-MSA), which integrates illumination and semantic segmentation maps for enhanced feature extraction. ISALux employs two self-attention modules to independently process illumination and semantic features, selectively enriching each other to regulate luminance and highlight structural variations in real-world scenarios. A Mixture of Experts (MoE)-based Feed-Forward Network (FFN) enhances contextual learning, with a gating mechanism conditionally activating the top K experts for specialized processing. To address overfitting in LLIE methods caused by distinct light patterns in benchmarking datasets, we enhance the HISA-MSA module with low-rank matrix adaptations (LoRA). Extensive qualitative and quantitative evaluations across multiple specialized datasets demonstrate that ISALux is competitive with state-of-the-art (SOTA) methods. Additionally, an ablation study highlights the contribution of each component in the proposed model. Code will be released upon publication.
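Since the authors state that code will only be released upon publication, the PyTorch sketch below merely illustrates the kind of top-K expert routing the abstract describes for the MoE-based FFN. The hidden sizes, the number of experts, and the value of K are assumptions made for the example, not values from the paper.

```python
# Minimal, illustrative sketch of a top-K gated Mixture-of-Experts feed-forward
# block, loosely following the description in the ISALux abstract. Layer sizes,
# the number of experts, and K are illustrative assumptions, not paper values.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    """Feed-forward network whose gate routes each token to its top-K experts."""

    def __init__(self, dim: int = 64, hidden_dim: int = 256,
                 num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small two-layer MLP, as is common in MoE transformers.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(),
                          nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        ])
        # Gating network scores every expert for every token.
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        scores = self.gate(x)                                   # (B, N, E)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)     # keep only the top-K experts
        weights = F.softmax(top_vals, dim=-1)                   # renormalise gate weights over K
        # Dense evaluation of all experts for clarity; real MoE layers dispatch sparsely.
        expert_out = torch.stack([e(x) for e in self.experts], dim=-2)    # (B, N, E, dim)
        index = top_idx.unsqueeze(-1).expand(*top_idx.shape, x.size(-1))  # (B, N, K, dim)
        gathered = torch.gather(expert_out, dim=-2, index=index)          # (B, N, K, dim)
        return (weights.unsqueeze(-1) * gathered).sum(dim=-2)             # (B, N, dim)


if __name__ == "__main__":
    block = MoEFeedForward()
    tokens = torch.randn(2, 16, 64)   # 16 tokens of dimension 64
    print(block(tokens).shape)        # torch.Size([2, 16, 64])
```

The experts are evaluated densely here only to keep the sketch short; practical MoE layers dispatch each token exclusively to its selected experts for efficiency.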
Related papers
- Adaptive Global and Fine-Grained Perceptual Fusion for MLLM Embeddings Compatible with Hard Negative Amplification [49.109117617514066]
Multimodal embeddings serve as a bridge for aligning vision and language. We propose Adaptive Global and Fine-grained perceptual Fusion for MLLM Embeddings. AGFF-Embed comprehensively achieves state-of-the-art performance in both general and fine-grained understanding.
arXiv Detail & Related papers (2026-02-05T14:52:35Z) - Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement [41.66776033752888]
Most low-light image enhancement methods rely on pre-trained model priors, low-light inputs, or both. We propose VLM-IMI, a novel framework that leverages large vision-language models with iterative and manual instructions. VLM-IMI incorporates textual descriptions of the desired normal-light content as enhancement cues, enabling semantically informed restoration.
arXiv Detail & Related papers (2025-07-24T03:35:20Z) - SAIGFormer: A Spatially-Adaptive Illumination-Guided Network for Low-Light Image Enhancement [58.79901582809091]
Recent Transformer-based low-light enhancement methods have made promising progress in recovering global illumination. We present a Spatially-Adaptive Illumination-Guided Transformer framework that enables accurate illumination restoration.
arXiv Detail & Related papers (2025-07-21T11:38:56Z) - Low-Light Enhancement via Encoder-Decoder Network with Illumination Guidance [0.0]
This paper introduces a novel deep learning framework for low-light image enhancement, named the Encoder-Decoder Network with Illumination Guidance (EDNIG). EDNIG integrates an illumination map, derived from the Bright Channel Prior (BCP), as a guidance input. It is optimized within a Generative Adversarial Network (GAN) framework using a composite loss function that combines adversarial loss, pixel-wise mean squared error (MSE), and perceptual loss (a generic sketch of such a composite loss appears after this list).
arXiv Detail & Related papers (2025-07-04T09:35:00Z) - Graft: Integrating the Domain Knowledge via Efficient Parameter Synergy for MLLMs [56.76586846269894]
Multimodal Large Language Models (MLLMs) have achieved success across various domains. Despite its importance, the study of knowledge sharing among domain-specific MLLMs remains largely underexplored. We propose a unified parameter integration framework that enables modular composition of expert capabilities.
arXiv Detail & Related papers (2025-06-30T15:07:41Z) - ALEN: A Dual-Approach for Uniform and Non-Uniform Low-Light Image Enhancement [10.957431540794836]
Inadequate illumination can lead to significant information loss and poor image quality, impacting various applications such as surveillance. Current enhancement techniques often use specific datasets to enhance low-light images, but still present challenges when adapting to diverse real-world conditions. The Adaptive Light Enhancement Network (ALEN) is introduced, whose main approach is the use of a classification mechanism to determine whether local or global illumination enhancement is required.
arXiv Detail & Related papers (2024-07-29T05:19:23Z) - Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation [2.491548070992611]
A novel multi-modal fusion approach called CSK-Net is proposed.
It uses a contrastive learning-based spectral knowledge distillation technique.
Experiments show that CSK-Net surpasses state-of-the-art models in multi-modal tasks and for missing modalities.
arXiv Detail & Related papers (2023-12-04T10:27:09Z) - Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network [52.77569396659629]
This paper presents the deep compensation unfolding network (DCUNet) for restoring light field (LF) images captured under low-light conditions.
The framework uses the intermediate enhanced result to estimate the illumination map, which is then employed in the unfolding process to produce a new enhanced result.
To properly leverage the unique characteristics of LF images, this paper proposes a pseudo-explicit feature interaction module.
arXiv Detail & Related papers (2023-08-10T07:53:06Z) - Learning Semantic-Aware Knowledge Guidance for Low-Light Image
Enhancement [69.47143451986067]
Low-light image enhancement (LLIE) investigates how to improve illumination and produce normal-light images.
The majority of existing methods improve low-light images in a global and uniform manner, without taking into account the semantic information of different regions.
We propose a novel semantic-aware knowledge-guided framework that can assist a low-light enhancement model in learning rich and diverse priors encapsulated in a semantic segmentation model.
arXiv Detail & Related papers (2023-04-14T10:22:28Z) - Adaptive Multiscale Illumination-Invariant Feature Representation for Undersampled Face Recognition [29.002873450422083]
This paper presents an illumination-invariant feature representation approach used to eliminate the effect of varying illumination in undersampled face recognition.
A new illumination level classification technique based on Singular Value Decomposition (SVD) is proposed to judge the illumination level of the input image.
The experimental results demonstrate that the JLEF-feature and AJLEF-face outperform other related approaches for undersampled face recognition under varying illumination.
arXiv Detail & Related papers (2020-04-07T06:48:44Z)
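The EDNIG entry above mentions a composite GAN objective combining adversarial, pixel-wise MSE, and perceptual terms; the minimal sketch below shows one common way such a loss is assembled. The loss weights, the frozen feature extractor used for the perceptual term, and the BCE-with-logits adversarial formulation are illustrative assumptions, not details taken from the EDNIG paper.

```python
# Generic sketch of a composite enhancement loss: adversarial + pixel-wise MSE
# + perceptual. Weights and the perceptual backbone are assumed for illustration.
import torch
import torch.nn as nn


class CompositeEnhancementLoss(nn.Module):
    def __init__(self, perceptual_net: nn.Module,
                 w_adv: float = 1e-3, w_mse: float = 1.0, w_perc: float = 0.1):
        super().__init__()
        # Frozen feature extractor for the perceptual term (e.g. a pretrained VGG slice).
        self.perceptual_net = perceptual_net.eval()
        for p in self.perceptual_net.parameters():
            p.requires_grad_(False)
        self.w_adv, self.w_mse, self.w_perc = w_adv, w_mse, w_perc
        self.bce = nn.BCEWithLogitsLoss()
        self.mse = nn.MSELoss()

    def forward(self, enhanced: torch.Tensor, target: torch.Tensor,
                disc_logits_on_enhanced: torch.Tensor) -> torch.Tensor:
        # Adversarial term: the generator wants the discriminator to predict "real".
        adv = self.bce(disc_logits_on_enhanced,
                       torch.ones_like(disc_logits_on_enhanced))
        # Pixel-wise fidelity term between enhanced output and normal-light target.
        pix = self.mse(enhanced, target)
        # Perceptual term: distance between deep features of output and target.
        perc = self.mse(self.perceptual_net(enhanced), self.perceptual_net(target))
        return self.w_adv * adv + self.w_mse * pix + self.w_perc * perc


if __name__ == "__main__":
    # Stand-in feature extractor for a quick check; real setups use a pretrained network.
    dummy_features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
    criterion = CompositeEnhancementLoss(dummy_features)
    enhanced, target = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    disc_logits = torch.randn(1, 1)
    print(criterion(enhanced, target, disc_logits).item())
```

In practice the perceptual feature extractor is typically a pretrained VGG slice and the three weights are tuned per dataset.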