CLIP-Guided Unsupervised Semantic-Aware Exposure Correction
- URL: http://arxiv.org/abs/2601.19129v2
- Date: Wed, 28 Jan 2026 11:08:04 GMT
- Title: CLIP-Guided Unsupervised Semantic-Aware Exposure Correction
- Authors: Puzhen Wu, Han Weng, Quan Zheng, Yi Zhan, Hewei Wang, Yiming Li, Jiahui Han, Rui Xu,
- Abstract summary: A new unsupervised semantic-aware exposure correction network is proposed. It fuses semantic information extracted from a pre-trained Fast Segment Anything Model into a shared image feature space. A pseudo-ground truth generator guided by CLIP is fine-tuned to automatically identify exposure situations and instruct the tailored corrections.
- Score: 13.05173129182012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improper exposure often leads to severe loss of details, color distortion, and reduced contrast. Exposure correction still faces two critical challenges: (1) ignoring object-wise regional semantic information causes color shift artifacts; (2) real-world exposure images generally have no ground-truth labels, and labeling them entails massive manual editing. To tackle these challenges, we propose a new unsupervised semantic-aware exposure correction network. It contains an adaptive semantic-aware fusion module, which effectively fuses the semantic information extracted from a pre-trained Fast Segment Anything Model (FastSAM) into a shared image feature space. The fused features are then used by our multi-scale residual spatial mamba group to restore the details and adjust the exposure. To avoid manual editing, we propose a pseudo-ground truth generator guided by CLIP, which is fine-tuned to automatically identify exposure situations and instruct the tailored corrections. We also leverage the rich priors from FastSAM and CLIP to develop a semantic-prompt consistency loss that enforces semantic consistency and image-prompt alignment for unsupervised training. Comprehensive experimental results demonstrate the effectiveness of our method in correcting real-world exposure images and show that it outperforms state-of-the-art unsupervised methods both numerically and visually.
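The abstract's idea of letting CLIP "identify exposure situations" can be illustrated with a generic CLIP-style zero-shot prompt classifier. The sketch below is not the authors' generator: the prompt strings, the toy 3-dimensional embeddings, the `temperature` value, and the function names `cosine` and `classify_exposure` are all assumptions made for illustration. In practice the embeddings would come from a CLIP image encoder and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_exposure(image_emb, prompt_embs, temperature=0.01):
    """Pick the prompt whose embedding is most similar to the image embedding.

    image_emb   -- image embedding (assumed to come from a CLIP-like encoder)
    prompt_embs -- dict mapping a prompt string to its text embedding
    temperature -- assumed softmax temperature; smaller gives sharper probabilities
    """
    labels = list(prompt_embs)
    logits = [cosine(image_emb, prompt_embs[k]) / temperature for k in labels]
    # Numerically stable softmax over the scaled similarities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = {k: e / total for k, e in zip(labels, exps)}
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical toy embeddings standing in for real CLIP text features.
PROMPT_EMBS = {
    "an underexposed photo": [1.0, 0.0, 0.0],
    "an overexposed photo": [0.0, 1.0, 0.0],
    "a well-exposed photo": [0.0, 0.0, 1.0],
}

# Hypothetical image embedding that lies closest to the "underexposed" prompt.
label, probs = classify_exposure([0.9, 0.1, 0.2], PROMPT_EMBS)
print(label)
```

The predicted label could then select which correction to apply when building a pseudo-ground truth (for example, brightening for an underexposed input). How the paper actually maps the CLIP output to corrections is not described in this listing.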
Related papers
- LoopExpose: An Unsupervised Framework for Arbitrary-Length Exposure Correction [43.00059667275665]
We propose a pseudo label-based unsupervised method called LoopExpose for arbitrary-length exposure correction. A nested loop optimization strategy is proposed to address the exposure correction problem. Experiments on different benchmark datasets demonstrate that LoopExpose achieves superior exposure correction and fusion performance.
arXiv Detail & Related papers (2025-11-08T16:36:52Z)
- Leveraging Hierarchical Image-Text Misalignment for Universal Fake Image Detection [58.927873049646024]
We show that fake images cannot be properly aligned with corresponding captions compared to real images. We propose a simple yet effective ITEM by leveraging the image-text misalignment in a joint visual-language space as discriminative clues.
arXiv Detail & Related papers (2025-11-01T06:51:14Z)
- WEC-DG: Multi-Exposure Wavelet Correction Method Guided by Degradation Description [7.873244458995218]
Multi-exposure correction technology is essential for restoring images affected by insufficient or excessive lighting. Current multi-exposure correction methods often encounter challenges in addressing intra-class variability caused by diverse lighting conditions. This paper proposes a Wavelet-based Exposure Correction method with Degradation Guidance (WEC-DG).
arXiv Detail & Related papers (2025-08-13T07:31:44Z)
- Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference. It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps. Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z)
- Pseudo-Label Guided Real-World Image De-weathering: A Learning Framework with Imperfect Supervision [57.5699142476311]
We propose a unified solution for real-world image de-weathering with non-ideal supervision. Our method exhibits significant advantages when trained on imperfectly aligned de-weathering datasets.
arXiv Detail & Related papers (2025-04-14T07:24:03Z)
- UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint [87.20985852686785]
We propose an unsupervised instruction-based image editing approach that removes the need for ground-truth edited images during training. Our approach addresses these challenges by introducing a novel editing mechanism called Edit Reversibility Constraint (ERC), which applies forward and reverse edits in one training step. This allows us to bypass the need for ground-truth edited images and unlock training for the first time on datasets comprising either real image-caption pairs or image-caption-instruction triplets.
arXiv Detail & Related papers (2024-12-19T18:59:58Z)
- Learning Camouflaged Object Detection from Noisy Pseudo Label [60.9005578956798]
This paper introduces the first weakly semi-supervised Camouflaged Object Detection (COD) method.
It aims for budget-efficient and high-precision camouflaged object segmentation with an extremely limited number of fully labeled images.
We propose a noise correction loss that facilitates the model's learning of correct pixels in the early learning stage.
When using only 20% of fully labeled data, our method shows superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T04:53:51Z)
- Region-Aware Exposure Consistency Network for Mixed Exposure Correction [26.30138794484646]
We introduce an effective Region-aware Exposure Correction Network (RECNet) that can handle mixed exposure.
We develop a region-aware de-exposure module that effectively translates regional features of mixed exposure scenarios into an exposure-invariant feature space.
We propose an exposure contrastive regularization strategy under the constraints of intra-regional exposure consistency and inter-regional exposure continuity.
arXiv Detail & Related papers (2024-02-28T10:24:36Z)
- Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction [65.5397271106534]
It is difficult for a single neural network to handle all exposure problems.
In particular, convolutions hinder the ability to restore faithful color or details in extremely over-/under-exposed regions.
We propose a Macro-Micro-Hierarchical transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction.
arXiv Detail & Related papers (2023-09-02T09:07:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.