Related papers: CLIP-Guided Unsupervised Semantic-Aware Exposure Correction

URL: http://arxiv.org/abs/2601.19129v2
Date: Wed, 28 Jan 2026 11:08:04 GMT
Title: CLIP-Guided Unsupervised Semantic-Aware Exposure Correction
Authors: Puzhen Wu, Han Weng, Quan Zheng, Yi Zhan, Hewei Wang, Yiming Li, Jiahui Han, Rui Xu
Abstract summary: A new unsupervised semantic-aware exposure correction network is proposed. It fuses semantic information extracted from a pre-trained Fast Segment Anything Model into a shared image feature space. A pseudo-ground truth generator guided by CLIP is fine-tuned to automatically identify exposure situations and instruct the tailored corrections.
Score: 13.05173129182012
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Improper exposure often leads to severe loss of details, color distortion, and reduced contrast. Exposure correction still faces two critical challenges: (1) ignoring object-wise regional semantic information causes color shift artifacts; (2) real-world exposure images generally have no ground-truth labels, and labeling them entails massive manual editing. To tackle these challenges, we propose a new unsupervised semantic-aware exposure correction network. It contains an adaptive semantic-aware fusion module, which effectively fuses the semantic information extracted from a pre-trained Fast Segment Anything Model into a shared image feature space. The fused features are then used by our multi-scale residual spatial mamba group to restore the details and adjust the exposure. To avoid manual editing, we propose a pseudo-ground truth generator guided by CLIP, which is fine-tuned to automatically identify exposure situations and instruct the tailored corrections. We also leverage the rich priors from FastSAM and CLIP to develop a semantic-prompt consistency loss that enforces semantic consistency and image-prompt alignment for unsupervised training. Comprehensive experimental results illustrate the effectiveness of our method in correcting real-world exposure images and show that it outperforms state-of-the-art unsupervised methods both numerically and visually.
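The abstract describes a CLIP-guided generator that identifies the exposure situation of an image. The general idea behind such image-prompt matching can be sketched as below. This is only an illustration, not the authors' implementation: the prompt wording, the three-dimensional toy embeddings, the temperature value, and the helper names `cosine`, `softmax`, and `classify_exposure` are all assumptions made for this example. In practice the embeddings would come from CLIP's image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_exposure(image_emb, prompt_embs, temperature=0.01):
    """Pick the prompt whose embedding best matches the image embedding.

    prompt_embs maps a text prompt to its (hypothetical) embedding.
    temperature is an assumed scaling value chosen for illustration.
    """
    labels = list(prompt_embs)
    logits = [cosine(image_emb, prompt_embs[l]) / temperature for l in labels]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))

# Toy embeddings standing in for real encoder outputs (assumption).
PROMPTS = {
    "an underexposed photo": [1.0, 0.0, 0.0],
    "an overexposed photo": [0.0, 1.0, 0.0],
    "a well-exposed photo": [0.0, 0.0, 1.0],
}

label, probs = classify_exposure([0.9, 0.1, 0.2], PROMPTS)
print(label)
```

A predicted label of this kind could then select which correction to apply when building a pseudo-ground truth, although how the paper maps exposure situations to corrections is not stated in the abstract.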

Related papers

LoopExpose: An Unsupervised Framework for Arbitrary-Length Exposure Correction [43.00059667275665]
We propose a pseudo label-based unsupervised method called LoopExpose for arbitrary-length exposure correction. A nested loop optimization strategy is proposed to address the exposure correction problem. Experiments on different benchmark datasets demonstrate that LoopExpose achieves superior exposure correction and fusion performance.
arXiv Detail & Related papers (2025-11-08T16:36:52Z)
Leveraging Hierarchical Image-Text Misalignment for Universal Fake Image Detection [58.927873049646024]
We show that fake images cannot be properly aligned with corresponding captions compared to real images. We propose a simple yet effective ITEM by leveraging the image-text misalignment in a joint visual-language space as discriminative clues.
arXiv Detail & Related papers (2025-11-01T06:51:14Z)
WEC-DG: Multi-Exposure Wavelet Correction Method Guided by Degradation Description [7.873244458995218]
Multi-exposure correction technology is essential for restoring images affected by insufficient or excessive lighting. Current multi-exposure correction methods often encounter challenges in addressing intra-class variability caused by diverse lighting conditions. This paper proposes a Wavelet-based Exposure Correction method with Degradation Guidance (WEC-DG).
arXiv Detail & Related papers (2025-08-13T07:31:44Z)
Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference. It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps. Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z)
Pseudo-Label Guided Real-World Image De-weathering: A Learning Framework with Imperfect Supervision [57.5699142476311]
We propose a unified solution for real-world image de-weathering with non-ideal supervision. Our method exhibits significant advantages when trained on imperfectly aligned de-weathering datasets.
arXiv Detail & Related papers (2025-04-14T07:24:03Z)
UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint [87.20985852686785]
We propose an unsupervised instruction-based image editing approach that removes the need for ground-truth edited images during training. Our approach addresses these challenges by introducing a novel editing mechanism called Edit Reversibility Constraint (ERC), which applies forward and reverse edits in one training step. This allows us to bypass the need for ground-truth edited images and unlock training for the first time on datasets comprising either real image-caption pairs or image-caption-instruction triplets.
arXiv Detail & Related papers (2024-12-19T18:59:58Z)
Learning Camouflaged Object Detection from Noisy Pseudo Label [60.9005578956798]
This paper introduces the first weakly semi-supervised Camouflaged Object Detection (COD) method. It aims for budget-efficient and high-precision camouflaged object segmentation with an extremely limited number of fully labeled images. We propose a noise correction loss that facilitates the model's learning of correct pixels in the early learning stage. When using only 20% of fully labeled data, our method shows superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T04:53:51Z)
Region-Aware Exposure Consistency Network for Mixed Exposure Correction [26.30138794484646]
We introduce an effective Region-aware Exposure Correction Network (RECNet) that can handle mixed exposure. We develop a region-aware de-exposure module that effectively translates regional features of mixed exposure scenarios into an exposure-invariant feature space. We propose an exposure contrastive regularization strategy under the constraints of intra-regional exposure consistency and inter-regional exposure continuity.
arXiv Detail & Related papers (2024-02-28T10:24:36Z)
Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction [65.5397271106534]
It is difficult for a single neural network to handle all exposure problems. In particular, convolutions hinder the ability to restore faithful color or details in extremely over-/under-exposed regions. We propose a Macro-Micro-Hierarchical transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction.
arXiv Detail & Related papers (2023-09-02T09:07:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.