CLIP-Optimized Multimodal Image Enhancement via ISP-CNN Fusion for Coal Mine IoVT under Uneven Illumination
- URL: http://arxiv.org/abs/2502.19450v1
- Date: Wed, 26 Feb 2025 05:09:40 GMT
- Title: CLIP-Optimized Multimodal Image Enhancement via ISP-CNN Fusion for Coal Mine IoVT under Uneven Illumination
- Authors: Shuai Wang, Shihao Zhang, Jiaqi Wu, Zijian Tian, Wei Chen, Tongzhu Jin, Miaomiao Xue, Zehua Wang, Fei Richard Yu, Victor C. M. Leung
- Abstract summary: Low illumination and uneven brightness in underground environments significantly degrade image quality.
We propose a multimodal image enhancement method tailored for coal mine IoVT, utilizing an ISP-CNN fusion architecture optimized for uneven illumination.
- Score: 40.70282870053005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clear monitoring images are crucial for the safe operation of coal mine Internet of Video Things (IoVT) systems. However, low illumination and uneven brightness in underground environments significantly degrade image quality, posing challenges for enhancement methods that often rely on difficult-to-obtain paired reference images. Additionally, there is a trade-off between enhancement performance and computational efficiency on edge devices within IoVT systems. To address these issues, we propose a multimodal image enhancement method tailored for coal mine IoVT, utilizing an ISP-CNN fusion architecture optimized for uneven illumination. This two-stage strategy combines global enhancement with detail optimization, effectively improving image quality, especially in poorly lit areas. A CLIP-based multimodal iterative optimization allows for unsupervised training of the enhancement algorithm. By integrating traditional image signal processing (ISP) with convolutional neural networks (CNN), our approach reduces computational complexity while maintaining high performance, making it suitable for real-time deployment on edge devices. Experimental results demonstrate that our method effectively mitigates uneven brightness and enhances key image quality metrics, with improvements of 2.9%-4.9% in PSNR, 4.3%-11.4% in SSIM, and 4.9%-17.8% in VIF compared to seven state-of-the-art algorithms. Simulated coal mine monitoring scenarios validate our method's ability to balance performance and computational demands, facilitating real-time enhancement and supporting safer mining operations.
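The two-stage strategy described above can be illustrated with a minimal sketch, assuming a simple gamma curve as the global ISP stage and unsharp masking as a stand-in for the lightweight CNN detail stage. Neither is the paper's actual pipeline; all function names and parameters here are hypothetical.

```python
import numpy as np

def global_isp_stage(img, gamma=0.45):
    """Stage 1: global brightness correction via a gamma curve,
    a classic ISP operation that lifts poorly lit regions."""
    img = np.clip(img, 0.0, 1.0)
    return np.power(img, gamma)

def detail_stage(img, strength=0.5, ksize=3):
    """Stage 2 stand-in: unsharp masking to sharpen local detail.
    (In the paper this role is played by a lightweight CNN.)"""
    pad = ksize // 2
    padded = np.pad(img, pad, mode="edge")
    # simple box blur over a ksize x ksize window
    blurred = np.zeros_like(img)
    for dy in range(ksize):
        for dx in range(ksize):
            blurred += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    blurred /= ksize * ksize
    return np.clip(img + strength * (img - blurred), 0.0, 1.0)

def enhance(img):
    """Two-stage enhancement: global correction, then detail refinement."""
    return detail_stage(global_isp_stage(img))
```

A CLIP-based objective would then score `enhance(img)` against text or image prompts to drive unsupervised training; that part requires a pretrained CLIP model and is omitted here.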
Related papers
- Brightness Perceiving for Recursive Low-Light Image Enhancement [8.926230015423624]
We propose a brightness-perceiving-based framework for high dynamic range low-light image enhancement.
Our framework consists of two parallel sub-networks: Adaptive Contrast and Texture enhancement network (ACT-Net) and Brightness Perception network (BP-Net)
Compared with eleven existing representative methods, the proposed method achieves new SOTA performance on six reference and no-reference metrics.
arXiv Detail & Related papers (2025-04-03T07:53:33Z) - Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement [50.93686436282772]
We delve into the limits of image enhancers in terms of both visual quality and computational efficiency.
By rethinking the task demands, we build an explicit connection: visual quality and computational efficiency correspond to model learning and structure design, respectively.
Ultimately, this achieves efficient low-light image enhancement using only a single convolutional layer, while maintaining excellent visual quality.
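A common form of re-parameterization, folding a BatchNorm layer into the preceding convolution, hints at how multi-branch training structures can collapse into a single layer at inference. This is a generic sketch of that idea, not the paper's specific auto re-parameterization scheme.

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm into the preceding conv: the affine map
    y = gamma * (Wx + b - mean) / sqrt(var + eps) + beta
    becomes a single conv with rescaled weights and bias."""
    scale = gamma / np.sqrt(var + eps)          # per-output-channel scale
    W_fused = W * scale[:, None, None, None]    # W has shape (out, in, kh, kw)
    b_fused = (b - mean) * scale + beta
    return W_fused, b_fused
```

Because both operations are linear, the fused layer is numerically identical to the conv + BN pair while costing a single convolution at inference time.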
arXiv Detail & Related papers (2025-02-27T08:20:03Z) - Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration [59.744840744491945]
We reformulate the trajectory optimization of this kind of method, focusing on enhancing both reconstruction quality and efficiency.
We propose cost-aware trajectory distillation to streamline complex paths into several manageable steps with adaptable sizes.
Experiments showcase the significant superiority of the proposed method, achieving a maximum PSNR improvement of 2.1 dB over state-of-the-art methods.
arXiv Detail & Related papers (2024-10-07T07:46:08Z) - Rethinking the Atmospheric Scattering-driven Attention via Channel and Gamma Correction Priors for Low-Light Image Enhancement [0.0]
We introduce an extended version of the Channel-Prior and Gamma-Estimation Network (CPGA-Net).
CPGA-Net+ incorporates an attention mechanism driven by a reformulated Atmospheric Scattering Model.
It effectively addresses both global and local image processing through Plug-in Attention with gamma correction.
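The two priors mentioned above can each be sketched in a few lines: inverting the Atmospheric Scattering Model I = J*t + A*(1 - t) recovers the scene radiance J, and a gamma curve lifts dark regions. Scalar A and t are used for simplicity; this illustrates the priors, not CPGA-Net+ itself.

```python
import numpy as np

def invert_scattering(I, A=0.9, t=0.6):
    """Invert I = J*t + A*(1 - t) to recover scene radiance J
    (scalar atmospheric light A and transmission t)."""
    t = max(t, 1e-3)  # avoid division blow-up at near-zero transmission
    return np.clip((I - A * (1.0 - t)) / t, 0.0, 1.0)

def gamma_prior(I, gamma=0.5):
    """Gamma-correction prior: I ** gamma with gamma < 1 lifts dark pixels."""
    return np.power(np.clip(I, 0.0, 1.0), gamma)
```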
arXiv Detail & Related papers (2024-09-09T01:50:01Z) - A Lightweight GAN-Based Image Fusion Algorithm for Visible and Infrared Images [4.473596922028091]
This paper presents a lightweight image fusion algorithm specifically designed for merging visible light and infrared images.
The proposed method enhances the generator in a Generative Adversarial Network (GAN) by integrating the Convolutional Block Attention Module.
Experiments using the M3FD dataset demonstrate that the proposed algorithm outperforms similar image fusion methods in terms of fusion quality.
arXiv Detail & Related papers (2024-09-07T18:04:39Z) - Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement [61.22119364400268]
We propose a novel low-frequency consistency method, facilitating improved frequency disentanglement optimization.
Noteworthy improvements are showcased across five popular benchmarks, with up to 7.68dB gains on PSNR achieved for six state-of-the-art models.
Our approach maintains efficiency with only 88K extra parameters, setting a new standard in the challenging realm of low-light image enhancement.
arXiv Detail & Related papers (2024-09-03T06:19:03Z) - A Non-Uniform Low-Light Image Enhancement Method with Multi-Scale Attention Transformer and Luminance Consistency Loss [11.585269110131659]
Low-light image enhancement aims to improve the perception of images collected in dim environments.
Existing methods cannot adaptively extract the differentiated luminance information, which easily causes over-exposure and under-exposure.
We propose a multi-scale attention Transformer named MSATr, which sufficiently extracts local and global features for light balance to improve the visual quality.
arXiv Detail & Related papers (2023-12-27T10:07:11Z) - Advancing Unsupervised Low-light Image Enhancement: Noise Estimation, Illumination Interpolation, and Self-Regulation [55.07472635587852]
Low-Light Image Enhancement (LLIE) techniques have made notable advancements in preserving image details and enhancing contrast.
These approaches encounter persistent challenges in efficiently mitigating dynamic noise and accommodating diverse low-light scenarios.
We first propose a method for quickly and accurately estimating the noise level in low-light images.
We then devise a Learnable Illumination Interpolator (LII) to satisfy general constraints between illumination and input.
arXiv Detail & Related papers (2023-05-17T13:56:48Z) - Lightweight HDR Camera ISP for Robust Perception in Dynamic Illumination Conditions via Fourier Adversarial Networks [35.532434169432776]
We propose a lightweight two-stage image enhancement algorithm sequentially balancing illumination and noise removal.
We also propose a Fourier spectrum-based adversarial framework (AFNet) for consistent image enhancement under varying illumination conditions.
Based on quantitative and qualitative evaluations, we also examine the practicality and effects of image enhancement techniques on the performance of common perception tasks.
arXiv Detail & Related papers (2022-04-04T18:48:51Z) - Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
To address this issue, we propose a novel image-specific convolutional kernel modulation (IKM) method.
We exploit the global contextual information of image or feature to generate an attention weight for adaptively modulating the convolutional kernels.
Experiments on single image super-resolution show that the proposed methods achieve superior performances over state-of-the-art methods.
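The kernel-modulation idea, a global descriptor of the input gating each output channel's kernel, can be sketched as follows. This is an illustrative simplification, not the published IKM module; `attn_weights` stands in for a hypothetical learned matrix.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def modulate_kernel(x, W, attn_weights):
    """Gate each output channel's kernel by an attention weight
    computed from a global descriptor of the input feature map."""
    desc = x.mean(axis=(1, 2))              # global average pool, shape (in,)
    a = sigmoid(attn_weights @ desc)        # per-output-channel gate in (0, 1)
    return W * a[:, None, None, None]       # modulated kernels, same shape as W
```

The modulated kernels are then used in place of the static ones, so the effective convolution adapts to each input image.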
arXiv Detail & Related papers (2021-11-16T11:05:10Z) - Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR, and blind SISR with unknown noise.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.