Multi-Knowledge-oriented Nighttime Haze Imaging Enhancer for Vision-driven Intelligent Systems
- URL: http://arxiv.org/abs/2502.07351v4
- Date: Mon, 16 Jun 2025 05:32:08 GMT
- Title: Multi-Knowledge-oriented Nighttime Haze Imaging Enhancer for Vision-driven Intelligent Systems
- Authors: Ai Chen, Yuxu Lu, Dong Yang, Junlin Zhou, Yan Fu, Duanbing Chen
- Abstract summary: Adverse imaging conditions such as haze severely degrade image quality. We propose a multi-knowledge-oriented nighttime haze imaging enhancer (MKoIE). MKoIE integrates three tasks: daytime dehazing, low-light enhancement, and nighttime dehazing.
- Score: 4.742689734374541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Salient object detection (SOD) plays a critical role in intelligent imaging, facilitating the detection and segmentation of key visual elements in an image. However, adverse imaging conditions such as daytime haze, low light, and nighttime haze severely degrade image quality and hinder reliable object detection in real-world scenarios. To address these challenges, we propose a multi-knowledge-oriented nighttime haze imaging enhancer (MKoIE), which integrates three tasks: daytime dehazing, low-light enhancement, and nighttime dehazing. MKoIE incorporates two key components. First, the network employs a task-oriented node learning mechanism to handle three specific degradation types (daytime haze, low light, and nighttime haze), with an embedded self-attention module enhancing its performance in nighttime imaging. Second, a multi-receptive field enhancement module efficiently extracts multi-scale features through three parallel depthwise separable convolution branches with different dilation rates, capturing comprehensive spatial information with minimal computational overhead to meet the requirements of real-time imaging deployment. To ensure optimal image reconstruction quality and visual characteristics, we propose a hybrid loss function. Extensive experiments under different weather and imaging conditions show that MKoIE surpasses existing methods, enhancing the reliability, accuracy, and operational efficiency of intelligent imaging.
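The abstract's claim that three parallel depthwise separable branches with different dilation rates capture multi-scale context at low cost can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' implementation: the channel width (64), the 3x3 kernel, and the dilation rates (1, 2, 3) are assumptions, since the abstract does not state them, and bias terms are ignored.

```python
# Back-of-the-envelope comparison for a multi-branch dilated block.
# All concrete numbers are illustrative assumptions; the abstract does
# not give the channel width, kernel size, or dilation rates.

CHANNELS = 64          # assumed feature width
KERNEL = 3             # assumed kernel size (3x3)
DILATIONS = (1, 2, 3)  # assumed dilation rates, one per branch


def effective_kernel(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


def branch_summary(channels=CHANNELS, k=KERNEL, dilations=DILATIONS):
    """Per-branch footprint and weight counts; dilation adds no weights."""
    return [
        {
            "dilation": d,
            "effective_kernel": effective_kernel(k, d),
            "separable_params": depthwise_separable_params(channels, channels, k),
            "standard_params": standard_conv_params(channels, channels, k),
        }
        for d in dilations
    ]


if __name__ == "__main__":
    for row in branch_summary():
        print(row)
```

Under these assumptions each branch covers a 3x3, 5x5, or 7x7 footprint while holding the same number of weights, and a separable branch needs 4,672 weights where a standard 3x3 convolution needs 36,864. Dilation widens the receptive field without adding parameters, which is consistent with the abstract's emphasis on low overhead.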
Related papers
- DFVO: Learning Darkness-free Visible and Infrared Image Disentanglement and Fusion All at Once [57.15043822199561]
A Darkness-Free network is proposed to handle Visible and infrared image disentanglement and fusion all at Once (DFVO). DFVO employs a cascaded multi-task approach to replace the traditional two-stage cascaded training (enhancement and fusion). Our proposed approach outperforms state-of-the-art alternatives in terms of qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-05-07T15:59:45Z) - FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.
We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF).
PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models.
FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion [58.36400052566673]
Infrared and visible image fusion integrates information from distinct spectral bands to enhance image quality.
Existing approaches treat image fusion and subsequent high-level tasks as separate processes.
We propose a Discriminative Cross-Dimension Evolutionary Learning Framework, termed DCEvo, which simultaneously enhances visual quality and perception accuracy.
arXiv Detail & Related papers (2025-03-22T07:01:58Z) - IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics. Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs. We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z) - Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation [58.180226179087086]
We propose a novel end-to-end optimized approach, named NightFormer, tailored for night-time semantic segmentation.
Specifically, we design a pixel-level texture enhancement module to acquire texture-aware features hierarchically with phase enhancement and amplified attention.
Our proposed method performs favorably against state-of-the-art night-time semantic segmentation methods.
arXiv Detail & Related papers (2024-08-25T13:59:31Z) - MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection [28.319440934322728]
MV2DFusion is a multi-modal detection framework that integrates the strengths of both worlds through an advanced query-based fusion mechanism.
Our framework's flexibility allows it to integrate with any image and point cloud-based detectors, showcasing its adaptability and potential for future advancements.
arXiv Detail & Related papers (2024-08-12T06:46:05Z) - MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection [9.780498146964097]
We propose an innovative network architecture, MonoMM, for real-time monocular 3D object detection.
MonoMM consists of Focused Multi-Scale Fusion (FMF) and Depth-Aware Feature Enhancement Mamba (DMB) modules.
Our method outperforms previous monocular methods and achieves real-time detection.
arXiv Detail & Related papers (2024-08-01T10:16:58Z) - Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving [45.97279394690308]
LightDiff is a framework designed to enhance the low-light image quality for autonomous driving applications.
It incorporates a novel multi-condition adapter that adaptively controls the input weights from different modalities, including depth maps, RGB images, and text captions.
It can significantly improve the performance of several state-of-the-art 3D detectors in night-time conditions while achieving high visual quality scores.
arXiv Detail & Related papers (2024-04-07T04:10:06Z) - A Non-Uniform Low-Light Image Enhancement Method with Multi-Scale Attention Transformer and Luminance Consistency Loss [11.585269110131659]
Low-light image enhancement aims to improve the perception of images collected in dim environments.
Existing methods cannot adaptively extract the differentiated luminance information, which will easily cause over-exposure and under-exposure.
We propose a multi-scale attention Transformer named MSATr, which sufficiently extracts local and global features for light balance to improve the visual quality.
arXiv Detail & Related papers (2023-12-27T10:07:11Z) - Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model [83.85856356798531]
VistaLLM is a visual system that addresses coarse- and fine-grained vision-language tasks.
It employs a gradient-aware adaptive sampling technique to represent binary segmentation masks as sequences.
We also introduce a novel task, AttCoSeg, which boosts the model's reasoning and grounding capability over multiple input images.
arXiv Detail & Related papers (2023-12-19T18:53:01Z) - Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels.
This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z) - LiDAR-BEVMTN: Real-Time LiDAR Bird's-Eye View Multi-Task Perception Network for Autonomous Driving [12.713417063678335]
We present a real-time multi-task convolutional neural network for LiDAR-based object detection, semantics, and motion segmentation.
We propose a novel Semantic Weighting and Guidance (SWAG) module to transfer semantic features for improved object detection selectively.
We achieve state-of-the-art results for two tasks, semantic and motion segmentation, and close to state-of-the-art performance for 3D object detection.
arXiv Detail & Related papers (2023-07-17T21:22:17Z) - MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in Adverse Scenes [49.21187418886508]
This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP.
We first introduce an adaptive learning strategy to aid the model in handling uncontrollable weather conditions, significantly resisting degradation caused by various degrading factors.
Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
arXiv Detail & Related papers (2023-05-18T13:42:02Z) - Robust Single Image Dehazing Based on Consistent and Contrast-Assisted Reconstruction [95.5735805072852]
We propose a novel density-variational learning framework to improve the robustness of the image dehazing model.
Specifically, the dehazing network is optimized under the consistency-regularized framework.
Our method significantly surpasses the state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-29T08:11:04Z) - The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z) - EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection [56.03081616213012]
We propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion (CB-Fusion) module.
The proposed CB-Fusion module boosts the plentiful semantic information of point features with the image features in a cascade bi-directional interaction fusion manner.
The experiment results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-21T10:48:34Z) - Bridge the Vision Gap from Field to Command: A Deep Learning Network Enhancing Illumination and Details [17.25188250076639]
We propose a two-stream framework named NEID to tune up the brightness and enhance the details simultaneously.
The proposed method consists of three modules: Light Enhancement (LE), Detail Refinement (DR), and Feature Fusing (FF).
arXiv Detail & Related papers (2021-01-20T09:39:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.