Related papers: Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing

Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing

URL: http://arxiv.org/abs/2308.06998v1
Date: Mon, 14 Aug 2023 08:23:58 GMT
Title: Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing
Authors: Hao Shen, Zhong-Qiu Zhao, Yulun Zhang, Zhao Zhang
Abstract summary: We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing. The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal. The second stage, named phase-guided structure refined, devotes to learning the transformation and refinement of the phase spectrum.
Score: 54.168567276280505
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-stage architectures have exhibited efficacy in image dehazing, which usually decomposes a challenging task into multiple more tractable sub-tasks and progressively estimates latent hazy-free images. Despite the remarkable progress, existing methods still suffer from the following shortcomings: (1) limited exploration of frequency domain information; (2) insufficient information interaction; (3) severe feature redundancy. To remedy these issues, we propose a novel Mutual Information-driven Triple interaction Network (MITNet) based on spatial-frequency dual domain information and two-stage architecture. To be specific, the first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal. And the second stage, named phase-guided structure refined, devotes to learning the transformation and refinement of the phase spectrum. To facilitate the information exchange between two stages, an Adaptive Triple Interaction Module (ATIM) is developed to simultaneously aggregate cross-domain, cross-scale, and cross-stage features, where the fused features are further used to generate content-adaptive dynamic filters so that applying them to enhance global context representation. In addition, we impose the mutual information minimization constraint on paired scale encoder and decoder features from both stages. Such an operation can effectively reduce information redundancy and enhance cross-stage feature complementarity. Extensive experiments on multiple public datasets exhibit that our MITNet performs superior performance with lower model complexity.The code and models are available at https://github.com/it-hao/MITNet.

Related papers

Exploring Fourier Prior and Event Collaboration for Low-Light Image Enhancement [1.8724535169356553]
Event cameras provide performance gain for low-light image enhancement.<n>Currently, existing event-based methods feed a frame and events directly into a single model.<n>We propose a visibility restoration network with amplitude-phase entanglement.<n>In the second stage, a fusion strategy with dynamic alignment is proposed to mitigate the spatial mismatch.
arXiv Detail & Related papers (2025-08-01T04:25:00Z)
FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose Self-supervised Transfer (PST) and FrequencyDe-coupled Fusion module (FreDF) PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
Boosting ViT-based MRI Reconstruction from the Perspectives of Frequency Modulation, Spatial Purification, and Scale Diversification [6.341065683872316]
ViTs struggle to capture high-frequency components of images, limiting their ability to detect local textures and edge information. Previous methods calculate multi-head self-attention (MSA) among both related and unrelated tokens in content. The naive feed-forward network in ViTs cannot model the multi-scale information that is important for image restoration.
arXiv Detail & Related papers (2024-12-14T10:03:08Z)
A Hybrid Transformer-Mamba Network for Single Image Deraining [70.64069487982916]
Existing deraining Transformers employ self-attention mechanisms with fixed-range windows or along channel dimensions. We introduce a novel dual-branch hybrid Transformer-Mamba network, denoted as TransMamba, aimed at effectively capturing long-range rain-related dependencies.
arXiv Detail & Related papers (2024-08-31T10:03:19Z)
Addressing Domain Discrepancy: A Dual-branch Collaborative Model to Unsupervised Dehazing [1.6624384368855527]
This paper proposes a novel dual-branch collaborative unpaired dehazing model (DCM-dehaze) to address this issue. Specifically, we design a dual depthwise separable convolutional module (DDSCM) to enhance the information of deeper features. In addition, we construct a bidirectional contour function to optimize the edge features of the image to enhance the clarity and fidelity of the image details.
arXiv Detail & Related papers (2024-07-14T14:47:32Z)
ECAFormer: Low-light Image Enhancement using Cross Attention [11.554554006307836]
Low-light image enhancement (LLIE) is critical in computer vision. We design a hierarchical mutual Enhancement via a Cross Attention transformer (ECAFormer) We show that ECAFormer reaches competitive performance across multiple benchmarks, yielding nearly a 3% improvement in PSNR over the suboptimal method.
arXiv Detail & Related papers (2024-06-19T07:21:31Z)
Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement [49.15531684596958]
We propose a Dual-Domain Feature Fusion Network (DFFN) for low-light remote sensing image enhancement. The first phase learns amplitude information to restore image brightness, and the second phase learns phase information to refine details. We have constructed two dark light remote sensing datasets to address the current lack of datasets in dark light remote sensing image enhancement.
arXiv Detail & Related papers (2024-04-26T13:21:31Z)
ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge [63.00793292863]
ToddlerDiffusion is a novel approach to decomposing the complex task of RGB image generation into simpler, interpretable stages. Our method, termed ToddlerDiffusion, cascades modality-specific models, each responsible for generating an intermediate representation. ToddlerDiffusion consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-11-24T15:20:01Z)
Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development, including pixel misalignment and inefficient inference. This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion. The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)
SufrinNet: Toward Sufficient Cross-View Interaction for Stereo Image Enhancement in The Dark [119.01585302856103]
Low-light stereo image enhancement (LLSIE) is a relatively new task to enhance the quality of visually unpleasant stereo images captured in dark conditions. Current methods clearly suffer from two shortages: 1) insufficient cross-view interaction; 2) lacking long-range dependency for intra-view learning. We propose a novel LLSIE model, termed underlineSufficient Cunderlineross-View underlineInteraction Network (SufrinNet)
arXiv Detail & Related papers (2022-11-02T04:01:30Z)
GridDehazeNet+: An Enhanced Multi-Scale Network with Intra-Task Knowledge Transfer for Single Image Dehazing [12.982905875008214]
We propose an enhanced multi-scale network, dubbed GridDehazeNet+, for single image dehazing. It consists of three modules: pre-processing, backbone, and post-processing.
arXiv Detail & Related papers (2021-03-25T17:35:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.