Mutual Information-driven Triple Interaction Network for Efficient Image
Dehazing
- URL: http://arxiv.org/abs/2308.06998v1
- Date: Mon, 14 Aug 2023 08:23:58 GMT
- Authors: Hao Shen, Zhong-Qiu Zhao, Yulun Zhang, Zhao Zhang
- Abstract summary: We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refined, is devoted to learning the transformation and refinement of the phase spectrum.
- Score: 54.168567276280505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-stage architectures have proven effective in image dehazing; they
usually decompose a challenging task into multiple more tractable sub-tasks
and progressively estimate latent haze-free images. Despite the remarkable
progress, existing methods still suffer from the following shortcomings: (1)
limited exploration of frequency domain information; (2) insufficient
information interaction; (3) severe feature redundancy. To remedy these issues,
we propose a novel Mutual Information-driven Triple interaction Network
(MITNet) based on spatial-frequency dual domain information and two-stage
architecture. To be specific, the first stage, named amplitude-guided haze
removal, aims to recover the amplitude spectrum of the hazy images for haze
removal. The second stage, named phase-guided structure refined, is devoted to
learning the transformation and refinement of the phase spectrum. To facilitate
the information exchange between two stages, an Adaptive Triple Interaction
Module (ATIM) is developed to simultaneously aggregate cross-domain,
cross-scale, and cross-stage features; the fused features are then used to
generate content-adaptive dynamic filters, which are applied to enhance global
context representation. In addition, we impose a mutual
information minimization constraint on paired-scale encoder and decoder
features from both stages. Such an operation can effectively reduce information
redundancy and enhance cross-stage feature complementarity. Extensive
experiments on multiple public datasets show that MITNet achieves
superior performance with lower model complexity. The code and models are
available at https://github.com/it-hao/MITNet.
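The amplitude/phase split that the two stages build on can be illustrated with a plain 2-D Fourier transform. The following is a minimal NumPy sketch, not the MITNet implementation: the helper names `split_amplitude_phase` and `merge_amplitude_phase` are hypothetical, and the "haze" is an assumed toy model with uniform transmission `t` and airlight `A` (real haze depends on scene depth). Under that assumption the degradation lands almost entirely in the amplitude spectrum, so pairing the clear amplitude with the hazy phase recovers the clear image.

```python
import numpy as np


def split_amplitude_phase(img):
    """Return the amplitude and phase spectra of a 2-D image."""
    spectrum = np.fft.fft2(img)
    return np.abs(spectrum), np.angle(spectrum)


def merge_amplitude_phase(amplitude, phase):
    """Rebuild a spatial image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    # The input images are real, so the small imaginary residue is rounding noise.
    return np.real(np.fft.ifft2(spectrum))


# Toy data (assumption): a random "clear" image and a uniformly hazed copy,
# hazy = t * clear + (1 - t) * A with constant transmission t and airlight A.
rng = np.random.default_rng(0)
clear = rng.random((32, 32))
t, A = 0.6, 0.9
hazy = t * clear + (1 - t) * A

amp_clear, pha_clear = split_amplitude_phase(clear)
amp_hazy, pha_hazy = split_amplitude_phase(hazy)

# Sanity check: splitting and merging the same image is lossless.
roundtrip = merge_amplitude_phase(amp_clear, pha_clear)

# Replace only the amplitude of the hazy image and keep its phase.
restored = merge_amplitude_phase(amp_clear, pha_hazy)

print("round-trip max error:", np.max(np.abs(roundtrip - clear)))
print("amplitude-swap max error:", np.max(np.abs(restored - clear)))
```

With depth-dependent haze the phase spectrum would change as well, which is consistent with the abstract assigning phase refinement to a separate second stage.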
Related papers
- Addressing Domain Discrepancy: A Dual-branch Collaborative Model to Unsupervised Dehazing [1.6624384368855527]
This paper proposes a novel dual-branch collaborative unpaired dehazing model (DCM-dehaze) to address this issue.
Specifically, we design a dual depthwise separable convolutional module (DDSCM) to enhance the information of deeper features.
In addition, we construct a bidirectional contour function to optimize the edge features of the image to enhance the clarity and fidelity of the image details.
arXiv Detail & Related papers (2024-07-14T14:47:32Z)
- Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement [49.15531684596958]
We propose a Dual-Domain Feature Fusion Network (DFFN) for low-light remote sensing image enhancement.
The first phase learns amplitude information to restore image brightness, and the second phase learns phase information to refine details.
We have constructed two dark light remote sensing datasets to address the current lack of datasets in dark light remote sensing image enhancement.
arXiv Detail & Related papers (2024-04-26T13:21:31Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novel perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visually appealing fusion results compared with state-of-the-art multi-exposure image fusion approaches.
arXiv Detail & Related papers (2023-12-17T04:45:15Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Decomposing and Coupling Saliency Map for Lesion Segmentation in Ultrasound Images [10.423431415758655]
The complex scenario of ultrasound images, in which adjacent tissues share similar intensity with, and even contain richer texture patterns than, the lesion region, brings a unique challenge for accurate lesion segmentation.
This work presents a decomposition-coupling network, called DC-Net, to deal with this challenge in a (foreground-background) saliency map disentanglement-fusion manner.
The proposed method is evaluated on two ultrasound lesion segmentation tasks, which demonstrates the remarkable performance improvement over existing state-of-the-art methods.
arXiv Detail & Related papers (2023-08-02T05:02:30Z)
- SufrinNet: Toward Sufficient Cross-View Interaction for Stereo Image Enhancement in The Dark [119.01585302856103]
Low-light stereo image enhancement (LLSIE) is a relatively new task to enhance the quality of visually unpleasant stereo images captured in dark conditions.
Current methods clearly suffer from two shortcomings: 1) insufficient cross-view interaction; 2) a lack of long-range dependency for intra-view learning.
We propose a novel LLSIE model, termed Sufficient Cross-View Interaction Network (SufrinNet).
arXiv Detail & Related papers (2022-11-02T04:01:30Z)
- GridDehazeNet+: An Enhanced Multi-Scale Network with Intra-Task Knowledge Transfer for Single Image Dehazing [12.982905875008214]
We propose an enhanced multi-scale network, dubbed GridDehazeNet+, for single image dehazing.
It consists of three modules: pre-processing, backbone, and post-processing.
arXiv Detail & Related papers (2021-03-25T17:35:36Z)