Encoder-minimal and Decoder-minimal Framework for Remote Sensing Image Dehazing
- URL: http://arxiv.org/abs/2312.07849v1
- Date: Wed, 13 Dec 2023 02:35:02 GMT
- Title: Encoder-minimal and Decoder-minimal Framework for Remote Sensing Image Dehazing
- Authors: Yuanbo Wen, Tao Gao, Ziqi Li, Jing Zhang, Ting Chen
- Abstract summary: RSHazeNet is an encoder-minimal and decoder-minimal framework for efficient remote sensing image dehazing.
We develop an innovative module called the intra-level transposed fusion module (ITFM).
- Score: 13.759978932686519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Haze obscures remote sensing images, hindering valuable information
extraction. To this end, we propose RSHazeNet, an encoder-minimal and
decoder-minimal framework for efficient remote sensing image dehazing.
Specifically, regarding the process of merging features within the same level,
we develop an innovative module called the intra-level transposed fusion module (ITFM). This module employs adaptive transposed self-attention to capture comprehensive context-aware information, facilitating robust context-aware feature fusion. Meanwhile, we present a cross-level multi-view interaction
module (CMIM) to enable effective interactions between features from various
levels, mitigating the loss of information due to the repeated sampling
operations. In addition, we propose a multi-view progressive extraction block
(MPEB) that partitions the features into four distinct components and employs
convolution with varying kernel sizes, groups, and dilation factors to
facilitate view-progressive feature learning. Extensive experiments demonstrate
the superiority of our proposed RSHazeNet. We release the source code and all pre-trained models at https://github.com/chdwyb/RSHazeNet.
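The abstract gives enough detail to sketch two of these mechanisms in PyTorch: a transposed self-attention computes a C x C attention map across channels instead of an (HW) x (HW) map across pixels, and an MPEB-style block partitions features into four components processed by convolutions with different kernel sizes, groups, and dilation factors. The sketch below is reconstructed from the abstract alone; all channel counts, kernel choices, and the progressive wiring are assumptions, and the released code at the repository above is authoritative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransposedSelfAttention(nn.Module):
    """Channel-wise ("transposed") self-attention: the attention map is
    C x C rather than (HW) x (HW), so cost stays linear in image size.
    A hypothetical reading of ITFM's adaptive transposed self-attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.qkv = nn.Conv2d(channels, 3 * channels, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.temperature = nn.Parameter(torch.ones(1))  # assumed learnable scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).flatten(2).chunk(3, dim=1)   # each: b x c x (hw)
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(1, 2)) * self.temperature  # b x c x c
        out = (attn.softmax(dim=-1) @ v).view(b, c, h, w)
        return self.proj(out)

class MPEBSketch(nn.Module):
    """Hypothetical multi-view progressive extraction block: four channel
    groups, each seen through a convolution with a different kernel size,
    group count, and dilation; exact hyperparameters are guesses."""
    def __init__(self, channels: int):
        super().__init__()
        assert channels % 4 == 0
        c = channels // 4
        self.v1 = nn.Conv2d(c, c, 1)
        self.v2 = nn.Conv2d(c, c, 3, padding=1, groups=c)              # depthwise 3x3
        self.v3 = nn.Conv2d(c, c, 3, padding=2, dilation=2, groups=c)  # dilated 3x3
        self.v4 = nn.Conv2d(c, c, 5, padding=2, groups=c)              # depthwise 5x5
        self.fuse = nn.Conv2d(channels, channels, 1)                   # assumed fusion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2, x3, x4 = torch.chunk(x, 4, dim=1)
        y1 = self.v1(x1)                  # "progressive": each view builds on
        y2 = self.v2(x2 + y1)             # the previous one (an assumption)
        y3 = self.v3(x3 + y2)
        y4 = self.v4(x4 + y3)
        return self.fuse(torch.cat([y1, y2, y3, y4], dim=1)) + x

feats = torch.randn(1, 64, 128, 128)
print(TransposedSelfAttention(64)(feats).shape)  # torch.Size([1, 64, 128, 128])
print(MPEBSketch(64)(feats).shape)               # torch.Size([1, 64, 128, 128])
```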
Related papers
- FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.
We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF).
PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models.
FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
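The decoupling idea reads roughly as follows: a low-pass filter yields the structural component, and the residual carries the high-frequency edges. A minimal sketch, assuming a simple box-blur split (the summary does not specify FreDF's actual decomposition):

```python
import torch
import torch.nn.functional as F

def decouple_frequencies(x: torch.Tensor, kernel_size: int = 7):
    """Split features into low-frequency structure and high-frequency edges.
    Hypothetical stand-in for FreDF's decomposition."""
    pad = kernel_size // 2
    low = F.avg_pool2d(x, kernel_size, stride=1, padding=pad)  # low-pass blur
    high = x - low                                             # residual = detail
    return low, high

img = torch.randn(1, 3, 64, 64)
low, high = decouple_frequencies(img)
assert torch.allclose(low + high, img)  # lossless split by construction
```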
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
- Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation [7.992331117310217]
Referring remote sensing image segmentation (RRSIS) is a novel visual task in the segmentation of remote sensing images.
We design a multimodal-aware fusion network (MAFN) to achieve fine-grained alignment and fusion between the two modalities.
arXiv Detail & Related papers (2025-03-14T08:31:21Z)
- UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration [4.068692674719378]
Complicated image registration is a key issue in medical image analysis.
We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network.
arXiv Detail & Related papers (2024-10-27T06:28:43Z)
- Local-to-Global Cross-Modal Attention-Aware Fusion for HSI-X Semantic Segmentation [19.461033552684576]
We propose a Local-to-Global Cross-modal Attention-aware Fusion (LoGoCAF) framework for HSI-X classification.
LoGoCAF adopts a pixel-to-pixel two-branch semantic segmentation architecture to learn information from HSI and X modalities.
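A minimal sketch of the two-branch, pixel-to-pixel layout the summary describes; the encoder depth, fusion by concatenation, and the 1x1 prediction head are assumptions:

```python
import torch
import torch.nn as nn

class TwoBranchSegSketch(nn.Module):
    """Hypothetical two-branch HSI-X segmentation skeleton: one encoder per
    modality, concatenation fusion, per-pixel class logits."""
    def __init__(self, hsi_bands: int, x_bands: int, num_classes: int, width: int = 32):
        super().__init__()
        def branch(in_ch: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            )
        self.hsi_branch = branch(hsi_bands)  # hyperspectral stream
        self.x_branch = branch(x_bands)      # X-modality stream (LiDAR, SAR, ...)
        self.head = nn.Conv2d(2 * width, num_classes, 1)  # pixel-to-pixel logits

    def forward(self, hsi: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.hsi_branch(hsi), self.x_branch(x)], dim=1)
        return self.head(fused)  # same spatial resolution as the inputs

logits = TwoBranchSegSketch(144, 1, 15)(torch.randn(1, 144, 64, 64),
                                        torch.randn(1, 1, 64, 64))
print(logits.shape)  # torch.Size([1, 15, 64, 64])
```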
arXiv Detail & Related papers (2024-06-25T16:12:20Z)
- A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion [41.34335755315773]
Multi-modality image fusion aims at fusing specific-modality and shared-modality information from two source images.
We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy.
Our method has obtained competitive results compared with state-of-the-art methods in visible/infrared image fusion and medical image fusion tasks.
arXiv Detail & Related papers (2024-06-11T09:32:40Z)
- Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution [12.066710423371559]
We propose an efficient Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution (MFFSSR)
MFFSSR utilizes the Hybrid Attention Feature Extraction Block (HAFEB) to extract multi-level intra-view features.
We achieve superior performance with fewer parameters.
arXiv Detail & Related papers (2024-05-09T02:01:51Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
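The wavelet decomposition step itself is standard and easy to reproduce; a minimal sketch with PyWavelets (the sub-band tokenization and the intra-/inter-band attention are not shown):

```python
import numpy as np
import pywt

# One-level 2D DWT splits an image into a low-frequency approximation (LL)
# and three high-frequency sub-bands (LH, HL, HH), where subtle forgery
# artifacts tend to concentrate.
image = np.random.rand(256, 256).astype(np.float32)  # stand-in for a face crop
LL, (LH, HL, HH) = pywt.dwt2(image, "haar")
print(LL.shape, LH.shape, HL.shape, HH.shape)  # (128, 128) each
```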
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
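The two-stage split rests on the standard Fourier decomposition of an image into amplitude and phase spectra. A minimal sketch of that decomposition (the dehazing networks operating on each spectrum are omitted):

```python
import torch

def amplitude_phase_split(x: torch.Tensor):
    """Decompose an image into its Fourier amplitude and phase spectra."""
    spec = torch.fft.fft2(x)
    return torch.abs(spec), torch.angle(spec)

def recombine(amplitude: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
    # Stage 1 would predict a clean amplitude; stage 2 refines the phase.
    return torch.fft.ifft2(torch.polar(amplitude, phase)).real

img = torch.rand(1, 3, 64, 64)
amp, pha = amplitude_phase_split(img)
assert torch.allclose(recombine(amp, pha), img, atol=1e-5)  # lossless round trip
```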
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is 3× smaller than state-of-the-art efficient SR methods.
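The summary only names the mechanism, but the generic modulation pattern is simple: derive a spatially varying map from the features and multiply it back. A hypothetical sketch (the paper's actual SAFM branch design is richer than this):

```python
import torch
import torch.nn as nn

class ModulationSketch(nn.Module):
    """Generic spatially-adaptive feature modulation: a depthwise conv plus
    nonlinearity produces a per-pixel map that rescales the input features.
    Hypothetical; not the paper's exact SAFM design."""
    def __init__(self, channels: int):
        super().__init__()
        self.dwconv = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.act(self.dwconv(x))  # element-wise modulation

y = ModulationSketch(36)(torch.randn(1, 36, 48, 48))
print(y.shape)  # torch.Size([1, 36, 48, 48])
```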
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, which has many applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
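A minimal sketch of a residual-based fusion module in this spirit: one stream stays on a residual path while a learned gate decides how much of the other stream to add. The gating design is an assumption; the summary does not specify PMF's exact module:

```python
import torch
import torch.nn as nn

class ResidualFusionSketch(nn.Module):
    """Hypothetical residual-based fusion of two feature streams."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 1),  # mix both streams
            nn.Sigmoid(),                          # per-pixel gate in [0, 1]
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([feat_a, feat_b], dim=1))
        return feat_a + g * feat_b  # residual form: stream A plus gated stream B

fused = ResidualFusionSketch(64)(torch.randn(2, 64, 32, 32),
                                 torch.randn(2, 64, 32, 32))
print(fused.shape)  # torch.Size([2, 64, 32, 32])
```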
arXiv Detail & Related papers (2021-06-21T10:47:26Z)