SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation
- URL: http://arxiv.org/abs/2405.01992v1
- Date: Fri, 3 May 2024 10:47:56 GMT
- Title: SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation
- Authors: Yunsong Yang, Genji Yuan, Jinjiang Li
- Abstract summary: We propose the SFFNet (Spatial and Frequency Domain Fusion Network) framework.
The first stage extracts features using spatial methods to obtain features with sufficient spatial details and semantic information.
The second stage maps these features in both spatial and frequency domains.
SFFNet achieves superior performance in terms of mIoU, reaching 84.80% and 87.73% respectively.
- Score: 9.22384870426709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to fully utilize spatial information for segmentation and address the challenge of handling areas with significant grayscale variations in remote sensing segmentation, we propose the SFFNet (Spatial and Frequency Domain Fusion Network) framework. This framework employs a two-stage network design: the first stage extracts features using spatial methods to obtain features with sufficient spatial details and semantic information; the second stage maps these features in both spatial and frequency domains. In the frequency domain mapping, we introduce the Wavelet Transform Feature Decomposer (WTFD) structure, which decomposes features into low-frequency and high-frequency components using the Haar wavelet transform and integrates them with spatial features. To bridge the semantic gap between frequency and spatial features, and facilitate significant feature selection to promote the combination of features from different representation domains, we design the Multiscale Dual-Representation Alignment Filter (MDAF). This structure utilizes multiscale convolutions and dual-cross attentions. Comprehensive experimental results demonstrate that, compared to existing methods, SFFNet achieves superior performance in terms of mIoU, reaching 84.80% and 87.73% respectively. The code is located at https://github.com/yysdck/SFFNet.
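The abstract says WTFD splits features into low-frequency and high-frequency components with the Haar wavelet transform. The following is a minimal NumPy sketch of a single-level 2D Haar decomposition on one feature map, given only to illustrate that kind of split. It is not taken from the SFFNet repository; the function names `haar_dwt2` and `haar_idwt2`, the orthonormal 1/2 scaling, and the sub-band labels are assumptions chosen for this example, and sub-band naming conventions differ between libraries.

```python
import numpy as np


def haar_dwt2(x: np.ndarray):
    """Single-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four sub-bands of shape (H/2, W/2): one low-frequency
    approximation and three high-frequency detail maps.
    """
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = x[0::2, 0::2]  # top-left value of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # change between the two rows of a block
    hl = (a - b + c - d) / 2.0  # change between the two columns of a block
    hh = (a - b - c + d) / 2.0  # diagonal change
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the (H, W) array from its sub-bands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.normal(size=(8, 8))  # hypothetical single-channel feature map
    ll, lh, hl, hh = haar_dwt2(feature)
    restored = haar_idwt2(ll, lh, hl, hh)
    print(ll.shape, np.allclose(restored, feature))
```

In this sketch a flat region produces zero detail sub-bands, while edges and texture show up in the three high-frequency maps. How SFFNet combines such components with spatial features is described in the paper and is not reproduced here.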
Related papers
- MDNF: Multi-Diffusion-Nets for Neural Fields on Meshes [5.284425534494986]
We propose a novel framework for representing neural fields on triangle meshes that is multi-resolution across both spatial and frequency domains.
Inspired by the Neural Fourier Filter Bank (NFFB), our architecture decomposes the frequencies and frequency domains by associating finer resolution levels with higher frequency bands.
We demonstrate the effectiveness of our approach through its application to diverse neural fields, such as synthetic RGB functions, UV texture coordinates, and normals.
arXiv Detail & Related papers (2024-09-04T19:08:13Z)
- Frequency-Spatial Entanglement Learning for Camouflaged Object Detection [34.426297468968485]
Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features with complicated design.
We propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method.
Our experiments demonstrate the superiority of our FSEL over 21 state-of-the-art methods, through comprehensive quantitative and qualitative comparisons in three widely-used datasets.
arXiv Detail & Related papers (2024-09-03T07:58:47Z)
- Multiple Contexts and Frequencies Aggregation Network for Deepfake Detection [5.65128683992597]
Deepfake detection faces increasing challenges since the fast growth of generative models in developing massive and diverse Deepfake technologies.
Recent advances rely on introducing features from spatial or frequency domains rather than modeling general forgery features within backbones.
We propose an efficient network for face forgery detection named MkfaNet, which consists of two core modules.
arXiv Detail & Related papers (2024-08-03T05:34:53Z)
- MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection [4.165508411354963]
Event-Independent Network V2 (EINV2) has achieved outstanding performance on Sound Event Localization and Detection.
This paper proposes a three-stage network structure named Multi-scale Feature Fusion (MFF) module to fully extract multi-scale features across spectral, spatial, and temporal domains.
arXiv Detail & Related papers (2024-06-13T03:03:02Z)
- Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.
Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.
Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
- Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models [89.76587063609806]
We study the denoising diffusion probabilistic model (DDPM) in wavelet space, instead of pixel space, for visual synthesis.
By explicitly modeling the wavelet signals, we find our model is able to generate images with higher quality on several datasets.
arXiv Detail & Related papers (2023-07-27T06:53:16Z)
- Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging than in the spatial domain because it does not follow the local-similarity property that spatial up-sampling relies on.
We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z)
- SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow [88.97790684009979]
A common practice to improve the performance is to attain high-resolution feature maps with strong semantic representation.
We propose a Flow Alignment Module (FAM) to learn Semantic Flow between feature maps of adjacent levels.
We also present a novel Gated Dual Flow Alignment Module to directly align high-resolution feature maps and low-resolution feature maps.
arXiv Detail & Related papers (2022-07-10T08:25:47Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to Adaptively learn Frequency information in the two-branch Detection framework, an approach dubbed AFD.
We liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers.
arXiv Detail & Related papers (2022-03-27T14:25:52Z)
- Transformer-based Network for RGB-D Saliency Detection [82.6665619584628]
Key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities.
We show that transformer is a uniform operation which presents great efficacy in both feature fusion and feature enhancement.
Our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
arXiv Detail & Related papers (2021-12-01T15:53:58Z)
- Change Detection in Synthetic Aperture Radar Images Using a Dual-Domain Network [33.50775914682585]
Change detection from synthetic aperture radar (SAR) imagery is a critical yet challenging task.
Existing methods mainly focus on feature extraction in the spatial domain, and little attention has been paid to the frequency domain.
We propose a Dual-Domain Network to tackle the above two challenges.
arXiv Detail & Related papers (2021-04-14T08:41:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.