MDFL: Multi-domain Diffusion-driven Feature Learning
- URL: http://arxiv.org/abs/2311.09520v1
- Date: Thu, 16 Nov 2023 02:55:21 GMT
- Title: MDFL: Multi-domain Diffusion-driven Feature Learning
- Authors: Daixun Li, Weiying Xie, Jiaqing Zhang, Yunsong Li
- Abstract summary: We present a multi-domain diffusion-driven feature learning network (MDFL).
MDFL redefines the effective information domain that the model really focuses on.
We demonstrate that MDFL significantly improves the feature extraction performance of high-dimensional data.
- Score: 19.298491870280213
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-dimensional images, known for their rich semantic information, are
widely applied in remote sensing and other fields. The spatial information in
these images reflects the object's texture features, while the spectral
information reveals the potential spectral representations across different
bands. Currently, the understanding of high-dimensional images remains limited
to a single-domain perspective, which degrades performance. Motivated by the
masking texture effect observed in the human visual system, we present a
multi-domain diffusion-driven feature learning network (MDFL), a scheme to
redefine the effective information domain that the model really focuses on.
This method employs diffusion-based posterior sampling to explicitly consider
joint information interactions between the high-dimensional manifold structures
in the spectral, spatial, and frequency domains, thereby eliminating the
influence of masking texture effects in visual models. Additionally, we
introduce a feature reuse mechanism to gather deep and raw features of
high-dimensional data. We demonstrate that MDFL significantly improves the
feature extraction performance of high-dimensional data, thereby providing a
powerful aid for revealing the intrinsic patterns and structures of such data.
The experimental results on three multi-modal remote sensing datasets show that
MDFL reaches an average overall accuracy of 98.25%, outperforming various
state-of-the-art baseline schemes. The code will be released, contributing to
the computer vision community.
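As a rough illustration of the three domains the abstract names (spectral, spatial, and frequency), the sketch below extracts one view of each from a hyperspectral cube around a single pixel. It is not the MDFL method and involves no diffusion model. The function name `domain_views`, the (height, width, bands) layout, the patch size, and the use of a per-band 2D FFT magnitude as the frequency view are all assumptions made for this example.

```python
import numpy as np


def domain_views(cube, row, col, patch=5):
    """Return spectral, spatial, and frequency views around one pixel.

    cube is assumed to have shape (height, width, bands); this layout and
    the function itself are hypothetical, chosen only for illustration.
    """
    half = patch // 2
    # Reflect-pad the two spatial axes so border pixels still get a full patch.
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")
    # Spatial view: a patch x patch neighbourhood centred on (row, col), all bands.
    spatial = padded[row:row + patch, col:col + patch, :]
    # Spectral view: the centre pixel's value in every band.
    spectral = cube[row, col, :]
    # Frequency view: magnitude of the 2D FFT of the patch, computed per band.
    frequency = np.abs(np.fft.fft2(spatial, axes=(0, 1)))
    return spectral, spatial, frequency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.random((16, 16, 30))  # synthetic stand-in for a real image
    spectral, spatial, frequency = domain_views(cube, row=3, col=7)
    print(spectral.shape, spatial.shape, frequency.shape)
```

A model that reasons jointly over these domains would consume all three views for the same location; how MDFL combines them is described in the paper, not here.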
Related papers
- Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves 98.33%, 96.65%, and 96.62% of percentage of correct classification (PCC) on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z)
- Multi-view Aggregation Network for Dichotomous Image Segmentation [76.75904424539543]
Dichotomous Image Segmentation (DIS) has recently emerged towards high-precision object segmentation from high-resolution natural images.
Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete the global localization and local refinement.
Inspired by it, we model DIS as a multi-view object perception problem and provide a parsimonious multi-view aggregation network (MVANet).
Experiments on the popular DIS-5K dataset show that our MVANet significantly outperforms state-of-the-art methods in both accuracy and speed.
arXiv Detail & Related papers (2024-04-11T03:00:00Z)
- Rethinking Superpixel Segmentation from Biologically Inspired Mechanisms [8.24963839394421]
We propose a network architecture comprising an Enhanced Screening Module (ESM) and a novel Boundary-Aware Label (BAL) for superpixel segmentation.
The ESM enhances semantic information by simulating the interactive projection mechanisms of the visual cortex.
The BAL emulates the spatial frequency characteristics of visual cortical cells to facilitate the generation of superpixels with strong boundary adherence.
arXiv Detail & Related papers (2023-09-23T17:29:38Z)
- Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism [3.388001684915793]
We present AMFusionNet, an innovative approach to infrared and visible image fusion (IVIF).
By assimilating thermal details from infrared images with texture features from visible sources, our method produces images enriched with comprehensive information.
Our method outperforms state-of-the-art algorithms in terms of quality and quantity.
arXiv Detail & Related papers (2023-08-25T21:05:11Z)
- PC-GANs: Progressive Compensation Generative Adversarial Networks for Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z)
- Inertial Sensor Data To Image Encoding For Human Action Recognition [0.0]
Convolutional Neural Networks (CNNs) are successful deep learning models in the field of computer vision.
In this paper, we use 4 types of spatial domain methods for transforming inertial sensor data to activity images.
For creating a multimodal fusion framework, we made each type of activity images multimodal by convolving with two spatial domain filters.
arXiv Detail & Related papers (2021-05-28T01:22:52Z)
- Spatial-spectral FFPNet: Attention-Based Pyramid Network for Segmentation and Classification of Remote Sensing Images [12.320585790097415]
In this study, we develop an attention-based pyramid network for segmentation and classification of remote sensing datasets.
Experiments conducted on ISPRS Vaihingen and ISPRS Potsdam high-resolution datasets demonstrate the competitive segmentation accuracy achieved by the proposed heavy-weight spatial FFPNet.
arXiv Detail & Related papers (2020-08-20T04:55:34Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
- Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.