X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data
- URL: http://arxiv.org/abs/2006.13806v2
- Date: Sat, 11 Jul 2020 18:26:47 GMT
- Title: X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data
- Authors: Danfeng Hong, Naoto Yokoya, Gui-Song Xia, Jocelyn Chanussot, Xiao
Xiang Zhu
- Abstract summary: We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
- Score: 69.37597254841052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of semi-supervised transfer learning with
limited cross-modality data in remote sensing. A large amount of multi-modal
earth observation images, such as multispectral imagery (MSI) or synthetic
aperture radar (SAR) data, are openly available on a global scale, enabling
parsing global urban scenes through remote sensing imagery. However, their
ability in identifying materials (pixel-wise classification) remains limited,
due to the noisy collection environment and poor discriminative information as
well as limited number of well-annotated training images. To this end, we
propose a novel cross-modal deep-learning framework, called X-ModalNet, with
three well-designed modules: self-adversarial module, interactive learning
module, and label propagation module, by learning to transfer more
discriminative information from a small-scale hyperspectral image (HSI) into
the classification task using a large-scale MSI or SAR data. Significantly,
X-ModalNet generalizes well, owing to propagating labels on an updatable graph
constructed by high-level features on the top of the network, yielding
semi-supervised cross-modality learning. We evaluate X-ModalNet on two
multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a
significant improvement in comparison with several state-of-the-art methods.
Related papers
- Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks.
MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization.
MUCA utilizes a Cross-Teacher-Student attention mechanism to guide the student network, guiding the student network to construct more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z) - SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection [73.49799596304418]
This paper introduces a new task called Multi-Modal datasets and Multi-Task Object Detection (M2Det) for remote sensing.
It is designed to accurately detect horizontal or oriented objects from any sensor modality.
This task poses challenges due to 1) the trade-offs involved in managing multi-modal modelling and 2) the complexities of multi-task optimization.
arXiv Detail & Related papers (2024-12-30T02:47:51Z) - Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion [9.098711843118629]
This paper introduces state space model (SSM) and proposes a novel hybrid semantic segmentation network based on vision Mamba (CVMH-UNet)
This method designs a cross-scanning visual state space block (CVSSBlock) that uses cross 2D scanning (CS2D) to fully capture global information from multiple directions.
By incorporating convolutional neural network branches to overcome the constraints of Vision Mamba (VMamba) in acquiring local information, this approach facilitates a comprehensive analysis of both global and local features.
arXiv Detail & Related papers (2024-10-08T02:17:38Z) - SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion
Constraint for Remote Sensing Image [21.596874679058327]
SwiMDiff is a novel self-supervised pre-training framework for remote sensing images.
It recalibrates labels to recognize data from the same scene as false negatives.
It seamlessly integrates contrastive learning (CL) with a diffusion model.
arXiv Detail & Related papers (2024-01-10T11:55:58Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Multi-Spectral Image Classification with Ultra-Lean Complex-Valued
Models [28.798100220715686]
Multi-spectral imagery is invaluable for remote sensing due to different spectral signatures exhibited by materials.
We apply complex-valued co-domain symmetric models to classify real-valued MSI images.
Our work is the first to demonstrate the value of complex-valued deep learning on real-valued MSI data.
arXiv Detail & Related papers (2022-11-21T19:01:53Z) - Remote Sensing Image Scene Classification with Self-Supervised Paradigm
under Limited Labeled Samples [11.025191332244919]
We introduce new self-supervised learning (SSL) mechanism to obtain the high-performance pre-training model for RSIs scene classification from large unlabeled data.
Experiments on three commonly used RSIs scene classification datasets demonstrated that this new learning paradigm outperforms the traditional dominant ImageNet pre-trained model.
The insights distilled from our studies can help to foster the development of SSL in the remote sensing community.
arXiv Detail & Related papers (2020-10-02T09:27:19Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral
Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z) - Crowd Counting via Hierarchical Scale Recalibration Network [61.09833400167511]
We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrating multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
arXiv Detail & Related papers (2020-03-07T10:06:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.