Multimodal Remote Sensing Benchmark Datasets for Land Cover
Classification with A Shared and Specific Feature Learning Model
- URL: http://arxiv.org/abs/2105.10196v1
- Date: Fri, 21 May 2021 08:14:21 GMT
- Title: Multimodal Remote Sensing Benchmark Datasets for Land Cover
Classification with A Shared and Specific Feature Learning Model
- Authors: Danfeng Hong and Jingliang Hu and Jing Yao and Jocelyn Chanussot and
Xiao Xiang Zhu
- Abstract summary: We propose a shared and specific feature learning (S2FL) model that decomposes multimodal RS data into modality-shared and modality-specific components.
To better assess multimodal baselines and the newly-proposed S2FL model, three multimodal RS benchmark datasets, i.e., Houston2013 -- hyperspectral and multispectral data, Berlin -- hyperspectral and synthetic aperture radar (SAR) data, Augsburg -- hyperspectral, SAR, and digital surface model (DSM) data, are released and used for land cover classification.
- Score: 36.993630058695345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As remote sensing (RS) data obtained from different sensors become
widely and openly available, multimodal data processing and analysis
techniques have
been garnering increasing interest in the RS and geoscience community. However,
due to the gap between different modalities in terms of imaging sensors,
resolutions, and contents, embedding their complementary information into a
consistent, compact, accurate, and discriminative representation, to a great
extent, remains challenging. To this end, we propose a shared and specific
feature learning (S2FL) model. S2FL is capable of decomposing multimodal RS
data into modality-shared and modality-specific components, enabling more
effective blending of information across modalities, particularly for
heterogeneous data sources. Moreover, to better assess multimodal baselines and
the newly-proposed S2FL model, three multimodal RS benchmark datasets, i.e.,
Houston2013 -- hyperspectral and multispectral data, Berlin -- hyperspectral
and synthetic aperture radar (SAR) data, Augsburg -- hyperspectral, SAR, and
digital surface model (DSM) data, are released and used for land cover
classification. Extensive experiments conducted on the three datasets
demonstrate the superiority and advancement of our S2FL model in the task of
land cover classification in comparison with previously-proposed
state-of-the-art baselines. Furthermore, the baseline codes and datasets used
in this paper will be made available freely at
https://github.com/danfenghong/ISPRS_S2FL.
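To make the decomposition idea concrete: each modality can be pictured as a modality-shared code plus a per-modality residual. The snippet below is a minimal numpy sketch of that generic idea only, with random stand-in data, a PCA-style shared projection, and least-squares residuals as illustrative assumptions; it is not the authors' S2FL implementation, which is available at the repository linked above.

```python
# Minimal, hypothetical sketch of a shared/specific decomposition --
# NOT the authors' S2FL code (see the linked repository for that).
# Each modality X_m is modeled as a shared code S common to all modalities
# plus a modality-specific residual, using plain least squares.
import numpy as np

rng = np.random.default_rng(0)
n, d1, d2, k = 500, 144, 4, 10          # samples, HSI dims, MSI dims, shared dims

X1 = rng.standard_normal((n, d1))       # stand-in for hyperspectral features
X2 = rng.standard_normal((n, d2))       # stand-in for multispectral features

# Shared code: leading principal directions of the concatenated modalities.
X = np.hstack([X1, X2])
X -= X.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
S = X @ Vt[:k].T                        # n x k modality-shared component

# Modality-specific part: what the shared code cannot explain per modality.
def specific(Xm, S):
    W, *_ = np.linalg.lstsq(S, Xm, rcond=None)  # regress modality on shared code
    return Xm - S @ W                           # residual = specific component

E1 = specific(X1 - X1.mean(0), S)
E2 = specific(X2 - X2.mean(0), S)
features = np.hstack([S, E1, E2])       # fused representation for a classifier
print(features.shape)                   # (500, 10 + 144 + 4)
```

A classifier trained on the concatenated shared and specific parts mirrors the kind of fused representation that downstream land cover classification would consume.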
Related papers
- Multi-Space Alignments Towards Universal LiDAR Segmentation [50.992103482269016]
M3Net is a unified framework for multi-task, multi-dataset, multi-modality LiDAR segmentation.
We first combine large-scale driving datasets acquired by different types of sensors from diverse scenes.
We then conduct alignments in three spaces, namely data, feature, and label spaces, during the training.
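As a rough illustration of what an alignment term in the feature space can look like, the toy loss below penalizes the gap between batch feature statistics of two datasets. This moment-matching stand-in is an assumption for illustration only, not M3Net's actual objective.

```python
# Toy feature-space alignment term in the spirit of "alignment during
# training": penalize the gap between batch feature statistics of two
# LiDAR datasets. A generic moment-matching stand-in, not M3Net's loss.
import torch

def feature_alignment_loss(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """feat_a, feat_b: (batch, dim) features from two datasets/sensors."""
    mean_gap = (feat_a.mean(0) - feat_b.mean(0)).pow(2).sum()
    var_gap = (feat_a.var(0) - feat_b.var(0)).pow(2).sum()
    return mean_gap + var_gap

fa, fb = torch.randn(32, 64), torch.randn(32, 64)
print(feature_alignment_loss(fa, fb))   # scalar added to the task loss
```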
arXiv Detail & Related papers (2024-05-02T17:59:57Z)
- GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for
Remote Sensing Data [27.63411386396492]
This paper introduces a new benchmark dataset for multi-modal semantic segmentation based on RGB-Height (RGB-H) data.
The proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.
arXiv Detail & Related papers (2023-05-24T09:03:18Z)
- MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based
Self-Supervised Pre-Training [58.07391711548269]
We propose the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Multi-Stage Based Feature Fusion of Multi-Modal Data for Human Activity
Recognition [6.0306313759213275]
We propose a multi-modal framework that learns to effectively combine features from RGB video and IMU sensors.
Our model is trained in two stages: in the first stage, each input encoder learns to extract features effectively.
We show significant improvements of 22% and 11% compared to video only, and of 20% and 12% on the MMAct dataset.
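The two-stage recipe described above might look like the hedged PyTorch sketch below: per-modality encoders are trained first, then frozen while a fusion head is trained on their concatenated features. The encoder shapes, auxiliary heads, and losses are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of a generic two-stage multimodal training recipe; the
# module shapes and losses are illustrative, not the paper's.
import torch
import torch.nn as nn

video_enc = nn.Sequential(nn.Linear(512, 128), nn.ReLU())  # stand-in RGB encoder
imu_enc = nn.Sequential(nn.Linear(6, 128), nn.ReLU())      # stand-in IMU encoder
fusion_head = nn.Linear(256, 10)                           # 10 activity classes
ce = nn.CrossEntropyLoss()

video, imu = torch.randn(8, 512), torch.randn(8, 6)
labels = torch.randint(0, 10, (8,))

# Stage 1: train each encoder on its own modality via auxiliary heads.
aux_v, aux_i = nn.Linear(128, 10), nn.Linear(128, 10)
opt1 = torch.optim.Adam([*video_enc.parameters(), *imu_enc.parameters(),
                         *aux_v.parameters(), *aux_i.parameters()], lr=1e-3)
loss1 = ce(aux_v(video_enc(video)), labels) + ce(aux_i(imu_enc(imu)), labels)
opt1.zero_grad(); loss1.backward(); opt1.step()

# Stage 2: freeze the encoders, train the fusion head on fused features.
for p in [*video_enc.parameters(), *imu_enc.parameters()]:
    p.requires_grad_(False)
opt2 = torch.optim.Adam(fusion_head.parameters(), lr=1e-3)
fused = torch.cat([video_enc(video), imu_enc(imu)], dim=1)
loss2 = ce(fusion_head(fused), labels)
opt2.zero_grad(); loss2.backward(); opt2.step()
```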
arXiv Detail & Related papers (2022-11-08T15:48:44Z)
- Shared Manifold Learning Using a Triplet Network for Multiple Sensor
Translation and Fusion with Missing Data [2.452410403088629]
We propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold.
The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space in such a way that samples of the same classes from each heterogeneous modality are mapped close to each other.
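The cross-modal pull/push idea can be sketched with a standard triplet loss, as below; the embedding size and margin are assumptions, and the actual CoMMANet couples this kind of objective with a multimodal triplet autoencoder rather than a bare loss.

```python
# Illustrative cross-modal triplet objective: an anchor embedding from
# modality A is pulled toward a same-class sample from modality B and
# pushed away from a different-class one. Shapes and margin are assumed.
import torch
import torch.nn.functional as F

def cross_modal_triplet(anchor_a, positive_b, negative_b, margin=1.0):
    """All inputs: (batch, dim) embeddings from the two modality encoders."""
    d_pos = F.pairwise_distance(anchor_a, positive_b)
    d_neg = F.pairwise_distance(anchor_a, negative_b)
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = torch.randn(16, 32), torch.randn(16, 32), torch.randn(16, 32)
print(cross_modal_triplet(a, p, n))
```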
arXiv Detail & Related papers (2022-10-25T20:22:09Z)
- A3CLNN: Spatial, Spectral and Multiscale Attention ConvLSTM Neural
Network for Multisource Remote Sensing Data Classification [24.006660419933727]
We propose a new approach to exploit the complementarity of two data sources: hyperspectral images (HSIs) and light detection and ranging (LiDAR) data.
We develop a new dual-channel spatial, spectral and multiscale attention convolutional long short-term memory neural network (called dual-channel A3CLNN) for feature extraction and classification.
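The abstract does not detail the attention blocks, so purely as an illustration of band-wise (spectral) attention, the toy squeeze-and-excite-style gate below reweights hyperspectral bands; it is a generic stand-in, not the dual-channel A3CLNN.

```python
# Toy spectral (channel) attention gate of the squeeze-and-excite flavor,
# shown only to illustrate attention reweighting of hyperspectral bands;
# the actual A3CLNN also uses spatial and multiscale attention with ConvLSTM.
import torch
import torch.nn as nn

class SpectralAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (batch, bands, H, W)
        w = self.gate(x).unsqueeze(-1).unsqueeze(-1)
        return x * w                           # reweight spectral bands

x = torch.randn(2, 64, 11, 11)                 # HSI patch: 64 bands, 11x11
print(SpectralAttention(64)(x).shape)
```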
arXiv Detail & Related papers (2022-04-09T12:43:32Z)
- Perception-aware Multi-sensor Fusion for 3D LiDAR Semantic Segmentation [59.42262859654698]
3D semantic segmentation is important in scene understanding for many applications, such as autonomous driving and robotics.
Existing fusion-based methods may not achieve promising performance due to the vast difference between the two modalities.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer
Learning [67.40866334083941]
We propose an end-to-end 3-D lightweight convolutional neural network (CNN) for hyperspectral image (HSI) classification with limited training samples.
Compared with conventional 3-D-CNN models, the proposed 3-D-LWNet has a deeper network structure, fewer parameters, and a lower computation cost.
Our model achieves competitive performance for HSI classification compared to several state-of-the-art methods.
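One standard way to make a 3-D CNN deeper yet lighter is depthwise-separable 3-D convolution. The block below is shown only as a generic illustration of the parameter savings; the abstract does not specify 3-D-LWNet's layers, so this is not the paper's architecture.

```python
# Generic parameter-saving 3-D block (depthwise-separable convolution),
# illustrating how a "lightweight" 3-D CNN can cut parameters; a common
# trick, not necessarily 3-D-LWNet's actual design.
import torch
import torch.nn as nn

class SeparableConv3d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv3d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

dense = nn.Conv3d(32, 64, 3, padding=1)
light = SeparableConv3d(32, 64)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dense), count(light))      # separable block has far fewer weights
x = torch.randn(1, 32, 8, 16, 16)      # (batch, channels, bands, H, W)
print(light(x).shape)
```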
arXiv Detail & Related papers (2020-12-07T03:44:35Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed from high-level features at the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
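Classic graph-based label propagation, which the summary alludes to, can be sketched as below; the RBF graph construction and fixed iteration count are illustrative assumptions, not X-ModalNet's exact updatable-graph procedure.

```python
# Minimal label-propagation sketch on a feature-similarity graph, showing
# the general mechanism of spreading labels from a few labeled samples;
# graph construction and iteration count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 16))     # high-level features for 100 samples
labels = -np.ones(100, dtype=int)          # -1 marks unlabeled samples
labels[:10] = rng.integers(0, 3, 10)       # a handful of labeled seeds, 3 classes

# Row-normalized affinity graph from feature similarity (RBF kernel).
d2 = ((feats[:, None] - feats[None]) ** 2).sum(-1)
W = np.exp(-d2 / d2.mean())
np.fill_diagonal(W, 0.0)
P = W / W.sum(1, keepdims=True)

seeds = np.zeros((100, 3))
seeds[labels >= 0, labels[labels >= 0]] = 1.0   # one-hot known labels
scores = seeds.copy()
for _ in range(50):                        # propagate, clamping the seeds
    scores = P @ scores
    scores[labels >= 0] = seeds[labels >= 0]

print(scores.argmax(1)[:20])               # propagated pseudo-labels
```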
arXiv Detail & Related papers (2020-06-24T15:29:41Z)
- Weakly-supervised land classification for coastal zone based on deep
convolutional neural networks by incorporating dual-polarimetric
characteristics into training dataset [1.125851164829582]
We explore the performance of deep convolutional neural networks (DCNNs) on semantic segmentation using spaceborne polarimetric synthetic aperture radar (PolSAR) datasets.
The semantic segmentation task using PolSAR data can be categorized as weakly supervised learning when the characteristics of SAR data and the data annotation procedures are factored in.
Three DCNN models, namely SegNet, U-Net, and LinkNet, are then implemented.
arXiv Detail & Related papers (2020-03-30T17:32:49Z)