Multimodal Remote Sensing Benchmark Datasets for Land Cover
Classification with A Shared and Specific Feature Learning Model
- URL: http://arxiv.org/abs/2105.10196v1
- Date: Fri, 21 May 2021 08:14:21 GMT
- Title: Multimodal Remote Sensing Benchmark Datasets for Land Cover
Classification with A Shared and Specific Feature Learning Model
- Authors: Danfeng Hong, Jingliang Hu, Jing Yao, Jocelyn Chanussot, and Xiao Xiang Zhu
- Abstract summary: We propose a shared and specific feature learning (S2FL) model that decomposes multimodal RS data into modality-shared and modality-specific components.
To better assess multimodal baselines and the newly-proposed S2FL model, three multimodal RS benchmark datasets, i.e., Houston2013 -- hyperspectral and multispectral data, Berlin -- hyperspectral and synthetic aperture radar (SAR) data, Augsburg -- hyperspectral, SAR, and digital surface model (DSM) data, are released and used for land cover classification.
- Score: 36.993630058695345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As remote sensing (RS) data acquired by different sensors become widely
and openly available, multimodal data processing and analysis techniques have
been garnering increasing interest in the RS and geoscience communities. However,
due to the gaps between modalities in terms of imaging sensors, resolutions,
and contents, embedding their complementary information into a consistent,
compact, accurate, and discriminative representation remains largely
challenging. To this end, we propose a shared and specific feature learning
(S2FL) model. S2FL decomposes multimodal RS data into modality-shared and
modality-specific components, enabling more effective blending of information
across modalities, particularly for
heterogeneous data sources. Moreover, to better assess multimodal baselines and
the newly-proposed S2FL model, three multimodal RS benchmark datasets, i.e.,
Houston2013 -- hyperspectral and multispectral data, Berlin -- hyperspectral
and synthetic aperture radar (SAR) data, Augsburg -- hyperspectral, SAR, and
digital surface model (DSM) data, are released and used for land cover
classification. Extensive experiments conducted on the three datasets
demonstrate the superiority and advancement of our S2FL model in the task of
land cover classification in comparison with previously-proposed
state-of-the-art baselines. Furthermore, the baseline codes and datasets used
in this paper will be made freely available at
https://github.com/danfenghong/ISPRS_S2FL.
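The repository above provides the authors' full implementation; purely as an illustration of the shared/specific idea described in the abstract, the sketch below factorizes each modality's feature matrix into a code shared across modalities plus a modality-specific code via naive alternating least squares. All names, dimensions, and the solver here are assumptions for exposition, not the authors' S2FL formulation.

```python
import numpy as np

def shared_specific_decompose(X_list, d_shared=10, d_specific=5, n_iter=50, seed=0):
    """Toy shared/specific factorization (illustrative only, not the paper's S2FL).

    Each modality X_m (features x samples) is approximated as
        X_m ~= W_s[m] @ H_s + W_p[m] @ H_p[m],
    where H_s is shared across all modalities and H_p[m] is specific to
    modality m. Solved with naive alternating least squares; the regularizers
    and constraints of the real model are omitted for brevity.
    """
    rng = np.random.default_rng(seed)
    n = X_list[0].shape[1]  # all modalities observe the same pixels/samples
    H_s = rng.standard_normal((d_shared, n))
    H_p = [rng.standard_normal((d_specific, n)) for _ in X_list]
    W_s = [None] * len(X_list)
    W_p = [None] * len(X_list)
    for _ in range(n_iter):
        # Update per-modality loadings given the current codes.
        for m, X in enumerate(X_list):
            H = np.vstack([H_s, H_p[m]])
            W = X @ np.linalg.pinv(H)
            W_s[m], W_p[m] = W[:, :d_shared], W[:, d_shared:]
        # Update the shared code jointly from all modalities' residuals.
        A = np.vstack(W_s)
        R = np.vstack([X - W_p[m] @ H_p[m] for m, X in enumerate(X_list)])
        H_s = np.linalg.pinv(A) @ R
        # Update each modality-specific code from its own residual.
        for m, X in enumerate(X_list):
            H_p[m] = np.linalg.pinv(W_p[m]) @ (X - W_s[m] @ H_s)
    return H_s, H_p
```

Under these assumptions, the per-sample shared code H_s (optionally concatenated with the modality-specific codes) would then be fed to any standard classifier for land cover classification, mirroring the idea of fusing modality-shared and modality-specific features.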
Related papers
- SpecSAR-Former: A Lightweight Transformer-based Network for Global LULC Mapping Using Integrated Sentinel-1 and Sentinel-2 [13.17346252861919]
We introduce the Dynamic World+ dataset, which expands the authoritative multispectral dataset Dynamic World.
To facilitate the combination of multispectral and SAR data, we propose a lightweight transformer architecture termed SpecSAR-Former.
Our network outperforms existing transformer and CNN-based models, achieving a mean Intersection over Union (mIoU) of 59.58%, an Overall Accuracy (OA) of 79.48%, and an F1 Score of 71.68% with only 26.70M parameters.
arXiv Detail & Related papers (2024-10-04T22:53:25Z) - Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$^2$Det is a framework for unified and universal multi-dataset training for 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z) - A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain performance increases.
arXiv Detail & Related papers (2024-08-05T23:20:32Z) - MergeOcc: Bridge the Domain Gap between Different LiDARs for Robust Occupancy Prediction [8.993992124170624]
MergeOcc is developed to simultaneously handle different LiDARs by leveraging multiple datasets.
The effectiveness of MergeOcc is validated through experiments on two prominent datasets for autonomous vehicles.
arXiv Detail & Related papers (2024-03-13T13:23:05Z) - GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for
Remote Sensing Data [27.63411386396492]
This paper introduces a new benchmark dataset for multi-modal semantic segmentation based on RGB-Height (RGB-H) data.
The proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.
arXiv Detail & Related papers (2023-05-24T09:03:18Z) - MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based
Self-Supervised Pre-Training [58.07391711548269]
We propose the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z) - Shared Manifold Learning Using a Triplet Network for Multiple Sensor
Translation and Fusion with Missing Data [2.452410403088629]
We propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold.
The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space so that samples of the same class from each heterogeneous modality are mapped close to one another.
arXiv Detail & Related papers (2022-10-25T20:22:09Z) - Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer
Learning [67.40866334083941]
We propose an end-to-end 3-D lightweight convolutional neural network (CNN) for HSI classification with limited training samples.
Compared with conventional 3-D-CNN models, the proposed 3-D-LWNet has a deeper network structure, fewer parameters, and lower computational cost.
Our model achieves competitive performance for HSI classification compared to several state-of-the-art methods.
arXiv Detail & Related papers (2020-12-07T03:44:35Z) - X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z) - Weakly-supervised land classification for coastal zone based on deep
convolutional neural networks by incorporating dual-polarimetric
characteristics into training dataset [1.125851164829582]
We explore the performance of DCNNs on semantic segmentation using spaceborne polarimetric synthetic aperture radar (PolSAR) datasets.
The semantic segmentation task using PolSAR data can be categorized as weakly supervised learning when the characteristics of SAR data and data annotating procedures are factored in.
Three DCNN models, SegNet, U-Net, and LinkNet, are then implemented.
arXiv Detail & Related papers (2020-03-30T17:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed content and is not responsible for any consequences arising from its use.