Fusion and Grouping Strategies in Deep Learning for Local Climate Zone Classification of Multimodal Remote Sensing Data
- URL: http://arxiv.org/abs/2603.04562v1
- Date: Wed, 04 Mar 2026 19:47:13 GMT
- Title: Fusion and Grouping Strategies in Deep Learning for Local Climate Zone Classification of Multimodal Remote Sensing Data
- Authors: Ancymol Thomas, Jaya Sreevalsan-Nair
- Abstract summary: Local Climate Zones (LCZs) give a zoning map to study urban structures and land use. Data fusion is significant for improving accuracy owing to the data complexity. This study analyzes different fusion strategies in the multi-class LCZ classification models.
- Score: 1.3607388598209322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local Climate Zones (LCZs) give a zoning map to study urban structures and land use and analyze the impact of urbanization on local climate. Multimodal remote sensing enables LCZ classification, for which data fusion is significant for improving accuracy owing to the data complexity. However, there is a gap in a comprehensive analysis of the fusion mechanisms used in their deep learning (DL) classifier architectures. This study analyzes different fusion strategies in the multi-class LCZ classification models for multimodal data and grouping strategies based on inherent data characteristics. The different models involving Convolutional Neural Networks (CNNs) include: (i) baseline hybrid fusion (FM1), (ii) with self- and cross-attention mechanisms (FM2), (iii) with the multi-scale Gaussian filtered images (FM3), and (iv) weighted decision-level fusion (FM4). Ablation experiments are conducted to study the pixel-, feature-, and decision-level fusion effects in the model performance. Grouping strategies include band grouping (BG) within the data modalities and label merging (LM) in the ground truth. Our analysis is exclusively done on the So2Sat LCZ42 dataset, which consists of Synthetic Aperture Radar (SAR) and Multispectral Imaging (MSI) image pairs. Our results show that FM1 consistently outperforms simple fusion methods. FM1 with BG and LM is found to be the most effective approach among all fusion strategies, giving an overall accuracy of 76.6%. Importantly, our study highlights the effect of these strategies in improving prediction accuracy for the underrepresented classes. Our code and processed datasets are available at https://github.com/GVCL/LCZC-MultiModalHybridFusion
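The weighted decision-level fusion named in the abstract (FM4) can be illustrated with a minimal sketch: each modality gets its own classifier, and the per-class probabilities are blended with a scalar weight before taking the arg-max. The function names, the three-class toy example, and the fixed weight below are assumptions made for illustration only; the authors' actual implementation is in the linked repository and may differ.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_decision_fusion(
    sar_probs: Sequence[float],
    msi_probs: Sequence[float],
    sar_weight: float = 0.5,  # hypothetical weight; would be tuned or learned
) -> List[float]:
    """Blend per-class probabilities from a SAR branch and an MSI branch."""
    if len(sar_probs) != len(msi_probs):
        raise ValueError("both branches must score the same set of classes")
    if not 0.0 <= sar_weight <= 1.0:
        raise ValueError("sar_weight must lie in [0, 1]")
    msi_weight = 1.0 - sar_weight
    return [sar_weight * s + msi_weight * m for s, m in zip(sar_probs, msi_probs)]


def predict_class(probs: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=lambda i: probs[i])


if __name__ == "__main__":
    # Toy probabilities for three classes (not real LCZ outputs).
    sar = [0.7, 0.2, 0.1]
    msi = [0.1, 0.6, 0.3]
    fused = weighted_decision_fusion(sar, msi, sar_weight=0.25)
    print(fused, predict_class(fused))
```

With the toy inputs above and the MSI branch weighted more heavily, the fused prediction follows the MSI branch and selects class index 1, even though the SAR branch alone would select class index 0.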
Related papers
- Multiview Manifold Evidential Fusion for PolSAR Image Classification [51.41332458376411]
We propose a new framework to integrate PolSAR manifold learning and evidence fusion into a unified architecture. Experiments on three real-world PolSAR datasets demonstrate that the proposed method consistently outperforms existing approaches in accuracy, robustness, and interpretability.
arXiv Detail & Related papers (2025-10-13T09:05:51Z)
- TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation [65.74990259650984]
We introduce TerraFM, a scalable self-supervised learning model that leverages globally distributed Sentinel-1 and Sentinel-2 imagery. Our training strategy integrates local-global contrastive learning and introduces a dual-centering mechanism. TerraFM achieves strong generalization on both classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
arXiv Detail & Related papers (2025-06-06T17:59:50Z)
- FTA-FTL: A Fine-Tuned Aggregation Federated Transfer Learning Scheme for Lithology Microscopic Image Classification [4.245694283697248]
This study involves two phases; the first is to conduct lithology microscopic image classification on a small dataset using transfer learning. In the second phase, we formulated the classification task as a Federated Transfer Learning scheme and proposed a Fine-Tuned Aggregation strategy for Federated Learning (FTA-FTL). The results are in excellent agreement and confirm the efficiency of the proposed scheme, showing that the proposed FTA-FTL algorithm achieves approximately the same results as the centralized implementation for the lithology microscopic image classification task.
arXiv Detail & Related papers (2025-01-06T19:32:14Z)
- Band Prompting Aided SAR and Multi-Spectral Data Fusion Framework for Local Climate Zone Classification [20.71392764471532]
Local climate zone (LCZ) classification is of great value for understanding the complex interactions between urban development and local climate. Recent studies have increasingly focused on the fusion of synthetic aperture radar (SAR) and multi-spectral data to improve LCZ classification performance. In this paper, a novel band prompting aided data fusion framework is proposed for LCZ classification, namely BP-LCZ. The experimental results demonstrate the effectiveness and superiority of the proposed data fusion framework.
arXiv Detail & Related papers (2024-12-24T07:40:07Z)
- PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model [83.35198885088093]
PolSAR data presents unique challenges due to its rich and complex characteristics. Existing data representations, such as complex-valued data, polarimetric features, and amplitude images, are widely used. Most feature extraction networks for PolSAR are small, limiting their ability to capture features effectively. We propose the Polarimetric Scattering Mechanism-Informed SAM (PolSAM), an enhanced Segment Anything Model (SAM) that integrates domain-specific scattering characteristics and a novel prompt generation strategy.
arXiv Detail & Related papers (2024-12-17T09:59:53Z)
- Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation [7.265569559979736]
High-quality annotated remote sensing images are often isolated and distributed across institutions. The issue of remote sensing data islands poses challenges for fully utilizing isolated datasets to train a global model. We propose a novel Geographic heterogeneity-aware Federated learning (GeoFed) framework to bridge data islands in RSS. Our framework consists of three modules, including the Global Insight Enhancement (GIE) module, the Essential Feature Mining (EFM) module and the Local-Global Balance (LoGo) module.
arXiv Detail & Related papers (2024-04-14T15:58:35Z)
- In the Search for Optimal Multi-view Learning Models for Crop Classification with Global Remote Sensing Data [5.143097874851516]
We use the CropHarvest dataset for validation, which provides optical, radar, weather time series, and topographic information as input data.
We suggest identifying the optimal encoder architecture tailored for a particular fusion strategy, and then determining the most suitable fusion strategy for the classification task.
arXiv Detail & Related papers (2024-03-25T09:49:42Z)
- A Theoretical Analysis of Self-Supervised Learning for Vision Transformers [66.08606211686339]
Masked autoencoders (MAE) and contrastive learning (CL) capture different types of representations. We study the training dynamics of one-layer softmax-based vision transformers (ViTs) on both MAE and CL objectives.
arXiv Detail & Related papers (2024-03-04T17:24:03Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- Classification of Hyperspectral and LiDAR Data Using Coupled CNNs [39.55503477017984]
We propose an efficient framework to fuse hyperspectral and Light Detection And Ranging (LiDAR) data using two coupled convolutional neural networks (CNNs).
One CNN is designed to learn spectral-spatial features from hyperspectral data, the other is used to capture the elevation information from LiDAR data.
In the fusion phase, feature-level and decision-level fusion methods are simultaneously used to integrate these heterogeneous features.
arXiv Detail & Related papers (2020-02-04T06:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.