Related papers: A Novel Multi-scale Attention Feature Extraction Block for Aerial Remote Sensing Image Classification

URL: http://arxiv.org/abs/2308.14076v1
Date: Sun, 27 Aug 2023 11:49:46 GMT
Title: A Novel Multi-scale Attention Feature Extraction Block for Aerial Remote Sensing Image Classification
Authors: Chiranjibi Sitaula, Jagannath Aryal and Avik Bhattacharya
Abstract summary: We propose a novel plug-and-play multi-scale attention feature extraction block (MSAFEB) based on multi-scale convolution at two levels with skip connection. The experimental study on two benchmark VHR aerial RS image datasets (AID and NWPU) demonstrates that our proposal achieves a stable/consistent performance (minimum standard deviation of $0.002$) and competent overall classification performance (AID: 95.85% and NWPU: 94.09%).
Score: 9.388978548253755
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Classification of very high-resolution (VHR) aerial remote sensing (RS) images is a well-established research area in the remote sensing community as it provides valuable spatial information for decision-making. Existing works on VHR aerial RS image classification produce an excellent classification performance; nevertheless, they have a limited capability to well-represent VHR RS images having complex and small objects, thereby leading to performance instability. As such, we propose a novel plug-and-play multi-scale attention feature extraction block (MSAFEB) based on multi-scale convolution at two levels with skip connection, producing discriminative/salient information at a deeper/finer level. The experimental study on two benchmark VHR aerial RS image datasets (AID and NWPU) demonstrates that our proposal achieves a stable/consistent performance (minimum standard deviation of $0.002$) and competent overall classification performance (AID: 95.85% and NWPU: 94.09%).

Related papers

FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system. We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain. To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB). We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z)
Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks. MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization. MUCA utilizes a Cross-Teacher-Student attention mechanism to guide the student network in constructing more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution [0.5461938536945723]
We propose an advanced feature extraction module called Channel and Spatial Attention Feature Extraction (CSA-FE). Our proposed method helps the model focus on the specific channels and spatial locations containing high-frequency information so that the model can focus on relevant features and suppress irrelevant ones. Our model achieved superior performance compared to various existing models.
arXiv Detail & Related papers (2024-05-07T18:15:51Z)
Enhanced Multi-level Features for Very High Resolution Remote Sensing Scene Classification [1.4502611532302039]
Experimental results on two widely-used VHR RS datasets show that the proposed approach yields a competitive and stable/robust classification performance with the least standard deviation of 0.001. The highest overall accuracies on the AID and the NWPU datasets are 95.39% and 93.04%, respectively.
arXiv Detail & Related papers (2023-05-01T06:21:35Z)
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477]
Transformer-based models demonstrated outstanding potential for learning high-order contextual relationships from natural images with general resolution (224x224 pixels). We propose a complex self-attention (CSA) mechanism to model the high-order contextual information with less than half the computation of naive SA. By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z)
Supervised classification methods applied to airborne hyperspectral images: Comparative study using mutual information [0.0]
This paper investigates the performance of four supervised learning algorithms, namely, Support Vector Machines (SVM), Random Forest (RF), K-Nearest Neighbors (KNN) and Linear Discriminant Analysis (LDA). The experiments have been performed on three real hyperspectral datasets taken from NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the Reflective Optics System Imaging Spectrometer (ROSIS) sensors.
arXiv Detail & Related papers (2022-10-27T13:39:08Z)
An Empirical Study of Remote Sensing Pretraining [117.90699699469639]
We conduct an empirical study of remote sensing pretraining (RSP) on aerial images. RSP can help deliver distinctive performances in scene recognition tasks. RSP mitigates the data discrepancies of traditional ImageNet pretraining on RS images, but it may still suffer from task discrepancies.
arXiv Detail & Related papers (2022-04-06T13:38:11Z)
High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR). We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors. To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network. Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
Lightweight Single-Image Super-Resolution Network with Attentive Auxiliary Feature Learning [73.75457731689858]
We develop a computationally efficient yet accurate network based on the proposed attentive auxiliary features (A$^2$F) for SISR. Experimental results on a large-scale dataset demonstrate the effectiveness of the proposed model against the state-of-the-art (SOTA) SR methods.
arXiv Detail & Related papers (2020-11-13T06:01:46Z)
PolSAR Image Classification Based on Robust Low-Rank Feature Extraction and Markov Random Field [44.59934840513234]
We present a novel PolSAR image classification method, which removes speckle noise via low-rank (LR) feature extraction and enforces smoothness priors via Markov random field (MRF). Experimental results indicate that the proposed method achieves promising classification performance and preferable spatial consistency.
arXiv Detail & Related papers (2020-09-13T07:38:12Z)
Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.