Rotation Invariant Aerial Image Retrieval with Group Convolutional
Metric Learning
- URL: http://arxiv.org/abs/2010.09202v1
- Date: Mon, 19 Oct 2020 04:12:36 GMT
- Authors: Hyunseung Chung, Woo-Jeoung Nam, Seong-Whan Lee
- Abstract summary: We introduce a novel method for retrieving aerial images that combines group convolution with an attention mechanism and metric learning.
Results show that the proposed method outperforms other state-of-the-art retrieval methods in both rotated and original environments.
- Score: 21.89786914625517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote sensing image retrieval (RSIR) is the process of ranking database
images depending on the degree of similarity compared to the query image. As
the complexity of RSIR increases due to the diversity in shooting range, angle,
and location of remote sensors, there is an increasing demand for methods to
address these issues and improve retrieval performance. In this work, we
introduce a novel method for retrieving aerial images that combines group
convolution with an attention mechanism and metric learning, resulting in
robustness to rotational variations. For refinement and emphasis on important
features, we applied channel attention in each group convolution stage. By
utilizing the characteristics of group convolution and channel-wise attention,
the network can recognize rotated images of the same location as equivalent.
The training procedure has two main steps: (i) training the
network on the Aerial Image Dataset (AID) for classification, and (ii) fine-tuning
the network with a triplet loss for retrieval on the Google Earth South Korea and
NWPU-RESISC45 datasets. Results show that the proposed method
outperforms other state-of-the-art retrieval methods in both rotated and original
environments. Furthermore, we use class activation maps (CAM) to visualize
how the main features of our method differ from those of the baseline, a
difference that leads to better adaptability in rotated environments.
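The ideas named in the abstract (pooling over a rotation group, a triplet loss, and ranking by similarity) can be illustrated with a small NumPy sketch. This is not the authors' network: it assumes the four-element group of 90-degree rotations, uses fixed random filters instead of learned group convolutions with channel attention, and the names `c4_invariant_descriptor`, `triplet_loss`, `rank_database`, and the margin of 0.2 are hypothetical choices made only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(image, kernel):
    """'Valid' 2-D cross-correlation of a single-channel image with a kernel."""
    windows = sliding_window_view(image, kernel.shape)  # (H-k+1, W-k+1, k, k)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def c4_invariant_descriptor(image, kernels):
    """Descriptor that is unchanged when the image is rotated by 90 degrees.

    Each kernel is applied in its four 90-degree orientations. Rotating the
    image only permutes the four spatially pooled responses, so taking the
    maximum over orientations removes the dependence on rotation.
    """
    features = []
    for kernel in kernels:
        pooled = [
            np.maximum(correlate_valid(image, np.rot90(kernel, k)), 0.0).mean()
            for k in range(4)
        ]
        features.append(max(pooled))
    vector = np.asarray(features)
    return vector / (np.linalg.norm(vector) + 1e-12)  # L2-normalise


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances (margin is assumed)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


def rank_database(query, database):
    """Indices of database descriptors, ordered from most to least similar."""
    distances = np.linalg.norm(database - query, axis=1)
    return np.argsort(distances)


# Toy data: random filters and random "images" stand in for real inputs.
rng = np.random.default_rng(0)
kernels = rng.standard_normal((8, 3, 3))
image = rng.random((16, 16))
other = rng.random((16, 16))

desc = c4_invariant_descriptor(image, kernels)
desc_rot = c4_invariant_descriptor(np.rot90(image), kernels)
desc_other = c4_invariant_descriptor(other, kernels)

database = np.stack([desc_other, desc_rot])
ranking = rank_database(desc, database)
loss = triplet_loss(desc, desc_rot, desc_other)
```

In this sketch the rotated copy of the query maps to the same descriptor as the original (up to floating-point rounding), so it is ranked ahead of the unrelated image. In a learned model, the triplet loss would be minimised over many (anchor, positive, negative) triplets during fine-tuning rather than evaluated once.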
Related papers
- PreCM: The Padding-based Rotation Equivariant Convolution Mode for Semantic Segmentation [10.74841255987162]
In this paper, we numerically construct the padding-based rotation equivariant convolution mode (PreCM).
PreCM can be used not only for multi-scale images and convolution kernels, but also as a replacement component for multiple convolutions.
Experiments show that PreCM-based networks can achieve better segmentation performance than the original and data augmentation-based networks.
(2024-11-03T16:26:55Z)
- Deep Learning Based Speckle Filtering for Polarimetric SAR Images. Application to Sentinel-1 [51.404644401997736]
We propose a complete framework to remove speckle in polarimetric SAR images using a convolutional neural network.
Experiments show that the proposed approach offers exceptional results in both speckle reduction and resolution preservation.
(2024-08-28T10:07:17Z)
- CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
(2024-02-29T15:05:11Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
(2023-12-19T08:14:14Z)
- Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process.
We propose a new scheme based on a hyper-network that is tailored to the task and facilitates swift adjustment to user feedback.
We show that our method can attain SoTA results in few-shot one-class classification and reach comparable results on the binary classification task of few-shot open-set recognition.
(2023-12-18T10:20:28Z)
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match the images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA).
(2023-12-16T04:50:34Z)
- Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising.
We propose a rank-enhanced low-dimensional convolution set (Re-ConvSet).
We then incorporate Re-ConvSet into the widely used U-Net architecture to construct an HS image denoising method.
(2022-07-09T13:35:12Z)
- Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information [15.32353270625554]
Cross-modal remote sensing text-image retrieval (RSCTIR) has recently become a research hotspot because it enables fast and flexible information extraction from remote sensing (RS) images.
We first propose a novel RSCTIR framework based on global and local information (GaLR), and design a multi-level information dynamic fusion (MIDF) module to effectively integrate features of different levels.
Experiments on public datasets demonstrate the state-of-the-art performance of GaLR on the RSCTIR task.
(2022-04-21T03:18:09Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
(2021-10-26T06:20:31Z)
- Learning Test-time Augmentation for Content-based Image Retrieval [42.188013259368766]
Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks.
Existing image retrieval approaches require fine-tuning or modification of pre-trained networks to adapt to variations unique to the target data.
Our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented at test-time, with augmentations guided by a policy learned through reinforcement learning.
(2020-02-05T05:08:41Z)
- A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching [24.089374888914143]
We propose a novel method to precisely match two aerial images that were obtained in different environments via a two-stream deep network.
By internally augmenting the target image, the two-stream network takes three input images and incorporates the additional augmented pair during training.
(2020-02-04T14:38:18Z)