Related papers: Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor

Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor

URL: http://arxiv.org/abs/2307.04520v1
Date: Mon, 10 Jul 2023 12:41:55 GMT
Title: Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor
Authors: San Jiang, Yichen Ma, Qingquan Li, Wanshou Jiang, Bingxuan Guo, Lelin Li, Lizhe Wang
Abstract summary: This paper proposes an efficient match pair retrieval method and implements an integrated workflow for parallel SfM reconstruction. The proposed solution has been verified using three large-scale datasets.
Score: 9.402103660431791
License: http://creativecommons.org/licenses/by/4.0/
Abstract: SfM (Structure from Motion) has been extensively used for UAV (Unmanned Aerial Vehicle) image orientation. Its efficiency is directly influenced by feature matching. Although image retrieval has been extensively used for match pair selection, high computational costs are consumed due to a large number of local features and the large size of the used codebook. Thus, this paper proposes an efficient match pair retrieval method and implements an integrated workflow for parallel SfM reconstruction. First, an individual codebook is trained online by considering the redundancy of UAV images and local features, which avoids the ambiguity of training codebooks from other datasets. Second, local features of each image are aggregated into a single high-dimension global descriptor through the VLAD (Vector of Locally Aggregated Descriptors) aggregation by using the trained codebook, which remarkably reduces the number of features and the burden of nearest neighbor searching in image indexing. Third, the global descriptors are indexed via the HNSW (Hierarchical Navigable Small World) based graph structure for the nearest neighbor searching. Match pairs are then retrieved by using an adaptive threshold selection strategy and utilized to create a view graph for divide-and-conquer based parallel SfM reconstruction. Finally, the performance of the proposed solution has been verified using three large-scale UAV datasets. The test results demonstrate that the proposed solution accelerates match pair retrieval with a speedup ratio ranging from 36 to 108 and improves the efficiency of SfM reconstruction with competitive accuracy in both relative and absolute orientation.

Related papers

UAVPairs: A Challenging Benchmark for Match Pair Retrieval of Large-scale UAV Images [8.607887740177802]
This paper contributes a benchmark dataset, UAVPairs, and a training pipeline designed for match pair retrieval of large-scale UAV images.<n>The UAVPairs dataset, comprising 21,622 high-resolution images across 30 diverse scenes, is constructed.<n>The effectiveness of the UAVPairs dataset and training pipeline is validated through comprehensive experiments on three distinct large-scale UAV datasets.
arXiv Detail & Related papers (2025-05-28T08:21:05Z)
Fast Feature Matching of UAV Images via Matrix Band Reduction-based GPU Data Schedule [5.104096700315428]
The proposed algorithm is an efficient solution for feature matching of UAV images.<n>It increases the efficiency of feature matching with speedup ratios ranging from 77.0 to 100.0 compared with KD-Tree based matching methods.
arXiv Detail & Related papers (2025-05-28T08:12:12Z)
Efficient Visual State Space Model for Image Deblurring [99.54894198086852]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.<n>We propose a simple yet effective visual state space model (EVSSM) for image deblurring.<n>The proposed EVSSM performs favorably against state-of-the-art methods on benchmark datasets and real-world images.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
AIR-HLoc: Adaptive Retrieved Images Selection for Efficient Visual Localisation [8.789742514363777]
State-of-the-art hierarchical localisation pipelines (HLoc) employ image retrieval (IR) to establish 2D-3D correspondences. This paper investigates the relationship between global and local descriptors. We propose an adaptive strategy that adjusts $k$ based on the similarity between the query's global descriptor and those in the database.
arXiv Detail & Related papers (2024-03-27T06:17:21Z)
Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval [92.13664084464514]
The task of composed image retrieval (CIR) aims to retrieve images based on the query image and the text describing the users' intent. Existing methods have made great progress with the advanced large vision-language (VL) model in CIR task, however, they generally suffer from two main issues: lack of labeled triplets for model training and difficulty of deployment on resource-restricted environments. We propose Image2Sentence based Asymmetric zero-shot composed image retrieval (ISA), which takes advantage of the VL model and only relies on unlabeled images for composition learning.
arXiv Detail & Related papers (2024-03-03T07:58:03Z)
Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take source image, user guidance and previously predicted mask as the input. We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
Graph Convolution Based Efficient Re-Ranking for Visual Retrieval [29.804582207550478]
We present an efficient re-ranking method which refines initial retrieval results by updating features. Specifically, we reformulate re-ranking based on Graph Convolution Networks (GCN) and propose a novel Graph Convolution based Re-ranking (GCR) for visual retrieval tasks via feature propagation. In particular, the plain GCR is extended for cross-camera retrieval and an improved feature propagation formulation is presented to leverage affinity relationships across different cameras.
arXiv Detail & Related papers (2023-06-15T00:28:08Z)
Parallel Structure from Motion for UAV Images via Weighted Connected Dominating Set [5.17395782758526]
This paper proposes an algorithm to extract the global model for cluster merging and designs a parallel SfM solution to achieve efficient and accurate UAV image orientation. The experimental results demonstrate that the proposed parallel SfM can achieve 17.4 times efficiency improvement and comparative orientation accuracy.
arXiv Detail & Related papers (2022-06-23T06:53:06Z)
Reuse your features: unifying retrieval and feature-metric alignment [3.845387441054033]
DRAN is the first network able to produce the features for the three steps of visual localization. It achieves competitive performance in terms of robustness and accuracy under challenging conditions in public benchmarks.
arXiv Detail & Related papers (2022-04-13T10:42:00Z)
Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects [70.49392581592089]
We tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images. We follow a retrieval-based strategy and prevent the network from learning object-specific features. Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works.
arXiv Detail & Related papers (2022-03-16T08:53:00Z)
Combined Depth Space based Architecture Search For Person Re-identification [70.86236888223569]
We aim to design a lightweight and suitable network for person re-identification (ReID) We propose a novel search space called Combined Depth Space (CDS), based on which we search for an efficient network architecture, which we call CDNet. We then propose a low-cost search strategy named the Top-k Sample Search strategy to make full use of the search space and avoid trapping in local optimal result.
arXiv Detail & Related papers (2021-04-09T02:40:01Z)
Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers [115.90778814368703]
Our objective is language-based search of large-scale image and video datasets. For this task, the approach that consists of independently mapping text and vision to a joint embedding space, a.k.a. dual encoders, is attractive as retrieval scales. An alternative approach of using vision-text transformers with cross-attention gives considerable improvements in accuracy over the joint embeddings.
arXiv Detail & Related papers (2021-03-30T17:57:08Z)
Image Retrieval for Structure-from-Motion via Graph Convolutional Network [13.040952255039702]
We present a novel retrieval method based on Graph Convolutional Network (GCN) to generate accurate pairwise matches without costly redundancy. By constructing a subgraph surrounding the query image as input data, we adopt a learnable GCN to exploit whether nodes in the subgraph have overlapping regions with the query photograph. Experiments demonstrate that our method performs remarkably well on the challenging dataset of highly ambiguous and duplicated scenes.
arXiv Detail & Related papers (2020-09-17T04:03:51Z)
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment. Our framework significantly outperforms state-of-the-art by6.5%mAP scores on Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.