Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments
- URL: http://arxiv.org/abs/2503.08843v1
- Date: Tue, 11 Mar 2025 19:29:28 GMT
- Title: Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments
- Authors: Rajitha de Silva, Jonathan Cox, Marija Popovic, Cesar Cadena, Cyrill Stachniss, Riccardo Polvara
- Abstract summary: We introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image. Our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust robot navigation in outdoor environments requires accurate perception systems capable of handling visual challenges such as repetitive structures and changing appearances. Visual feature matching is crucial to vision-based pipelines but remains particularly challenging in natural outdoor settings due to perceptual aliasing. We address this issue in vineyards, where repetitive vine trunks and other natural elements generate ambiguous descriptors that hinder reliable feature matching. We hypothesise that semantic information tied to keypoint positions can alleviate perceptual aliasing by enhancing keypoint descriptor distinctiveness. To this end, we introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image, enabling more accurate differentiation even among visually similar local features. We validate this approach in two vineyard perception tasks: (i) relative pose estimation and (ii) visual localisation. Across all tested keypoint types and descriptors, our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.
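The abstract does not say how the semantic information is combined with the descriptors, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a per-pixel semantic label map is available and appends a weighted one-hot class vector to each L2-normalised descriptor, so that visually identical features in different semantic regions stop looking identical to the matcher. The function names, the `weight` parameter, and the mutual-nearest-neighbour matcher are all hypothetical choices made for this example.

```python
import numpy as np


def add_semantics(descriptors, keypoints, label_map, num_classes, weight=0.5):
    """Append a weighted one-hot semantic vector to each descriptor.

    descriptors: (N, D) float array with non-zero rows (assumed).
    keypoints:   (N, 2) array of (x, y) pixel coordinates.
    label_map:   (H, W) integer array of class ids in [0, num_classes).
    weight:      hypothetical knob balancing appearance against semantics.
    """
    h, w = label_map.shape
    # Look up the semantic class under each keypoint (nearest pixel, clamped).
    xs = np.clip(np.round(keypoints[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(keypoints[:, 1]).astype(int), 0, h - 1)
    labels = label_map[ys, xs]
    one_hot = np.eye(num_classes)[labels]
    # Normalise the appearance part, concatenate, then renormalise the result.
    unit = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)
    fused = np.hstack([unit, weight * one_hot])
    return fused / np.linalg.norm(fused, axis=1, keepdims=True)


def mutual_nearest_neighbours(desc_a, desc_b):
    """Return (i, j) pairs that are each other's best match by cosine similarity."""
    sim = desc_a @ desc_b.T
    best_b = sim.argmax(axis=1)  # best match in B for every row of A
    best_a = sim.argmax(axis=0)  # best match in A for every row of B
    return [(i, int(j)) for i, j in enumerate(best_b) if best_a[j] == i]
```

With two keypoints whose appearance descriptors are identical but which fall on different semantic classes, matching on the raw descriptors is ambiguous, whereas the fused descriptors separate them by class. How much weight the semantic part should carry is a tuning question that this sketch does not answer.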
Related papers
- RADA: Robust and Accurate Feature Learning with Domain Adaptation [7.905594146253435]
We introduce a multi-level feature aggregation network that incorporates two pivotal components to facilitate the learning of robust and accurate features. Our method, RADA, achieves excellent results in image matching, camera pose estimation, and visual localization tasks.
arXiv Detail & Related papers (2024-07-22T16:49:58Z)
- View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z)
- Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints [8.390939268280235]
Local feature extraction is a standard approach in computer vision for tackling important tasks such as image matching and retrieval.
We propose DALF, a novel deformation-aware network for jointly detecting and describing keypoints.
Our approach also enhances the performance of two real-world applications: deformable object retrieval and non-rigid 3D surface registration.
arXiv Detail & Related papers (2023-04-02T18:01:51Z)
- Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection [52.91782218300844]
We propose a novel Unsupervised Inconsistency-Aware method based on Vision Transformer, called UIA-ViT.
Due to the self-attention mechanism, the attention map among patch embeddings naturally represents the consistency relation, making the vision Transformer suitable for the consistency representation learning.
arXiv Detail & Related papers (2022-10-23T15:24:47Z)
- Kinship Verification Based on Cross-Generation Feature Interaction Learning [53.62256887837659]
Kinship verification from facial images has been recognized as an emerging yet challenging technique in computer vision applications.
We propose a novel cross-generation feature interaction learning (CFIL) framework for robust kinship verification.
arXiv Detail & Related papers (2021-09-07T01:50:50Z)
- Intriguing Properties of Vision Transformers [114.28522466830374]
Vision transformers (ViT) have demonstrated impressive performance across various machine vision problems.
We systematically study this question via an extensive set of experiments and comparisons with a high-performing convolutional neural network (CNN).
We show effective features of ViTs are due to flexible and dynamic receptive fields made possible via the self-attention mechanism.
arXiv Detail & Related papers (2021-05-21T17:59:18Z)
- Discriminative and Semantic Feature Selection for Place Recognition towards Dynamic Environments [12.973423183330961]
We propose a discriminative and semantic feature selection network, dubbed DSFeat.
Supervised by both semantic information and attention mechanism, we can estimate pixel-wise stability of features.
Notably, our proposal can be readily plugged into any feature-based SLAM system.
arXiv Detail & Related papers (2021-03-18T05:11:46Z)
- RoRD: Rotation-Robust Descriptors and Orthographic Views for Local Feature Matching [32.10261486751993]
We present a novel framework that combines learning of invariant descriptors through data augmentation and viewpoint projection.
We evaluate the effectiveness of the proposed approach on key tasks including pose estimation and visual place recognition.
arXiv Detail & Related papers (2021-03-15T17:40:25Z)
- Early Bird: Loop Closures from Opposing Viewpoints for Perceptually-Aliased Indoor Environments [35.663671249819124]
We present novel research that simultaneously addresses viewpoint change and perceptual aliasing.
We show that our integration of VPR with SLAM significantly boosts the performance of VPR, feature correspondence, and pose graph submodules.
For the first time, we demonstrate a localization system capable of state-of-the-art performance despite perceptual aliasing and extreme 180-degree-rotated viewpoint change.
arXiv Detail & Related papers (2020-10-03T20:18:55Z)
- Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition [86.25926461936412]
We propose a novel Adversarial Graph Representation Adaptation (AGRA) framework that unifies graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair experiments on several popular benchmarks and show that the proposed AGRA framework achieves superior performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T13:27:24Z)