MTLDesc: Looking Wider to Describe Better
- URL: http://arxiv.org/abs/2203.07003v1
- Date: Mon, 14 Mar 2022 11:16:05 GMT
- Title: MTLDesc: Looking Wider to Describe Better
- Authors: Changwei Wang, Rongtao Xu, Yuyang Zhang, Shibiao Xu, Weiliang Meng,
Bin Fan, Xiaopeng Zhang
- Abstract summary: We focus on making local descriptors "look wider to describe better".
We resort to context augmentation and spatial attention mechanisms to make our MTLDesc obtain non-local awareness.
Our MTLDesc significantly surpasses the prior state-of-the-art local descriptors on HPatches, Aachen Day-Night localization and InLoc indoor localization benchmarks.
- Score: 21.81401301082768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Limited by the locality of convolutional neural networks, most existing local
feature description methods only learn local descriptors with local
information and lack awareness of global and surrounding spatial context. In
this work, we focus on making local descriptors "look wider to describe better"
by learning local Descriptors with More Than just Local information (MTLDesc).
Specifically, we resort to context augmentation and spatial attention
mechanisms to make our MTLDesc obtain non-local awareness. First, Adaptive
Global Context Augmented Module and Diverse Local Context Augmented Module are
proposed to construct robust local descriptors with context information from
global to local. Second, Consistent Attention Weighted Triplet Loss is designed
to integrate spatial attention awareness into both the optimization and matching
stages of local descriptor learning. Third, Local Features Detection with
Feature Pyramid is given to obtain more stable and accurate keypoint
localization. With the above innovations, the performance of our MTLDesc
significantly surpasses the prior state-of-the-art local descriptors on
HPatches, Aachen Day-Night localization and InLoc indoor localization
benchmarks.
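The abstract names an attention-weighted triplet loss but does not give its formula. As a rough illustration of the general idea only, the sketch below weights each triplet's hinge loss by an attention score and normalizes by the total weight. The function names, the `margin` default, and the weighting scheme are all assumptions made for this example, not the paper's Consistent Attention Weighted Triplet Loss.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attention_weighted_triplet_loss(anchors, positives, negatives, weights, margin=1.0):
    """Weighted mean of per-triplet hinge losses.

    `weights` holds one hypothetical, non-negative attention score per
    triplet; this weighting is an assumption for illustration and is not
    taken from the MTLDesc paper.
    """
    if not (len(anchors) == len(positives) == len(negatives) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    loss = 0.0
    for a, p, n, w in zip(anchors, positives, negatives, weights):
        # Standard triplet hinge: pull the positive closer than the negative by `margin`.
        hinge = max(0.0, margin + l2_distance(a, p) - l2_distance(a, n))
        loss += w * hinge
    return loss / total_weight


# Toy usage: the second triplet violates the margin and carries more attention weight.
anchors = [[0.0, 0.0], [0.0, 0.0]]
positives = [[0.0, 0.0], [3.0, 4.0]]
negatives = [[3.0, 4.0], [0.0, 0.0]]
weights = [1.0, 3.0]
print(attention_weighted_triplet_loss(anchors, positives, negatives, weights))
```

In this toy setup, triplets with higher attention scores contribute more to the loss, which is one plausible way to bring spatial attention into optimization; how the paper applies attention at matching time is not covered here.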
Related papers
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
- LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers [60.51925353387151]
We propose a novel module named Local Context Propagation (LCP) to exploit the message passing between neighboring local regions.
We use the overlap points of adjacent local regions as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers.
The proposed method is applicable to different tasks and outperforms various transformer-based methods in benchmarks including 3D shape classification and dense prediction tasks.
arXiv Detail & Related papers (2022-10-23T15:43:01Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization [38.376238216214524]
Weakly supervised object localization (WSOL) aims to learn an object localizer using only image-level labels.
We propose a novel framework built upon the transformer, termed LCTR, which aims to enhance the local perception capability of global features.
arXiv Detail & Related papers (2021-12-10T01:48:40Z)
- An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning [77.72330187258498]
We propose a novel Entropy-guided Reinforced Partial Convolutional Network (ERPCNet).
ERPCNet extracts and aggregates localities based on semantic relevance and visual correlations without human-annotated regions.
It not only discovers global-cooperative localities dynamically but also converges faster for policy gradient optimization.
arXiv Detail & Related papers (2021-11-03T11:13:13Z)
- Capturing Structural Locality in Non-parametric Language Models [85.94669097485992]
We propose a simple yet effective approach for adding locality information into non-parametric language models.
Experiments on two different domains, Java source code and Wikipedia text, demonstrate that locality features improve model efficacy.
arXiv Detail & Related papers (2021-10-06T15:53:38Z)
- LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition [31.105598103211825]
We show that an additional training signal (local consistency loss) can guide the network to learn local features which are consistent across revisits.
We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
arXiv Detail & Related papers (2021-09-17T03:32:43Z)
- Local Context Attention for Salient Object Segmentation [5.542044768017415]
We propose a novel Local Context Attention Network (LCANet) to generate locally reinforced feature maps in a uniform representational architecture.
The proposed network introduces an Attentional Correlation Filter (ACF) module to generate explicit local attention by calculating the correlation feature map between coarse prediction and global context.
Comprehensive experiments are conducted on several salient object segmentation datasets, demonstrating the superior performance of the proposed LCANet against the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-24T09:20:06Z)
- Learning Local Features with Context Aggregation for Visual Localization [24.167882373322957]
Keypoint detection and description is fundamental yet important in many vision applications.
Most existing methods use detect-then-describe or detect-and-describe strategy to learn local features without considering their context information.
In this paper, we focus on the fusion of low-level textual information and high-level semantic context information to improve the discriminativeness of local features.
arXiv Detail & Related papers (2020-05-26T17:19:06Z)
- LRC-Net: Learning Discriminative Features on Point Clouds by Encoding Local Region Contexts [65.79931333193016]
We present a novel Local-Region-Context Network (LRC-Net) to learn discriminative features on point clouds.
LRC-Net encodes fine-grained contexts inside and among local regions simultaneously.
Results show LRC-Net is competitive with state-of-the-art methods in shape classification and shape segmentation applications.
arXiv Detail & Related papers (2020-03-18T14:34:08Z)
- Ground Texture Based Localization Using Compact Binary Descriptors [12.160708336715489]
Ground texture based localization is a promising approach to achieve high-accuracy positioning of vehicles.
We present a self-contained method that can be used for global localization as well as for subsequent local localization updates.
arXiv Detail & Related papers (2020-02-25T17:31:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.