Online Invariance Selection for Local Feature Descriptors
- URL: http://arxiv.org/abs/2007.08988v3
- Date: Thu, 23 Jul 2020 15:16:23 GMT
- Title: Online Invariance Selection for Local Feature Descriptors
- Authors: Rémi Pautrat, Viktor Larsson, Martin R. Oswald and Marc Pollefeys
- Abstract summary: A limitation of current feature descriptors is the trade-off between generalization and discriminative power.
We propose to overcome this limitation with a disentanglement of invariance in local descriptors and with an online selection of the most appropriate invariance given the context.
We demonstrate that our method can boost the performance of current descriptors and outperforms state-of-the-art descriptors in several matching tasks.
- Score: 93.32949876169785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To be invariant, or not to be invariant: that is the question formulated in
this work about local descriptors. A limitation of current feature descriptors
is the trade-off between generalization and discriminative power: more
invariance means less informative descriptors. We propose to overcome this
limitation with a disentanglement of invariance in local descriptors and with
an online selection of the most appropriate invariance given the context. Our
framework consists of a joint learning of multiple local descriptors with
different levels of invariance and of meta descriptors encoding the regional
variations of an image. The similarity of these meta descriptors across images
is used to select the right invariance when matching the local descriptors. Our
approach, named Local Invariance Selection at Runtime for Descriptors (LISRD),
enables descriptors to adapt to adverse changes in images, while remaining
discriminative when invariance is not required. We demonstrate that our method
can boost the performance of current descriptors and outperforms
state-of-the-art descriptors in several matching tasks, when evaluated on
challenging datasets with day-night illumination as well as viewpoint changes.
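The core selection mechanism of the abstract — comparing meta descriptors across two images to decide which invariance level to use for matching — can be sketched as follows. This is a minimal illustration, not the paper's actual interface: the variant names, toy vectors, and the choice of cosine similarity are assumptions for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def select_invariance(meta_a, meta_b):
    """Pick the invariance variant whose regional meta descriptors agree
    most across the two images; the local descriptors of that variant
    are then used for matching."""
    return max(meta_a, key=lambda name: cosine(meta_a[name], meta_b[name]))

# Toy example: the rotation-variant meta descriptors agree across the two
# images, so that (more discriminative) variant is selected for matching.
meta_img1 = {"rot_invariant": [1.0, 0.0, 0.0], "rot_variant": [1.0, 1.0, 0.0]}
meta_img2 = {"rot_invariant": [0.0, 1.0, 0.0], "rot_variant": [1.0, 1.0, 0.1]}
print(select_invariance(meta_img1, meta_img2))  # rot_variant
```

The design point illustrated here is that the decision is made online, per image pair: when the adverse change (e.g. rotation or illumination) is absent, the less invariant, more informative descriptors are kept.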
Related papers
- Feature Aligning Few-shot Learning Method Using Local Descriptors Weighted Rules [0.0]
Few-shot classification involves identifying new categories using a limited number of labeled samples.
This paper proposes a Feature Aligning Few-shot Learning Method Using Local Descriptors Weighted Rules (FAFD-LDWR).
It innovatively introduces a cross-normalization method into few-shot image classification to preserve the discriminative information of local descriptors as much as possible, and it enhances classification performance by aligning key local descriptors of the support and query sets to remove background noise.
arXiv Detail & Related papers (2024-08-26T11:36:38Z)
- A Simple Task-aware Contrastive Local Descriptor Selection Strategy for Few-shot Learning between inter class and intra class [6.204356280380338]
Few-shot image classification aims to classify novel classes with few labeled samples.
Recent research indicates that deep local descriptors have better representational capabilities.
This paper proposes a novel task-aware contrastive local descriptor selection network (TCDSNet).
arXiv Detail & Related papers (2024-08-12T07:04:52Z)
- Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between the two images, which risks obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z)
- Context-aware Difference Distilling for Multi-change Captioning [106.72151597074098]
Multi-change captioning aims to describe complex and coupled changes within an image pair in natural language.
We propose a novel context-aware difference distilling network to capture all genuine changes for yielding sentences.
arXiv Detail & Related papers (2024-05-31T14:07:39Z)
- TALDS-Net: Task-Aware Adaptive Local Descriptors Selection for Few-shot Image Classification [6.204356280380338]
Few-shot image classification aims to classify images from unseen novel classes with few samples.
Recent works demonstrate that deep local descriptors exhibit enhanced representational capabilities compared to image-level features.
We propose a novel Task-Aware Adaptive Local Descriptors Selection Network (TALDS-Net).
arXiv Detail & Related papers (2023-12-09T03:33:14Z)
- Text Descriptions are Compressive and Invariant Representations for Visual Learning [63.3464863723631]
We show that an alternative approach, in line with humans' understanding of multiple visual features per class, can provide compelling performance in the robust few-shot learning setting.
In particular, we introduce a novel method, SLR-AVD (Sparse Logistic Regression using Augmented Visual Descriptors).
This method first automatically generates multiple visual descriptions of each class via a large language model (LLM), then uses a VLM to translate these descriptions into a set of visual feature embeddings for each image, and finally uses sparse logistic regression to select a relevant subset of these features to classify each image.
arXiv Detail & Related papers (2023-07-10T03:06:45Z)
- Learning Rotation-Equivariant Features for Visual Correspondence [41.79256655501003]
We introduce a self-supervised learning framework to extract discriminative rotation-invariant descriptors.
Thanks to employing group-equivariant CNNs, our method effectively learns to obtain rotation-equivariant features and their orientations explicitly.
Our method demonstrates state-of-the-art matching accuracy among existing rotation-invariant descriptors under varying rotation.
arXiv Detail & Related papers (2023-03-25T13:42:07Z)
- Visual Classification via Description from Large Language Models [23.932495654407425]
Vision-language models (VLMs) have shown promising performance on a variety of recognition tasks.
We present an alternative framework for classification with VLMs, which we call classification by description.
arXiv Detail & Related papers (2022-10-13T17:03:46Z)
- VLAD-VSA: Cross-Domain Face Presentation Attack Detection with Vocabulary Separation and Adaptation [87.9994254822078]
For face presentation attack detection (PAD), most of the spoofing cues are subtle, local image patterns.
The VLAD aggregation method is adopted to quantize local features with a visual vocabulary that locally partitions the feature space.
The proposed vocabulary separation method divides the vocabulary into domain-shared and domain-specific visual words.
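As a rough illustration of the VLAD aggregation step mentioned above (a generic sketch of standard VLAD, not this paper's implementation; the toy vocabulary and features are hypothetical):

```python
def vlad_aggregate(features, vocabulary):
    """Aggregate local features into a VLAD vector: assign each feature to
    its nearest visual word and accumulate the residuals per word."""
    dim = len(vocabulary[0])
    residuals = [[0.0] * dim for _ in vocabulary]
    for feat in features:
        # nearest visual word by squared Euclidean distance
        k = min(range(len(vocabulary)),
                key=lambda i: sum((a - b) ** 2
                                  for a, b in zip(feat, vocabulary[i])))
        for d in range(dim):
            residuals[k][d] += feat[d] - vocabulary[k][d]
    # flatten per-word residuals into one descriptor
    # (the usual L2 normalization is omitted for brevity)
    return [x for r in residuals for x in r]

# Toy example: two visual words in 2-D, three local features
vocab = [[0.0, 0.0], [10.0, 10.0]]
feats = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
print(vlad_aggregate(feats, vocab))  # [1.0, 1.0, -1.0, 0.0]
```

Because residuals are accumulated per visual word, each word contributes an independent slice of the final descriptor, which is what makes a per-word (domain-shared vs. domain-specific) split of the vocabulary possible.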
arXiv Detail & Related papers (2022-02-21T15:27:41Z)
- Same Features, Different Day: Weakly Supervised Feature Learning for Seasonal Invariance [65.94499390875046]
"Like night and day" is a commonly used expression to imply that two things are completely different.
The aim of this paper is to provide a dense feature representation that can be used to perform localization, sparse matching or image retrieval.
We propose Deja-Vu, a weakly supervised approach to learning season invariant features that does not require pixel-wise ground truth data.
arXiv Detail & Related papers (2020-03-30T12:56:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.