Long-Term Invariant Local Features via Implicit Cross-Domain
Correspondences
- URL: http://arxiv.org/abs/2311.03345v1
- Date: Mon, 6 Nov 2023 18:53:01 GMT
- Title: Long-Term Invariant Local Features via Implicit Cross-Domain
Correspondences
- Authors: Zador Pataki, Mohammad Altillawi, Menelaos Kanakis, Rémi Pautrat,
Fengyi Shen, Ziyuan Liu, Luc Van Gool, and Marc Pollefeys
- Abstract summary: We conduct a thorough analysis of the performance of current state-of-the-art feature extraction networks under various domain changes.
We propose a novel data-centric method, Implicit Cross-Domain Correspondences (iCDC)
iCDC represents the same environment with multiple Neural Radiance Fields, each fitting the scene under individual visual domains.
- Score: 79.21515035128832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern learning-based visual feature extraction networks perform
well in intra-domain localization; however, their performance declines
significantly when image pairs are captured across long-term visual domain
variations, such as seasonal and daytime changes. In this paper, our first
contribution is a benchmark to investigate the performance impact of long-term
variations on visual localization. We conduct a thorough analysis of the
performance of current state-of-the-art feature extraction networks under
various domain changes and find a significant performance gap between intra-
and cross-domain localization. We investigate different methods to close this
gap by improving the supervision of modern feature extractor networks. We
propose a novel data-centric method, Implicit Cross-Domain Correspondences
(iCDC). iCDC represents the same environment with multiple Neural Radiance
Fields, each fitting the scene under individual visual domains. It utilizes the
underlying 3D representations to generate accurate correspondences across
different long-term visual conditions. Our proposed method enhances
cross-domain localization performance, significantly reducing the performance
gap. When evaluated on popular long-term localization benchmarks, our trained
networks consistently outperform existing methods. This work serves as a
substantial stride toward more robust visual localization pipelines for
long-term deployments, and opens up research avenues in the development of
long-term invariant descriptors.
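The abstract describes the correspondence mechanism only at a high level. Below is a minimal sketch of how two per-domain NeRFs could yield cross-domain pixel correspondences via depth reprojection, assuming both reconstructions are registered to a single shared world frame; the function names and the depth-reprojection formulation are our illustration, not the paper's actual pipeline.

```python
import numpy as np

def unproject(u, v, depth, K, cam_to_world):
    """Lift pixel (u, v), with depth rendered from the domain-A NeRF,
    into world coordinates."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # camera-frame ray at z = 1
    p_cam = ray * depth                             # scale along the ray by depth
    return cam_to_world[:3, :3] @ p_cam + cam_to_world[:3, 3]

def project(p_world, K, world_to_cam):
    """Project a world point into a domain-B camera; returns pixel coords."""
    p = world_to_cam[:3, :3] @ p_world + world_to_cam[:3, 3]
    uv = K @ (p / p[2])                             # perspective division
    return uv[:2]

def cross_domain_match(u, v, depth_a, K_a, pose_a, K_b, extr_b):
    """Pixel in a domain-A image -> corresponding pixel in a domain-B image.

    depth_a: depth at (u, v) rendered from the NeRF fit to domain A.
    pose_a:  4x4 camera-to-world matrix of the domain-A view.
    extr_b:  4x4 world-to-camera matrix of the domain-B view.
    """
    return project(unproject(u, v, depth_a, K_a, pose_a), K_b, extr_b)
```

Per the abstract, supervising a feature extractor with such pixel-accurate cross-domain matches is what reduces the intra- versus cross-domain performance gap.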
Related papers
- Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models [3.072340427031969]
Few-shot action recognition (FSAR) aims to learn a model capable of identifying novel actions in videos using only a few examples.
By assuming that the base dataset seen during meta-training and the novel dataset used for evaluation can come from different domains, cross-domain few-shot learning alleviates data collection and annotation costs.
We systematically evaluate existing state-of-the-art single-domain, transfer-based, and cross-domain FSAR methods on new cross-domain tasks.
arXiv Detail & Related papers (2024-06-03T07:48:18Z)
- Contrastive Domain Adaptation for Time-Series via Temporal Mixup [14.723714504015483]
We propose a novel lightweight contrastive domain adaptation framework called CoTMix for time-series data.
Specifically, we propose a novel temporal mixup strategy to generate two intermediate augmented views for the source and target domains.
Our approach significantly outperforms all state-of-the-art UDA methods.
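The summary names a temporal mixup that produces two intermediate views; one plausible reading is sketched below in PyTorch, blending each domain's sequences with a temporally smoothed version of the other domain's. The mixing ratio, smoothing window, and AvgPool-based smoothing are our assumptions, not the paper's exact recipe.

```python
import torch

def temporal_mixup(x_src, x_trg, lam=0.8, window=5):
    """Create source- and target-dominant intermediate views.

    x_src, x_trg: (batch, channels, time) tensors from the two domains.
    lam: mixing ratio; > 0.5 keeps the dominant domain's identity.
    window: temporal averaging window applied to the non-dominant
            sequence (our assumption, not the paper's exact value).
    """
    # Smooth the non-dominant sequence over a temporal neighborhood.
    smooth = torch.nn.AvgPool1d(window, stride=1, padding=window // 2)
    src_dominant = lam * x_src + (1 - lam) * smooth(x_trg)
    trg_dominant = lam * x_trg + (1 - lam) * smooth(x_src)
    return src_dominant, trg_dominant
```

In the contrastive part of such a framework, each intermediate view would be pulled toward its dominant domain's representation.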
arXiv Detail & Related papers (2022-12-03T06:53:38Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by these observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
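The domain-adversarial training mentioned in the summary is conventionally implemented with a gradient reversal layer; the generic sketch below shows that standard construction, not AFAN's specific architecture.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the
    backward pass, so the feature extractor learns domain-confusing features."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def domain_adversarial_loss(features, domain_labels, domain_classifier, lam=1.0):
    """Cross-entropy of a domain classifier on gradient-reversed features."""
    reversed_feats = GradReverse.apply(features, lam)
    logits = domain_classifier(reversed_feats)
    return torch.nn.functional.cross_entropy(logits, domain_labels)
```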
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
- STA-VPR: Spatio-temporal Alignment for Visual Place Recognition [17.212503755962757]
We propose an adaptive dynamic time warping algorithm to align local features from the spatial domain while measuring the distance between two images.
A local matching DTW algorithm is applied to perform image sequence matching based on temporal alignment.
The results show that the proposed method significantly improves over CNN-based methods.
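The adaptive DTW variant itself is not detailed in the summary; as a reference point, the textbook dynamic time warping recurrence over two sequences of local features looks as follows (the cosine cost is our choice).

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Plain dynamic time warping between two sequences of feature vectors
    (the paper uses an adaptive variant; this is the textbook recurrence).

    seq_a: (m, d) array, seq_b: (n, d) array.
    """
    m, n = len(seq_a), len(seq_b)
    # Pairwise cosine distances between local features.
    a = seq_a / np.linalg.norm(seq_a, axis=1, keepdims=True)
    b = seq_b / np.linalg.norm(seq_b, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T

    acc = np.full((m + 1, n + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[m, n]
```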
arXiv Detail & Related papers (2021-03-25T03:27:42Z)
- Domain Adaptation of Learned Features for Visual Localization [60.6817896667435]
We tackle the problem of visual localization under changing conditions, such as time of day, weather, and seasons.
Recent learned local features based on deep neural networks have shown superior performance over classical hand-crafted local features.
We present a novel and practical approach, where only a few examples are needed to reduce the domain gap.
arXiv Detail & Related papers (2020-08-21T05:17:32Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation [62.29076080124199]
This paper proposes a novel coarse-to-fine feature adaptation approach to cross-domain object detection.
At the coarse-grained stage, foreground regions are extracted by adopting the attention mechanism, and aligned according to their marginal distributions.
At the fine-grained stage, we conduct conditional distribution alignment of foregrounds by minimizing the distance of global prototypes with the same category but from different domains.
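A generic sketch of the class-conditional prototype alignment described here: per-class mean features from each domain are pulled together with a squared distance. The batch-wise estimator below is our simplification; the paper maintains global prototypes across batches.

```python
import torch

def prototype_alignment_loss(feats_src, labels_src, feats_trg, labels_trg,
                             num_classes):
    """Mean squared distance between per-class prototypes (mean features)
    of the source and target domains; classes absent from a batch are skipped.
    """
    losses = []
    for c in range(num_classes):
        src_c = feats_src[labels_src == c]
        trg_c = feats_trg[labels_trg == c]
        if len(src_c) == 0 or len(trg_c) == 0:
            continue  # class missing in one domain's batch
        proto_src = src_c.mean(dim=0)
        proto_trg = trg_c.mean(dim=0)
        losses.append(((proto_src - proto_trg) ** 2).sum())
    return torch.stack(losses).mean() if losses else feats_src.new_zeros(())
```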
arXiv Detail & Related papers (2020-03-23T13:40:06Z)