NeRF-Supervised Feature Point Detection and Description
- URL: http://arxiv.org/abs/2403.08156v2
- Date: Tue, 30 Jul 2024 10:32:30 GMT
- Title: NeRF-Supervised Feature Point Detection and Description
- Authors: Ali Youssef, Francisco Vasconcelos
- Abstract summary: This paper presents a novel approach leveraging Neural Radiance Fields (NeRFs) to generate a diverse and realistic dataset consisting of indoor and outdoor scenes.
Our proposed methodology adapts state-of-the-art feature detectors and descriptors for training on multi-view NeRF-synthesised data, with supervision achieved through perspective projective geometry.
- Score: 2.7388340826497837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature point detection and description is the backbone for various computer vision applications, such as Structure-from-Motion, visual SLAM, and visual place recognition. While learning-based methods have surpassed traditional handcrafted techniques, their training often relies on simplistic homography-based simulations of multi-view perspectives, limiting model generalisability. This paper presents a novel approach leveraging Neural Radiance Fields (NeRFs) to generate a diverse and realistic dataset consisting of indoor and outdoor scenes. Our proposed methodology adapts state-of-the-art feature detectors and descriptors for training on multi-view NeRF-synthesised data, with supervision achieved through perspective projective geometry. Experiments demonstrate that the proposed methodology achieves competitive or superior performance on standard benchmarks for relative pose estimation, point cloud registration, and homography estimation while requiring significantly less training data and time compared to existing approaches.
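The supervision described in the abstract amounts to standard pinhole reprojection: since each NeRF-synthesised view comes with a known camera pose and a rendered depth map, a keypoint detected in one view can be warped into any other view to yield a ground-truth correspondence. The sketch below illustrates that warping step under those assumptions; the function and argument names (reproject_keypoints, T_src_to_tgt, etc.) are illustrative and not taken from the paper's code, and occlusion/visibility checks are omitted for brevity.

```python
import numpy as np

def reproject_keypoints(kps, depth, K, T_src_to_tgt):
    """Warp 2D keypoints from a source view into a target view.

    kps:          (N, 2) array of pixel coordinates (x, y) in the source image.
    depth:        (H, W) depth map rendered alongside the source view.
    K:            (3, 3) shared camera intrinsics.
    T_src_to_tgt: (4, 4) relative pose mapping source-camera coordinates
                  into the target camera frame.
    Returns an (N, 2) array of pixel coordinates in the target image.
    """
    # Back-project each keypoint to a 3D point in the source camera frame.
    x, y = kps[:, 0], kps[:, 1]
    z = depth[y.astype(int), x.astype(int)]                      # per-keypoint depth
    rays = np.linalg.inv(K) @ np.vstack([x, y, np.ones_like(x)]) # (3, N) unit-depth rays
    pts_src = rays * z                                           # (3, N) 3D points

    # Move the 3D points into the target camera frame.
    pts_h = np.vstack([pts_src, np.ones((1, pts_src.shape[1]))]) # (4, N) homogeneous
    pts_tgt = (T_src_to_tgt @ pts_h)[:3]

    # Project into the target image plane.
    proj = K @ pts_tgt
    return (proj[:2] / proj[2:]).T
```

In such a setup, the warped coordinates could serve as positive matches for descriptor training and as target locations for detector supervision; a planar homography warp, as used by earlier homography-based pipelines, is the degenerate case of this projection when scene depth is ignored.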
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Training Spatial-Frequency Visual Prompts and Probabilistic Clusters for Accurate Black-Box Transfer Learning [35.72926400167876]
This paper proposes a novel parameter-efficient transfer learning framework for vision recognition models in the black-box setting.
In experiments, our model demonstrates superior performance in a few-shot transfer learning setting across extensive visual recognition datasets.
arXiv Detail & Related papers (2024-08-15T05:35:52Z) - Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling [70.34875558830241]
We present a way of learning a spatio-temporal (4D) embedding, based on semantic gears, to allow for stratified modeling of dynamic regions of the scene.
At the same time, almost for free, our approach enables free-viewpoint tracking of objects of interest - a functionality not yet achieved by existing NeRF-based methods.
arXiv Detail & Related papers (2024-06-06T03:37:39Z) - Masked Modeling for Self-supervised Representation Learning on Vision and Beyond [69.64364187449773]
Masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training.
We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more.
We conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research.
arXiv Detail & Related papers (2023-12-31T12:03:21Z) - Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z) - Towards a Robust Framework for NeRF Evaluation [11.348562090906576]
We propose a new test framework which isolates the neural rendering network from the Neural Radiance Field (NeRF) pipeline.
We then perform a parametric evaluation by training and evaluating the NeRF on an explicit radiance field representation.
Our approach offers the potential to create a comparative objective evaluation framework for NeRF methods.
arXiv Detail & Related papers (2023-05-29T13:30:26Z) - Learning dynamics from partial observations with structured neural ODEs [5.757156314867639]
We propose a flexible framework to incorporate a broad spectrum of physical insight into neural ODE-based system identification.
We demonstrate the performance of the proposed approach on numerical simulations and on an experimental dataset from a robotic exoskeleton.
arXiv Detail & Related papers (2022-05-25T07:54:10Z) - Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, these attribution methods produce low-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
arXiv Detail & Related papers (2020-10-01T20:27:30Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z) - Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z) - Quantifying Challenges in the Application of Graph Representation Learning [0.0]
We provide an application oriented perspective to a set of popular embedding approaches.
We evaluate their representational power with respect to real-world graph properties.
Our results suggest that "one-to-fit-all" GRL approaches are hard to define in real-world scenarios.
arXiv Detail & Related papers (2020-06-18T03:19:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.