HyperPose: Camera Pose Localization using Attention Hypernetworks
- URL: http://arxiv.org/abs/2303.02610v1
- Date: Sun, 5 Mar 2023 08:45:50 GMT
- Title: HyperPose: Camera Pose Localization using Attention Hypernetworks
- Authors: Ron Ferens, Yosi Keller
- Abstract summary: We propose the use of attention hypernetworks in camera pose localization.
The proposed approach achieves superior results compared to state-of-the-art methods on contemporary datasets.
- Score: 6.700873164609009
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we propose the use of attention hypernetworks in camera pose
localization. The dynamic nature of natural scenes, including changes in
environment, perspective, and lighting, creates an inherent domain gap between
the training and test sets that limits the accuracy of contemporary
localization networks. To overcome this issue, we suggest a camera pose
regressor that integrates a hypernetwork. During inference, the hypernetwork
generates adaptive weights for the localization regression heads based on the
input image, effectively reducing the domain gap. We also suggest the use of a
Transformer-Encoder as the hypernetwork, instead of the common multilayer
perceptron, to derive an attention hypernetwork. The proposed approach achieves
superior results compared to state-of-the-art methods on contemporary datasets.
To the best of our knowledge, this is the first instance of using hypernetworks
in camera pose regression, as well as using Transformer-Encoders as
hypernetworks. We make our code publicly available.
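The abstract describes the architecture only at a high level; as a rough illustration, below is a minimal PyTorch sketch in which a Transformer-Encoder hypernetwork emits per-image weights for the position and orientation regression heads. All names, dimensions, and the patch-projection backbone are assumptions made for this sketch, not the authors' released implementation.

```python
# Minimal sketch of a hypernetwork-based pose regressor (assumptions, not
# the authors' code): a Transformer-Encoder acts as the attention
# hypernetwork and generates per-image regression-head weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionHyperPose(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        # Backbone stand-in: project non-overlapping 32x32 patches to tokens.
        self.backbone = nn.Conv2d(3, feat_dim, kernel_size=32, stride=32)
        # Transformer-Encoder acting as the (attention) hypernetwork.
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8,
                                           batch_first=True)
        self.hyper_encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Layers that emit the weights/biases of the pose regression heads:
        # position (x, y, z) and orientation quaternion (4-D).
        self.w_pos = nn.Linear(feat_dim, 3 * feat_dim)
        self.b_pos = nn.Linear(feat_dim, 3)
        self.w_ori = nn.Linear(feat_dim, 4 * feat_dim)
        self.b_ori = nn.Linear(feat_dim, 4)

    def forward(self, img):                                     # (B,3,224,224)
        tokens = self.backbone(img).flatten(2).transpose(1, 2)  # (B, N, D)
        ctx = self.hyper_encoder(tokens).mean(dim=1)            # (B, D)
        feat = tokens.mean(dim=1)                               # (B, D)
        # Per-image regression-head weights generated by the hypernetwork.
        W_p = self.w_pos(ctx).view(-1, 3, feat.size(-1))
        W_o = self.w_ori(ctx).view(-1, 4, feat.size(-1))
        pos = torch.bmm(W_p, feat.unsqueeze(-1)).squeeze(-1) + self.b_pos(ctx)
        ori = torch.bmm(W_o, feat.unsqueeze(-1)).squeeze(-1) + self.b_ori(ctx)
        return pos, F.normalize(ori, dim=-1)                    # unit quaternion
```

Calling AttentionHyperPose()(torch.randn(2, 3, 224, 224)) returns a batch of 3-D positions and unit quaternions; the point of the design is that W_p and W_o vary with the input image, which is how the generated weights adapt the regressor to each scene's appearance.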
Related papers
- ReGround: Improving Textual and Spatial Grounding at No Cost [12.944046673902415]
Spatial grounding often outweighs textual grounding due to the sequential flow from gated self-attention to cross-attention.
We demonstrate that such bias can be significantly mitigated without sacrificing accuracy in either grounding by simply rewiring the network architecture.
arXiv Detail & Related papers (2024-03-20T13:37:29Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Alignment-free HDR Deghosting with Semantics Consistent Transformer [76.91669741684173]
High dynamic range imaging aims to retrieve information from multiple low-dynamic range inputs to generate realistic output.
Existing methods often focus on the spatial misalignment across input frames caused by the foreground and/or camera motion.
We propose a novel alignment-free network, the Semantics Consistent Transformer (SCTNet), with both spatial and channel attention modules.
arXiv Detail & Related papers (2023-05-29T15:03:23Z)
- HyperE2VID: Improving Event-Based Video Reconstruction via Hypernetworks [16.432164340779266]
We propose HyperE2VID, a dynamic neural network architecture for event-based video reconstruction.
Our approach uses hypernetworks to generate per-pixel adaptive filters guided by a context fusion module.
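As a hedged sketch of the general per-pixel adaptive-filtering technique this summary names (not HyperE2VID's actual code), a small convolutional hypernetwork can predict a k x k filter at every pixel from a context tensor and apply it with unfold; the context tensor, shapes, and names below are assumptions.

```python
# Illustrative per-pixel dynamic filtering: a convolutional hypernetwork
# emits one k x k filter per spatial location. An assumption-laden sketch,
# not HyperE2VID's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerPixelFilter(nn.Module):
    def __init__(self, ctx_ch=32, k=3):
        super().__init__()
        self.k = k
        # Hypernetwork: predicts k*k filter taps at every spatial location.
        self.hyper = nn.Conv2d(ctx_ch, k * k, kernel_size=3, padding=1)

    def forward(self, x, ctx):                 # x: (B,1,H,W), ctx: (B,C,H,W)
        B, _, H, W = x.shape
        filt = F.softmax(self.hyper(ctx), dim=1)            # (B, k*k, H, W)
        patches = F.unfold(x, self.k, padding=self.k // 2)  # (B, k*k, H*W)
        out = (patches * filt.flatten(2)).sum(dim=1)        # (B, H*W)
        return out.view(B, 1, H, W)
```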
arXiv Detail & Related papers (2023-05-10T18:00:06Z)
- Magnitude Invariant Parametrizations Improve Hypernetwork Learning [0.0]
Hypernetworks are powerful neural networks that predict the parameters of another neural network.
Their training, however, typically converges far more slowly than that of non-hypernetwork models.
We identify a fundamental and previously unidentified problem that contributes to the challenge of training hypernetworks.
We present a simple solution to this problem using a revised hypernetwork formulation that we call Magnitude Invariant Parametrizations (MIP).
arXiv Detail & Related papers (2023-04-15T22:18:29Z)
- DLGSANet: Lightweight Dynamic Local and Global Self-Attention Networks for Image Super-Resolution [83.47467223117361]
We propose an effective lightweight dynamic local and global self-attention network (DLGSANet) to solve image super-resolution.
Motivated by the network designs of Transformers, we develop a simple yet effective multi-head dynamic local self-attention (MHDLSA) module to extract local features efficiently.
To complement these local features with global information, we develop a sparse global self-attention (SparseGSA) module to select the most useful similarity values.
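One common reading of "select the most useful similarity values" is top-k sparse attention, where only the largest scores per query survive the softmax. The sketch below shows that generic technique under assumed shapes; it is not DLGSANet's exact SparseGSA module.

```python
# Generic top-k sparse self-attention: keep only the largest similarity
# values per query before the softmax. An assumed reading of "selecting the
# most useful similarity values", not DLGSANet's exact SparseGSA code.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, topk=8):
    # q, k, v: (B, N, D) token embeddings; requires topk <= N.
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5    # (B, N, N)
    vals, idx = scores.topk(topk, dim=-1)                   # keep top-k keys
    sparse = torch.full_like(scores, float('-inf'))
    sparse.scatter_(-1, idx, vals)                          # mask the rest
    return F.softmax(sparse, dim=-1) @ v                    # (B, N, D)
```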
arXiv Detail & Related papers (2023-01-05T12:06:47Z)
- HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing [2.362412515574206]
HyperStyle learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space.
HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders.
arXiv Detail & Related papers (2021-11-30T18:56:30Z)
- Global and Local Alignment Networks for Unpaired Image-to-Image Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Because existing methods pay little attention to content changes, semantic information from source images degrades during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
- Molecule Property Prediction and Classification with Graph Hypernetworks [113.38181979662288]
We show that the replacement of the underlying networks with hypernetworks leads to a boost in performance.
A major difficulty in the application of hypernetworks is their lack of stability.
A recent work has tackled the training instability of hypernetworks in the context of error correcting codes.
arXiv Detail & Related papers (2020-02-01T16:44:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.