Learning Structure-from-Motion with Graph Attention Networks
- URL: http://arxiv.org/abs/2308.15984v3
- Date: Sat, 18 May 2024 22:44:57 GMT
- Title: Learning Structure-from-Motion with Graph Attention Networks
- Authors: Lucas Brynte, José Pedro Iglesias, Carl Olsson, Fredrik Kahl
- Abstract summary: We tackle the problem of learning Structure-from-Motion (SfM) through the use of graph attention networks.
In this work we learn a model that takes as input the 2D keypoints detected across multiple views, and outputs the corresponding camera poses and 3D keypoint coordinates.
Our model takes advantage of graph neural networks to learn SfM-specific primitives, and we show that it can be used for fast inference of the reconstruction for new and unseen sequences.
- Score: 23.87562683118926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we tackle the problem of learning Structure-from-Motion (SfM) through the use of graph attention networks. SfM is a classic computer vision problem that is solved through iterative minimization of reprojection errors, referred to as Bundle Adjustment (BA), starting from a good initialization. In order to obtain a good enough initialization to BA, conventional methods rely on a sequence of sub-problems (such as pairwise pose estimation, pose averaging or triangulation) which provide an initial solution that can then be refined using BA. In this work we replace these sub-problems by learning a model that takes as input the 2D keypoints detected across multiple views, and outputs the corresponding camera poses and 3D keypoint coordinates. Our model takes advantage of graph neural networks to learn SfM-specific primitives, and we show that it can be used for fast inference of the reconstruction for new and unseen sequences. The experimental results show that the proposed model outperforms competing learning-based methods, and challenges COLMAP while having lower runtime. Our code is available at https://github.com/lucasbrynte/gasfm/.
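The reprojection error that Bundle Adjustment minimizes can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 3x4 pinhole camera matrices, and the observation layout below are assumptions chosen for illustration only.

```python
import numpy as np

def project(P, X):
    """Project 3D points X of shape (N, 3) with a 3x4 camera matrix P.

    Returns the (N, 2) image coordinates after perspective division.
    """
    X_h = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous 3D points
    x_h = X_h @ P.T                                 # homogeneous image points
    return x_h[:, :2] / x_h[:, 2:3]

def mean_reprojection_error(cameras, points, observations):
    """Average distance between observed and predicted 2D keypoints.

    observations is a list of (camera index, point index, (u, v)) tuples,
    a hypothetical layout used only for this sketch.
    """
    errors = []
    for cam_idx, pt_idx, uv in observations:
        pred = project(cameras[cam_idx], points[pt_idx:pt_idx + 1])[0]
        errors.append(np.linalg.norm(pred - np.asarray(uv, dtype=float)))
    return float(np.mean(errors))
```

BA iteratively adjusts the camera matrices and 3D points to drive a quantity of this kind down. According to the abstract, the learned model supplies the camera poses and 3D keypoint coordinates directly from the 2D keypoints, replacing the conventional initialization sub-problems.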
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly [21.497180110855975]
We introduce DiffAssemble, a Graph Neural Network (GNN)-based architecture that learns to solve reassembly tasks.
Our method treats the elements of a set, whether pieces of 2D patch or 3D object fragments, as nodes of a spatial graph.
We highlight its remarkable reduction in run-time, performing 11 times faster than the quickest optimization-based method for puzzle solving.
arXiv Detail & Related papers (2024-02-29T16:09:12Z)
- Fine Structure-Aware Sampling: A New Sampling Training Scheme for Pixel-Aligned Implicit Models in Single-View Human Reconstruction [98.30014795224432]
We introduce Fine Structured-Aware Sampling (FSS) to train pixel-aligned implicit models for single-view human reconstruction.
FSS proactively adapts to the thickness and complexity of surfaces.
It also proposes a mesh thickness loss signal for pixel-aligned implicit models.
arXiv Detail & Related papers (2024-02-29T14:26:46Z)
- Determination of the critical points for systems of directed percolation class using machine learning [0.0]
We use CNN and DBSCAN to determine the critical points for the directed bond percolation (bond DP) model and the Domany-Kinzel cellular automaton (DK) model.
Our results from both algorithms show that, even for very small lattice sizes, the machine can predict the critical points accurately for both models.
arXiv Detail & Related papers (2023-07-19T20:58:12Z)
- Generalizing Neural Human Fitting to Unseen Poses With Articulated SE(3) Equivariance [48.39751410262664]
ArtEq is a part-based SE(3)-equivariant neural architecture for SMPL model estimation from point clouds.
Experimental results show that ArtEq generalizes to poses not seen during training, outperforming state-of-the-art methods by 44% in terms of body reconstruction accuracy.
arXiv Detail & Related papers (2023-04-20T17:58:26Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [37.81077373162092]
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision.
We present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses.
arXiv Detail & Related papers (2022-05-16T15:39:27Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Towards a method to anticipate dark matter signals with deep learning at the LHC [58.720142291102135]
We study several simplified dark matter (DM) models and their signatures at the LHC using neural networks.
We focus on the usual monojet plus missing transverse energy channel, but to train the algorithms we organize the data in 2D histograms instead of event-by-event arrays.
This results in a large performance boost to distinguish between standard model (SM) only and SM plus new physics signals.
arXiv Detail & Related papers (2021-05-25T15:38:13Z)
- A generalized quadratic loss for SVM and Deep Neural Networks [0.0]
We consider some supervised binary classification tasks and a regression task, where SVM and Deep Learning, at present, exhibit the best generalization performance.
We extend the work [3] on a generalized quadratic loss for learning problems that examines pattern correlations in order to concentrate the learning problem into input space regions where patterns are more densely distributed.
arXiv Detail & Related papers (2021-02-15T15:49:08Z)