Related papers: Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF

Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF

URL: http://arxiv.org/abs/2006.10178v3
Date: Mon, 15 Mar 2021 17:11:08 GMT
Title: Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF
Authors: Atanas Mirchev, Baris Kayalibay, Patrick van der Smagt and Justin Bayer
Abstract summary: We solve the problem of 6-DoF localisation and 3D dense reconstruction in spatial environments as approximate Bayesian inference in a deep state-space model. This results in an expressive predictive model of the world, often missing in current state-of-the-art visual SLAM solutions. We evaluate our approach on realistic unmanned aerial vehicle flight data, nearing the performance of state-of-the-art visual-inertial odometry systems.
Score: 17.698319441265223
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We solve the problem of 6-DoF localisation and 3D dense reconstruction in spatial environments as approximate Bayesian inference in a deep state-space model. Our approach leverages both learning and domain knowledge from multiple-view geometry and rigid-body dynamics. This results in an expressive predictive model of the world, often missing in current state-of-the-art visual SLAM solutions. The combination of variational inference, neural networks and a differentiable raycaster ensures that our model is amenable to end-to-end gradient-based optimisation. We evaluate our approach on realistic unmanned aerial vehicle flight data, nearing the performance of state-of-the-art visual-inertial odometry systems. We demonstrate the applicability of the model to generative prediction and planning.

Related papers

Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds.<n>Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures.<n>We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z)
An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training [50.71892161377806]
DFIT-OccWorld is an efficient 3D occupancy world model that leverages decoupled dynamic flow and image-assisted training strategy. Our model forecasts future dynamic voxels by warping existing observations using voxel flow, whereas static voxels are easily obtained through pose transformation.
arXiv Detail & Related papers (2024-12-18T12:10:33Z)
3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions. We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression. Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
VortSDF: 3D Modeling with Centroidal Voronoi Tesselation on Signed Distance Field [5.573454319150408]
We introduce a volumetric optimization framework that combines explicit SDF fields with a shallow color network, in order to estimate 3D shape properties over tetrahedral grids. Experimental results with Chamfer statistics validate this approach with unprecedented reconstruction quality on various scenarios such as objects, open scenes or human.
arXiv Detail & Related papers (2024-07-29T09:46:39Z)
Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first work to demonstrate the utility of Diffusion Model-based novel-view-synthesizers in enhancing RGB 6D pose estimation at category-level. The outlined method shows reduction in data requirements, removal of the necessity of depth information in zero-shot category-level 6D pose estimation task, and increased performance, quantitatively demonstrated through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z)
SeisFusion: Constrained Diffusion Model with Input Guidance for 3D Seismic Data Interpolation and Reconstruction [26.02191880837226]
We propose a novel diffusion model reconstruction framework tailored for 3D seismic data. We introduce a 3D neural network architecture into the diffusion model, successfully extending the 2D diffusion model to 3D space. Our method exhibits superior reconstruction accuracy when applied to both field datasets and synthetic datasets.
arXiv Detail & Related papers (2024-03-18T05:10:13Z)
Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans. Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to state-of-the-art. LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement to the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
Stereo Neural Vernier Caliper [57.187088191829886]
We propose a new object-centric framework for learning-based stereo 3D object detection. We tackle a problem of how to predict a refined update given an initial 3D cuboid guess. Our approach achieves state-of-the-art performance on the KITTI benchmark.
arXiv Detail & Related papers (2022-03-21T14:36:07Z)
A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups. We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
Tracking and Planning with Spatial World Models [17.698319441265223]
We introduce a method for real-time navigation and tracking with differentiably rendered world models. We achieve up to 92% navigation success rate at a frequency of 15 Hz using only image and depth observations.
arXiv Detail & Related papers (2022-01-25T14:16:46Z)
Extracting Global Dynamics of Loss Landscape in Deep Learning Models [0.0]
We present a toolkit for the Dynamical Organization Of Deep Learning Loss Landscapes, or DOODL3. DOODL3 formulates the training of neural networks as a dynamical system, analyzes the learning process, and presents an interpretable global view of trajectories in the loss landscape.
arXiv Detail & Related papers (2021-06-14T18:07:05Z)
Iterative Optimisation with an Innovation CNN for Pose Refinement [17.752556490937092]
In this work we propose an approach, namely an Innovation CNN, to object pose estimation refinement. Our approach improves initial pose estimation progressively by applying the Innovation CNN iteratively in a gradient descent framework. We evaluate our method on the popular LINEMOD and Occlusion LINEMOD datasets and obtain state-of-the-art performance on both datasets.
arXiv Detail & Related papers (2021-01-22T00:12:12Z)
Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image. We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.