Fully Differentiable and Interpretable Model for VIO with 4 Trainable
Parameters
- URL: http://arxiv.org/abs/2109.12292v1
- Date: Sat, 25 Sep 2021 06:54:09 GMT
- Title: Fully Differentiable and Interpretable Model for VIO with 4 Trainable
Parameters
- Authors: Zexi Chen, Haozhe Du, Yiyi Liao, Yue Wang, Rong Xiong
- Abstract summary: Monocular visual-inertial odometry is a critical problem in robotics and autonomous driving.
In this paper, we propose a fully differentiable, interpretable, and lightweight monocular VIO model that contains only 4 trainable parameters.
Experimental results on synthetic and real-world datasets demonstrate that our simple approach is competitive with state-of-the-art methods.
- Score: 16.347927939872488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monocular visual-inertial odometry (VIO) is a critical problem in robotics
and autonomous driving. Traditional methods solve this problem based on
filtering or optimization. While fully interpretable, they rely on manual
intervention and empirical parameter tuning. On the other hand, learning-based
approaches allow for end-to-end training but require large amounts of training
data to learn millions of parameters. Moreover, such non-interpretable, heavy
models hinder generalization. In this paper, we propose a fully
differentiable, interpretable, and lightweight monocular VIO model that
contains only 4 trainable parameters. Specifically, we first adopt an Unscented
Kalman Filter as a differentiable layer to predict pitch and roll, where the
noise covariance matrices are learned to filter the noise out of the raw IMU
data. Second, the refined pitch and roll are used to retrieve a
gravity-aligned BEV image of each frame using differentiable camera projection.
Finally, a differentiable pose estimator is utilized to estimate the remaining
4 DoF poses between the BEV frames. Our method allows for learning the
covariance matrices end-to-end supervised by the pose estimation loss,
demonstrating superior performance to empirical baselines. Experimental results
on synthetic and real-world datasets demonstrate that our simple approach is
competitive with state-of-the-art methods and generalizes well on unseen
scenes.
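The paper's central trick, learning a filter's noise covariances end-to-end through a downstream estimation loss, can be illustrated with a toy sketch (not the authors' implementation): a scalar Kalman filter whose two log-variance parameters are tuned by finite-difference gradient descent on the error of its state estimate. The 1-D model, the signal, and all parameter choices below are illustrative assumptions.

```python
import numpy as np

def kalman_filter(zs, q, r):
    """Scalar Kalman filter with identity dynamics.
    q: process noise variance, r: measurement noise variance (both learnable)."""
    x, p = 0.0, 1.0
    est = []
    for z in zs:
        p = p + q                      # predict step inflates uncertainty
        k = p / (p + r)                # Kalman gain
        x = x + k * (z - x)            # correct with measurement z
        p = (1.0 - k) * p
        est.append(x)
    return np.array(est)

def loss(params, zs, truth):
    q, r = np.exp(params)              # log-parameterisation keeps variances positive
    return np.mean((kalman_filter(zs, q, r) - truth) ** 2)

rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0, 4 * np.pi, 200))     # ground-truth attitude-like signal
zs = truth + rng.normal(0.0, 0.5, truth.shape)     # noisy IMU-like measurements

params = np.zeros(2)                   # trainable parameters: log q, log r
eps, lr = 1e-4, 0.5
for _ in range(300):                   # finite-difference gradient descent on the loss
    g = np.zeros(2)
    for i in range(2):
        d = np.zeros(2); d[i] = eps
        g[i] = (loss(params + d, zs, truth) - loss(params - d, zs, truth)) / (2 * eps)
    params -= lr * g

print("initial loss:", loss(np.zeros(2), zs, truth))
print("learned loss:", loss(params, zs, truth))
```

The learned variances trade off smoothing against responsiveness, which is exactly the tuning that traditional filters leave to manual intervention; in the paper the supervision signal is the pose estimation loss rather than this synthetic state error.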
Related papers
- 4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration [31.111439909825627]
Existing methods typically model the dataset's action distribution using simple observations as inputs. We propose 4D-VLA, a novel approach that effectively integrates 4D information into the input to address these sources of chaos. Our model consistently outperforms existing methods, demonstrating stronger spatial understanding and adaptability.
arXiv Detail & Related papers (2025-06-27T14:09:29Z)
- Reasoning and Learning a Perceptual Metric for Self-Training of Reflective Objects in Bin-Picking with a Low-cost Camera [10.976379239028455]
Bin-picking of metal objects using low-cost RGB-D cameras often suffers from sparse depth information and reflective surface textures.
We propose a two-stage framework consisting of a metric learning stage and a self-training stage.
Our approach outperforms several state-of-the-art methods on both the ROBI dataset and our newly introduced Self-ROBI dataset.
arXiv Detail & Related papers (2025-03-26T04:03:51Z)
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
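The two-point zeroth-order gradient estimator that underlies such gradient-free methods can be sketched as follows; the shifted quadratic test problem and all step-size choices are illustrative assumptions, not details from the paper.

```python
import numpy as np

def zero_order_sgd(f, x0, steps=2000, lr=0.05, mu=1e-3, seed=0):
    """Gradient-free descent: approximate the directional derivative along a
    random unit direction u via g = (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u,
    so only function evaluations (never analytic gradients) are used."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)                       # random search direction
        g = (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
        x -= lr * g                                  # SGD step on the estimate
    return x

# Convex test problem: shifted quadratic with known optimum
opt = np.array([1.0, -2.0, 3.0])
f = lambda x: np.sum((x - opt) ** 2)
x_star = zero_order_sgd(f, np.zeros(3))
print(x_star)
```

For a smooth convex objective the estimator is an unbiased directional derivative up to an O(mu^2) smoothing bias, which is why such schemes admit convergence guarantees even when only noisy function values are available.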
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation [30.710296843150832]
Estimating relative camera poses between images has been a central problem in computer vision.
We show how to combine the best of both methods; our approach yields results that are both precise and robust.
A comprehensive analysis supports our design choices and demonstrates that our method adapts flexibly to various feature extractors and correspondence estimators.
arXiv Detail & Related papers (2024-03-05T18:59:51Z)
- NPEFF: Non-Negative Per-Example Fisher Factorization [52.44573961263344]
We introduce a novel interpretability method called NPEFF that is readily applicable to any end-to-end differentiable model.
We demonstrate that NPEFF has interpretable tunings through experiments on language and vision models.
arXiv Detail & Related papers (2023-10-07T02:02:45Z)
- Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving [62.053071723903834]
Multi-object state estimation is a fundamental problem for robotic applications.
We consider learning maximum-likelihood parameters using particle methods.
We apply our method to real data collected from autonomous vehicles.
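A minimal example of particle-based likelihood evaluation, the building block behind such maximum-likelihood parameter learning, assuming a toy linear-Gaussian random-walk model rather than the paper's multi-object setup:

```python
import numpy as np

def pf_loglik(ys, sigma_v, sigma_w, n=500, seed=0):
    """Bootstrap particle filter estimate of log p(y_{1:T} | sigma_v, sigma_w)
    for the model x_t = x_{t-1} + v_t, y_t = x_t + w_t (Gaussian noises)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)                                          # particle cloud
    ll = 0.0
    for y in ys:
        x = x + rng.normal(0.0, sigma_v, n)                  # propagate particles
        logw = -0.5 * ((y - x) / sigma_w) ** 2 - np.log(sigma_w * np.sqrt(2 * np.pi))
        m = logw.max()
        w = np.exp(logw - m)
        ll += m + np.log(w.mean())                           # log mean likelihood
        x = rng.choice(x, size=n, p=w / w.sum())             # multinomial resampling
    return ll

# Simulate data from the model with known noise scales
rng = np.random.default_rng(1)
T, sv, sw = 100, 0.3, 0.5
xs = np.cumsum(rng.normal(0.0, sv, T))
ys = xs + rng.normal(0.0, sw, T)

print("log-lik at true params :", pf_loglik(ys, sv, sw))
print("log-lik at wrong sigma_w:", pf_loglik(ys, sv, 5 * sw))
```

The estimated log-likelihood peaks near the true noise parameters, so stochastic-gradient or score-based schemes can climb it to recover maximum-likelihood estimates.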
arXiv Detail & Related papers (2022-12-14T01:21:05Z)
- Multi-View Object Pose Refinement With Differentiable Renderer [22.040014384283378]
This paper introduces a novel multi-view 6 DoF object pose refinement approach focusing on improving methods trained on synthetic data.
It is based on the DPOD detector, which produces dense 2D-3D correspondences between the model vertices and the image pixels in each frame.
We report excellent performance in comparison to the state-of-the-art methods trained on the synthetic and real data.
arXiv Detail & Related papers (2022-07-06T17:02:22Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Learn from Unpaired Data for Image Restoration: A Variational Bayes Approach [18.007258270845107]
We propose LUD-VAE, a deep generative method to learn the joint probability density function from data sampled from marginal distributions.
We apply our method to real-world image denoising and super-resolution tasks and train the models using the synthetic data generated by the LUD-VAE.
arXiv Detail & Related papers (2022-04-21T13:27:17Z)
- Adaptive Multi-View ICA: Estimation of noise levels for optimal inference [65.94843987207445]
Adaptive Multi-View ICA (AVICA) is a noisy ICA model where each view is a linear mixture of shared independent sources with additive noise on the sources.
On synthetic data, AVICA yields better source estimates than other group ICA methods thanks to its explicit MMSE estimator.
On real magnetoencephalography (MEG) data, we provide evidence that the decomposition is less sensitive to sampling noise and that the noise variance estimates are biologically plausible.
arXiv Detail & Related papers (2021-02-22T13:10:12Z)
- Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound which trades off between fully unlearning from erased data vs. not entirely forgetting the posterior belief.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z)
- Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data [15.27393561231633]
We propose a doubly robust two-stage semiparametric difference-in-differences estimator for estimating heterogeneous treatment effects.
The first stage allows a general set of machine learning methods to be used to estimate the propensity score.
In the second stage, we derive the rates of convergence for both the parametric parameter and the unknown function.
arXiv Detail & Related papers (2020-09-07T15:14:29Z)
- Variational Inference with Parameter Learning Applied to Vehicle Trajectory Estimation [20.41604350878599]
We present parameter learning in a Gaussian variational inference setting using only noisy measurements.
We demonstrate our technique using a 36km dataset consisting of a car using lidar to localize against a high-definition map.
arXiv Detail & Related papers (2020-03-21T19:48:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.