Related papers: DefVINS: Visual-Inertial Odometry for Deformable Scenes

URL: http://arxiv.org/abs/2601.00702v1
Date: Fri, 02 Jan 2026 14:40:33 GMT
Title: DefVINS: Visual-Inertial Odometry for Deformable Scenes
Authors: Samuel Cerezo, Javier Civera
Abstract summary: Deformable scenes violate the rigidity assumptions underpinning visual-inertial odometry. We introduce DefVINS, a visual-inertial odometry framework that separates a rigid, IMU-anchored state from a non-rigid warp.
Score: 14.028399155214068
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deformable scenes violate the rigidity assumptions underpinning classical visual-inertial odometry (VIO), often leading to over-fitting to local non-rigid motion or severe drift when deformation dominates visual parallax. We introduce DefVINS, a visual-inertial odometry framework that explicitly separates a rigid, IMU-anchored state from a non-rigid warp represented by an embedded deformation graph. The system is initialized using a standard VIO procedure that fixes gravity, velocity, and IMU biases, after which non-rigid degrees of freedom are activated progressively as the estimation becomes well conditioned. An observability analysis is included to characterize how inertial measurements constrain the rigid motion and render otherwise unobservable modes identifiable in the presence of deformation. This analysis motivates the use of IMU anchoring and informs a conditioning-based activation strategy that prevents ill-posed updates under poor excitation. Ablation studies demonstrate the benefits of combining inertial constraints with observability-aware deformation activation, resulting in improved robustness under non-rigid environments.

Related papers

Same Answer, Different Representations: Hidden instability in VLMs [65.36933543377346]
We introduce a representation-aware and frequency-aware evaluation framework that measures internal embedding drift, spectral sensitivity, and structural smoothness. We apply this framework to modern Vision Language Models (VLMs) across the SEEDBench, MMMU, and POPE datasets.
arXiv Detail & Related papers (2026-02-06T12:24:26Z)
Adaptive Visual Autoregressive Acceleration via Dual-Linkage Entropy Analysis [50.48301331112126]
We propose NOVA, a training-free token reduction acceleration framework for Visual AutoRegressive modeling. NOVA adaptively determines the acceleration activation scale during inference by identifying the inflection point of scale entropy growth online. Experiments and analyses validate NOVA as a simple yet effective training-free acceleration framework.
arXiv Detail & Related papers (2026-02-01T17:29:42Z)
GO-OSC and VASH: Geometry-Aware Representation Learning for Early Degradation Detection in Oscillatory Systems [0.0]
We introduce GO-OSC, a geometry-aware representation learning framework for oscillatory time series. We show that under early phase-only degradation, energy-based statistics have zero first-order detection power, whereas geometric probes achieve strictly positive sensitivity. Our analysis characterizes when and why linear probing fails under non-identifiable representations and shows how canonicalization restores statistical detectability.
arXiv Detail & Related papers (2026-01-24T09:35:57Z)
Machine learning assisted state prediction of misspecified linear dynamical system via modal reduction [0.0]
Parametric models with fixed nominal parameters often omit critical physical effects due to simplifications in geometry, material behavior, damping, or boundary conditions. This work introduces a comprehensive framework for MFE estimation and correction in high-dimensional finite element based structural dynamical systems. To ensure computational tractability, the FE system is projected onto a reduced modal basis, and a mesh-invariant neural network maps modal states to discrepancy estimates.
arXiv Detail & Related papers (2026-01-08T10:14:27Z)
SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By deriving deterministic bounds on the matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards states, offering theoretical insights into the mechanics of collapse.
arXiv Detail & Related papers (2026-01-06T19:47:11Z)
PIS: A Generalized Physical Inversion Solver for Arbitrary Sparse Observations via Set-Conditioned Diffusion [1.7257650649008898]
We propose a set-conditioned diffusion framework enabling inversion from truly arbitrary observation sets. PIS employs a Set Transformer-based encoder to handle measurements of any number or geometry, and a cosine-annealed sparsity curriculum for exceptional robustness. PIS is evaluated on three challenging PDE inverse problems: Darcy flow, wavefield inversion (Helmholtz), and structural health monitoring (Hooke's Law).
arXiv Detail & Related papers (2025-12-14T06:28:55Z)
LyTimeT: Towards Robust and Interpretable State-Variable Discovery [7.505092370141079]
LyTimeT is a framework for interpretable variable extraction. It learns robust and stable latent representations of dynamical video systems. Our results demonstrate that combining temporal attention with stability constraints yields predictive models.
arXiv Detail & Related papers (2025-10-22T16:03:10Z)
Detection and Recovery of Adversarial Slow-Pose Drift in Offloaded Visual-Inertial Odometry [0.0]
The current trend of offloading VIO to edge servers can lead to a server-side threat surface. We present an unsupervised, label-free detection and recovery mechanism. We evaluate the approach in a realistic offloaded-VIO environment using the ILLIXR testbed.
arXiv Detail & Related papers (2025-09-08T18:31:40Z)
MASIV: Toward Material-Agnostic System Identification from Videos [76.36666848173141]
MASIV is a vision-based framework for material-agnostic system identification. It employs learnable neural models, inferring object dynamics without assuming a scene-specific material prior. It achieves state-of-the-art performance in geometric accuracy, rendering quality, and generalization ability.
arXiv Detail & Related papers (2025-08-01T23:23:45Z)
Solving Inverse Problems with FLAIR [68.87167940623318]
We present FLAIR, a training-free variational framework that leverages flow-based generative models as a prior for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z)
A Plug-and-Play Learning-based IMU Bias Factor for Robust Visual-Inertial Odometry [27.62788405443008]
We propose a novel plug-and-play module featuring the Inertial Prior Network (IPNet). IPNet infers an IMU bias prior by implicitly capturing the motion characteristics of specific platforms. In this work, we first infer the bias prior directly from the raw IMU data alone, using a sliding-window approach.
arXiv Detail & Related papers (2025-03-16T14:45:19Z)
DynaVINS++: Robust Visual-Inertial State Estimator in Dynamic Environments by Adaptive Truncated Least Squares and Stable State Recovery [11.37707868611451]
We propose a robust VINS framework called DynaVINS++. Our approach shows promising performance in dynamic environments, including scenes with abruptly dynamic objects.
arXiv Detail & Related papers (2024-10-20T12:13:45Z)
Extreme Miscalibration and the Illusion of Adversarial Robustness [66.29268991629085]
Adversarial Training is often used to increase model robustness. We show that this observed gain in robustness is an illusion of robustness (IOR). We urge the NLP community to incorporate test-time temperature scaling into their robustness evaluations.
arXiv Detail & Related papers (2024-02-27T13:49:12Z)
Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics [74.1720528573331]
Unsupervised monocular depth and ego-motion estimation have drawn extensive research attention in recent years. We propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics. We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2022-07-11T07:50:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.