IONext: Unlocking the Next Era of Inertial Odometry
- URL: http://arxiv.org/abs/2507.17089v1
- Date: Wed, 23 Jul 2025 00:09:36 GMT
- Title: IONext: Unlocking the Next Era of Inertial Odometry
- Authors: Shanshan Zhang, Siyue Wang, Tianshui Wen, Qi Zhang, Ziheng Zhou, Lingxiang Zheng, Yu Yang
- Abstract summary: We present a new CNN-based inertial odometry backbone, named Next Era of Inertial Odometry (IONext). IONext consistently outperforms state-of-the-art (SOTA) Transformer- and CNN-based methods. For instance, on the RNIN dataset, IONext reduces the average ATE by 10% and the average RTE by 12% compared to the representative model iMOT.
- Score: 24.137981640306034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have increasingly adopted Transformer-based models for inertial odometry. While Transformers excel at modeling long-range dependencies, their limited sensitivity to local, fine-grained motion variations and lack of inherent inductive biases often hinder localization accuracy and generalization. Recent studies have shown that incorporating large-kernel convolutions and Transformer-inspired architectural designs into CNNs can effectively expand the receptive field, thereby improving global motion perception. Motivated by these insights, we propose a novel CNN-based module called the Dual-wing Adaptive Dynamic Mixer (DADM), which adaptively captures both global motion patterns and local, fine-grained motion features from dynamic inputs. This module dynamically generates selective weights based on the input, enabling efficient multi-scale feature aggregation. To further improve temporal modeling, we introduce the Spatio-Temporal Gating Unit (STGU), which selectively extracts representative and task-relevant motion features in the temporal domain. This unit addresses the limitations of temporal modeling observed in existing CNN approaches. Built upon DADM and STGU, we present a new CNN-based inertial odometry backbone, named Next Era of Inertial Odometry (IONext). Extensive experiments on six public datasets demonstrate that IONext consistently outperforms state-of-the-art (SOTA) Transformer- and CNN-based methods. For instance, on the RNIN dataset, IONext reduces the average ATE by 10% and the average RTE by 12% compared to the representative model iMOT.
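The abstract's idea of generating selective weights from the input to blend local and global features can be illustrated with a toy example. The sketch below is not the authors' DADM: it assumes uniform moving-average kernels in place of learned convolutions, a single hand-picked input descriptor, and a hypothetical `gate` parameter standing in for learned gating weights. It shows only the general pattern of input-dependent, softmax-normalized mixing of a small-kernel and a large-kernel branch.

```python
import math


def moving_average(signal, kernel_size):
    """Same-length 1D convolution with a uniform kernel and zero padding."""
    half = kernel_size // 2
    out = []
    for i in range(len(signal)):
        window = [signal[j] for j in range(i - half, i + half + 1)
                  if 0 <= j < len(signal)]
        out.append(sum(window) / kernel_size)
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_mix(signal, small_kernel=3, large_kernel=9, gate=(1.0, -1.0)):
    """Blend a local and a global branch using input-dependent weights.

    `gate` is a hypothetical stand-in for learned parameters: with the
    default values, a rapidly varying input favours the small-kernel branch.
    """
    local_branch = moving_average(signal, small_kernel)
    global_branch = moving_average(signal, large_kernel)

    # Input descriptor: mean absolute sample-to-sample change.
    steps = max(len(signal) - 1, 1)
    variation = sum(abs(b - a) for a, b in zip(signal, signal[1:])) / steps

    # Selective weights computed from the input itself, normalized to sum to 1.
    weights = softmax([g * variation for g in gate])

    mixed = [weights[0] * lo + weights[1] * gl
             for lo, gl in zip(local_branch, global_branch)]
    return mixed, weights
```

Calling `dynamic_mix` on a constant signal yields equal branch weights, while an oscillating signal shifts weight toward the small-kernel branch. In a trained network the descriptor and gating would be learned rather than fixed as they are here.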
Related papers
- Langevin Flows for Modeling Neural Latent Dynamics [81.81271685018284]
We introduce LangevinFlow, a sequential Variational Auto-Encoder where the time evolution of latent variables is governed by the underdamped Langevin equation. Our approach incorporates physical priors -- such as inertia, damping, a learned potential function, and forces -- to represent both autonomous and non-autonomous processes in neural systems. Our method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor.
arXiv Detail & Related papers (2025-07-15T17:57:48Z)
- T-SHRED: Symbolic Regression for Regularization and Model Discovery with Transformer Shallow Recurrent Decoders [2.8820361301109365]
SHallow REcurrent Decoders (SHRED) are effective for system identification and forecasting from sparse sensor measurements. We improve SHRED by leveraging transformers (T-SHRED) for the temporal encoding, which improves performance on next-step state prediction. Symbolic regression improves model interpretability by learning and regularizing the dynamics of the latent space during training.
arXiv Detail & Related papers (2025-06-18T21:14:38Z)
- Multi-Head Self-Attending Neural Tucker Factorization [5.734615417239977]
We introduce a neural network-based tensor factorization approach tailored for learning representations of high-dimensional and incomplete (HDI) tensors. The proposed MSNTucF model demonstrates superior performance compared to state-of-the-art benchmark models in estimating missing observations.
arXiv Detail & Related papers (2025-01-16T13:04:15Z)
- A Multi-Layer CNN-GRUSKIP model based on transformer for spatial TEMPORAL traffic flow prediction [0.06597195879147556]
Traffic flow prediction remains a cornerstone for intelligent transportation systems (ITS). The CNN-GRUSKIP model emerges as a pioneering approach. The model consistently outperformed established models such as ARIMA, Graph Wave Net, HA, LSTM, STGCN, and APT. With its potent predictive prowess and adaptive architecture, the CNN-GRUSKIP model stands to redefine ITS applications.
arXiv Detail & Related papers (2025-01-09T21:30:02Z)
- Equivariant Graph Neural Operator for Modeling 3D Dynamics [148.98826858078556]
We propose Equivariant Graph Neural Operator (EGNO) to directly model dynamics as trajectories instead of just next-step prediction.
EGNO explicitly learns the temporal evolution of 3D dynamics where we formulate the dynamics as a function over time and learn neural operators to approximate it.
Comprehensive experiments in multiple domains, including particle simulations, human motion capture, and molecular dynamics, demonstrate the significantly superior performance of EGNO against existing methods.
arXiv Detail & Related papers (2024-01-19T21:50:32Z)
- Event-based Shape from Polarization with Spiking Neural Networks [5.200503222390179]
We introduce the Single-Timestep and Multi-Timestep Spiking UNets for effective and efficient surface normal estimation.
Our work contributes to the advancement of SNNs in event-based sensing.
arXiv Detail & Related papers (2023-12-26T14:43:26Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
- Generative Modeling with Phase Stochastic Bridges [49.4474628881673]
Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs.
We introduce a novel generative modeling framework grounded in phase space dynamics.
Our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.
arXiv Detail & Related papers (2023-10-11T18:38:28Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of MTS, which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
- REMuS-GNN: A Rotation-Equivariant Model for Simulating Continuum Dynamics [0.0]
We introduce REMuS-GNN, a rotation-equivariant multi-scale model for simulating continuum dynamical systems.
We demonstrate and evaluate this method on the incompressible flow around elliptical cylinders.
arXiv Detail & Related papers (2022-05-05T16:20:37Z)
- A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal quality framework for wireless access latency of connected vehicles.
LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure.
In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE).
arXiv Detail & Related papers (2020-03-16T03:43:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.