Transformer Inertial Poser: Attention-based Real-time Human Motion
Reconstruction from Sparse IMUs
- URL: http://arxiv.org/abs/2203.15720v1
- Date: Tue, 29 Mar 2022 16:24:52 GMT
- Authors: Yifeng Jiang, Yuting Ye, Deepak Gopinath, Jungdam Won, Alexander W.
Winkler, C. Karen Liu
- Abstract summary: We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Real-time human motion reconstruction from a sparse set of wearable IMUs
provides a non-intrusive and economical approach to motion capture. Without the
ability to acquire absolute position information using IMUs, many prior works
took data-driven approaches that utilize large human motion datasets to tackle
the under-determined nature of the problem. Still, challenges such as temporal
consistency, global translation estimation, and diverse coverage of motion or
terrain types remain. Inspired by recent success of Transformer models in
sequence modeling, we propose an attention-based deep learning method to
reconstruct full-body motion from six IMU sensors in real-time. Together with a
physics-based learning objective to predict "stationary body points", our
method achieves new state-of-the-art results both quantitatively and
qualitatively, while being simple to implement and smaller in size. We evaluate
our method extensively on synthesized and real IMU data, and with real-time
live demos.
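The abstract's core idea, attending over a temporal window of six-IMU readings to regress full-body pose, can be sketched in a few lines. The following NumPy sketch shows single-head self-attention over a sliding window of IMU frames; all shapes (a 40-frame window, 12 values per IMU, 24 SMPL-style joints) and the single-head layout are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

n_frames, n_imus, imu_dim = 40, 6, 12   # per IMU: orientation (9) + acceleration (3)
d_model = 32

# Flatten each frame's six IMU readings and embed them.
frames = rng.standard_normal((n_frames, n_imus * imu_dim))
W_embed = rng.standard_normal((n_imus * imu_dim, d_model)) / np.sqrt(n_imus * imu_dim)
x = frames @ W_embed                     # (n_frames, d_model)

# Single-head scaled dot-product self-attention over the window.
W_q, W_k, W_v = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                 for _ in range(3))
q, k, v = x @ W_q, x @ W_k, x @ W_v
scores = q @ k.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
attended = weights @ v                   # (n_frames, d_model)

# Regress the latest frame's features to per-joint rotations
# (24 joints x 3 axis-angle values, a common SMPL-style layout).
W_out = rng.standard_normal((d_model, 24 * 3)) / np.sqrt(d_model)
pose = attended[-1] @ W_out
print(pose.shape)                        # (72,)
```

In a trained model the random projections above would be learned weights, and the output head would be supervised with the paper's additional "stationary body point" objective; this sketch only illustrates the attention-over-frames data flow.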
Related papers
- Stereo-Inertial Poser: Towards Metric-Accurate Shape-Aware Motion Capture Using Sparse IMUs and a Single Stereo Camera [54.967647497048205]
We present Stereo-Inertial Poser, a real-time motion capture system that estimates metric-accurate and shape-aware 3D human motion. We replace the monocular RGB with stereo vision, enabling direct 3D keypoint extraction and body shape parameter estimation. Our method produces drift-free global translation over long recording times and reduces foot-skating effects.
arXiv Detail & Related papers (2026-03-02T17:46:38Z) - D-REX: Differentiable Real-to-Sim-to-Real Engine for Learning Dexterous Grasping [66.22412592525369]
We introduce a real-to-sim-to-real engine that leverages Gaussian Splat representations to build a differentiable engine. We show that our engine achieves accurate and robust performance in mass identification across various object geometries and mass values. Those optimized mass values facilitate force-aware policy learning, achieving superior performance in object grasping.
arXiv Detail & Related papers (2026-03-01T15:32:04Z) - MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction [54.36564144414704]
MeshMimic is an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled "motion-terrain" interactions directly from video. By leveraging state-of-the-art 3D vision models, our framework precisely segments and reconstructs both human trajectories and the underlying 3D geometry of terrains and objects.
arXiv Detail & Related papers (2026-02-17T17:09:45Z) - ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning [59.64325421657381]
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks. We introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data. Results show substantial gains in task success, training efficiency, and robustness over strong baselines.
arXiv Detail & Related papers (2025-10-06T17:47:02Z) - BaroPoser: Real-time Human Motion Tracking from IMUs and Barometers in Everyday Devices [12.374794959250828]
We present BaroPoser, the first method that combines IMU and barometric data recorded by a smartphone and a smartwatch to estimate human pose and global translation in real time. By leveraging barometric readings, we estimate sensor height changes, which provide valuable cues for both improving the accuracy of human pose estimation and predicting global translation on non-flat terrain.
arXiv Detail & Related papers (2025-08-05T10:46:59Z) - SSSUMO: Real-Time Semi-Supervised Submovement Decomposition [0.6499759302108926]
Submovement analysis offers valuable insights into motor control. Existing methods struggle with reconstruction accuracy, computational cost, and validation. We address these challenges using a semi-supervised learning framework.
arXiv Detail & Related papers (2025-07-08T21:26:25Z) - Human Motion Capture from Loose and Sparse Inertial Sensors with Garment-aware Diffusion Models [25.20942802233326]
We present a new task of full-body human pose estimation using sparse, loosely attached IMU sensors. We developed transformer-based diffusion models to synthesize loose IMU data and estimate human poses based on this challenging loose IMU data.
arXiv Detail & Related papers (2025-06-18T09:16:36Z) - A Data-driven Crowd Simulation Framework Integrating Physics-informed Machine Learning with Navigation Potential Fields [15.429885272765363]
We propose a novel data-driven crowd simulation framework that integrates Physics-informed Machine Learning (PIML) with navigation potential fields.
Specifically, we design an innovative Physics-informed Spatio-temporal Graph Convolutional Network (PI-STGCN) as a data-driven module to predict pedestrian movement trends.
In our framework, navigation potential fields are dynamically computed and updated based on the movement trends predicted by the PI-STGCN.
arXiv Detail & Related papers (2024-10-21T15:56:17Z) - Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition [24.217068565936117]
We present a novel method for action recognition that integrates motion data from body-worn IMUs with egocentric video.
To model the complex relation of multiple IMU devices placed across the body, we exploit the collaborative dynamics in multiple IMU devices.
Experiments show our method can achieve state-of-the-art performance on multiple public datasets.
arXiv Detail & Related papers (2024-07-09T07:53:16Z) - Scaling Up Dynamic Human-Scene Interaction Modeling [58.032368564071895]
TRUMANS is the most comprehensive motion-captured HSI dataset currently available.
It intricately captures whole-body human motions and part-level object dynamics.
We devise a diffusion-based autoregressive model that efficiently generates HSI sequences of any length.
arXiv Detail & Related papers (2024-03-13T15:45:04Z) - LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free
Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We propose a huge human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z) - Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted
with Textual Semantics [4.9493039356268875]
Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique.
In this paper, we explore the spatial importance of multiple sensors, supervised by text that describes specific actions.
With textual supervision, our method not only differentiates between ambiguous actions such as sitting and standing but also produces more precise and natural motion.
arXiv Detail & Related papers (2023-12-27T04:21:45Z) - SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data [1.494051815405093]
We introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from sparse data.
Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses.
We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices.
arXiv Detail & Related papers (2023-11-03T18:48:01Z) - Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z) - Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular
Depth Estimation by Integrating IMU Motion Dynamics [74.1720528573331]
Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years.
We propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics.
We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2022-07-11T07:50:22Z) - Motion Prediction via Joint Dependency Modeling in Phase Space [40.54430409142653]
We introduce a novel convolutional neural model to leverage explicit prior knowledge of motion anatomy.
We then propose a global optimization module that learns the implicit relationships between individual joint features.
Our method is evaluated on large-scale 3D human motion benchmark datasets.
arXiv Detail & Related papers (2022-01-07T08:30:01Z) - Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.