MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices
- URL: http://arxiv.org/abs/2504.12492v1
- Date: Wed, 16 Apr 2025 21:19:47 GMT
- Title: MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices
- Authors: Vasco Xu, Chenfeng Gao, Henry Hoffmann, Karan Ahuja
- Abstract summary: We introduce MobilePoser, a real-time system for full-body pose and global translation estimation. MobilePoser employs a multi-stage deep neural network for kinematic pose estimation followed by a physics-based motion optimizer, achieving state-of-the-art accuracy while remaining lightweight. We conclude with a series of applications illustrating the unique potential of MobilePoser across a variety of fields, such as health and wellness, gaming, and indoor navigation.
- Score: 9.50274333425178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been a continued trend toward minimizing instrumentation for full-body motion capture, going from specialized rooms and equipment, to arrays of worn sensors, and, more recently, to sparse inertial pose capture methods. However, as these techniques migrate toward lower-fidelity IMUs on ubiquitous commodity devices, like phones, watches, and earbuds, challenges arise, including compromised online performance, temporal consistency, and loss of global translation due to sensor noise and drift. Addressing these challenges, we introduce MobilePoser, a real-time system for full-body pose and global translation estimation using any available subset of IMUs already present in these consumer devices. MobilePoser employs a multi-stage deep neural network for kinematic pose estimation followed by a physics-based motion optimizer, achieving state-of-the-art accuracy while remaining lightweight. We conclude with a series of demonstrative applications illustrating the unique potential of MobilePoser across a variety of fields, such as health and wellness, gaming, and indoor navigation.
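The abstract's key input-handling idea, consuming "any available subset of IMUs" with one model, can be illustrated with a minimal sketch. The device slots, per-IMU feature layout, and zero-filling strategy below are illustrative assumptions, not details from the paper:

```python
import numpy as np

# Hypothetical device slots; MobilePoser accepts whatever subset of
# phone/watch/earbud IMUs is present (slot names here are invented).
DEVICE_SLOTS = ["left_pocket", "right_pocket", "left_wrist", "right_wrist", "head"]
FEATS_PER_IMU = 12  # e.g. 3 accel + 9 flattened rotation matrix (assumed layout)

def assemble_input(readings: dict) -> np.ndarray:
    """Zero-fill missing devices so a single network can consume any subset."""
    x = np.zeros(len(DEVICE_SLOTS) * FEATS_PER_IMU)
    for i, slot in enumerate(DEVICE_SLOTS):
        if slot in readings:
            x[i * FEATS_PER_IMU:(i + 1) * FEATS_PER_IMU] = readings[slot]
    return x

# Only a watch and a phone are worn; the other slots stay zeroed.
readings = {
    "left_wrist": np.ones(FEATS_PER_IMU),
    "right_pocket": np.full(FEATS_PER_IMU, 0.5),
}
x = assemble_input(readings)
print(x.shape)  # (60,)
```

The fixed-size, zero-padded vector is one common way to let a single network handle a variable sensor subset; training then randomly drops devices so the model learns to cope with missing slots.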
Related papers
- Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input [62.51283548975632]
This work focuses on tracking and understanding human motion using consumer wearable devices, such as VR/AR headsets, smart glasses, cellphones, and smartwatches. We present Ego4o (o for omni), a new framework for simultaneous human motion capture and understanding from multi-modal egocentric inputs.
arXiv Detail & Related papers (2025-04-11T11:18:57Z) - Suite-IN: Aggregating Motion Features from Apple Suite for Robust Inertial Navigation [10.634236058278722]
Motion data captured by sensors on different body parts contains both local and global motion information.
We propose a multi-device deep learning framework named Suite-IN, aggregating motion data from Apple Suite for inertial navigation.
arXiv Detail & Related papers (2024-11-12T14:23:52Z) - SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data [1.494051815405093]
We introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from sparse data.
Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses.
We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices.
arXiv Detail & Related papers (2023-11-03T18:48:01Z) - Utilizing Task-Generic Motion Prior to Recover Full-Body Motion from Very Sparse Signals [3.8079353598215757]
We propose a method that utilizes information from a neural motion prior to improve the accuracy of reconstructed user's motions.
This is based on the premise that the ultimate goal of pose reconstruction is to reconstruct the motion, which is a series of poses.
arXiv Detail & Related papers (2023-08-30T08:21:52Z) - IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds [41.8359507387665]
We explore the feasibility of estimating body pose using IMUs already in devices that many users own.
Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose.
arXiv Detail & Related papers (2023-04-25T02:13:24Z) - HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians.
Our method efficiently makes use of these complementary signals, in a semi-supervised fashion and outperforms existing methods with a large margin.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
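The "pixel-aligned multi-modal features" step can be sketched as projecting each LiDAR point into the image and gathering the image feature at the projected pixel. The pinhole projection and nearest-neighbour sampling below are simplifying assumptions for illustration, not the paper's exact pipeline:

```python
import numpy as np

def pixel_align(points_3d, feat_map, K):
    """Project 3D points with intrinsics K, then gather the image feature
    at each projected pixel (nearest-neighbour sampling)."""
    uvw = (K @ points_3d.T).T                        # (N, 3) homogeneous pixels
    uv = (uvw[:, :2] / uvw[:, 2:3]).round().astype(int)
    H, W, C = feat_map.shape
    u = np.clip(uv[:, 0], 0, W - 1)                  # clamp to image bounds
    v = np.clip(uv[:, 1], 0, H - 1)
    img_feats = feat_map[v, u]                       # (N, C) sampled features
    return np.concatenate([points_3d, img_feats], axis=1)  # (N, 3 + C)

K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])  # toy intrinsics
feat_map = np.arange(64 * 64 * 4, dtype=float).reshape(64, 64, 4)
pts = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0]])
out = pixel_align(pts, feat_map, K)
print(out.shape)  # (2, 7)
```

The fused point-plus-image-feature tokens are what a Transformer refinement stage would then attend over.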
arXiv Detail & Related papers (2022-12-15T11:15:14Z) - A Flexible-Frame-Rate Vision-Aided Inertial Object Tracking System for Mobile Devices [3.4836209951879957]
We propose a flexible-frame-rate object pose estimation and tracking system for mobile devices.
Inertial measurement unit (IMU) pose propagation is performed on the client side for high speed tracking, and RGB image-based 3D pose estimation is performed on the server side.
Our system supports flexible frame rates up to 120 FPS and guarantees high precision and real-time tracking on low-end devices.
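Client-side IMU pose propagation between server corrections can be sketched as simple dead reckoning. The constant-acceleration model and the 120 Hz update rate below are illustrative assumptions, not the paper's actual propagation scheme:

```python
import numpy as np

def propagate_pose(p, v, accel_world, dt):
    """One dead-reckoning step: integrate world-frame acceleration into
    velocity and position (constant-acceleration model)."""
    p_new = p + v * dt + 0.5 * accel_world * dt**2
    v_new = v + accel_world * dt
    return p_new, v_new

p = np.zeros(3)
v = np.array([1.0, 0.0, 0.0])        # initial velocity from last server fix
a = np.array([0.0, 0.0, -9.81])      # gravity-only acceleration, for the toy case
for _ in range(12):                  # 0.1 s of client updates at 120 FPS
    p, v = propagate_pose(p, v, a, 1 / 120)
print(p)
```

In a real client/server split, each server-side RGB pose estimate would reset `p` and `v`, bounding the drift that pure IMU integration accumulates.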
arXiv Detail & Related papers (2022-10-22T15:26:50Z) - QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z) - Transformer Inertial Poser: Attention-based Real-time Human Motion
Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
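The attention idea behind such methods can be sketched as self-attention over per-IMU feature tokens. The single head, identity projections, and 8-dim features below are generic simplifications, not the paper's architecture:

```python
import numpy as np

def self_attention(tokens):
    """Single-head self-attention over per-sensor tokens; for brevity the
    query/key/value projections are the identity."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)          # (N, N) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ tokens                          # convex mix of tokens

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))   # six IMU tokens (one per sensor), 8-dim each
y = self_attention(x)
print(y.shape)  # (6, 8)
```

Each output token is a softmax-weighted mixture of all sensor tokens, which is how attention lets every IMU condition on the others before pose decoding.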
arXiv Detail & Related papers (2022-03-29T16:24:52Z) - Human POSEitioning System (HPS): 3D Human Pose Estimation and
Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
arXiv Detail & Related papers (2021-03-31T17:58:31Z) - SensiX: A Platform for Collaborative Machine Learning on the Edge [69.1412199244903]
We present SensiX, a personal edge platform that stays between sensor data and sensing models.
We demonstrate its efficacy in developing motion and audio-based multi-device sensing systems.
Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3mW power overhead.
arXiv Detail & Related papers (2020-12-04T23:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.