IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and
Earbuds
- URL: http://arxiv.org/abs/2304.12518v1
- Date: Tue, 25 Apr 2023 02:13:24 GMT
- Title: IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and
Earbuds
- Authors: Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, Karan Ahuja
- Abstract summary: We explore the feasibility of estimating body pose using IMUs already in devices that many users own.
Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose.
- Score: 41.8359507387665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tracking body pose on-the-go could have powerful uses in fitness, mobile
gaming, context-aware virtual assistants, and rehabilitation. However, users
are unlikely to buy and wear special suits or sensor arrays to achieve this
end. Instead, in this work, we explore the feasibility of estimating body pose
using IMUs already in devices that many users own -- namely smartphones,
smartwatches, and earbuds. This approach has several challenges, including
noisy data from low-cost commodity IMUs, and the fact that the number of
instrumentation points on a user's body is both sparse and in flux. Our pipeline
receives whatever subset of IMU data is available, potentially from just a
single device, and produces a best-guess pose. To evaluate our model, we
created the IMUPoser Dataset, collected from 10 participants wearing or holding
off-the-shelf consumer devices and across a variety of activity contexts. We
provide a comprehensive evaluation of our system, benchmarking it on both our
own and existing IMU datasets.
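The abstract's key design point is a pipeline that accepts whatever subset of device IMUs happens to be available. A minimal sketch of one common way to do this, zero-filling the channels of missing devices so a single fixed-size model input works for any subset, is below. The device names, channel count, and layout are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical device order and channel layout for this sketch;
# the real IMUPoser pipeline may organize its input differently.
DEVICES = ["phone", "watch", "earbuds"]
CHANNELS_PER_DEVICE = 12  # e.g. flattened 3x3 orientation + 3-axis acceleration

def assemble_input(available: dict) -> np.ndarray:
    """Build a fixed-size input vector from whatever IMU data is present.

    Missing devices are zero-filled, so the same downstream model can
    consume any subset of devices, from a single one up to all three.
    """
    x = np.zeros(len(DEVICES) * CHANNELS_PER_DEVICE, dtype=np.float32)
    for i, name in enumerate(DEVICES):
        if name in available:
            sl = slice(i * CHANNELS_PER_DEVICE, (i + 1) * CHANNELS_PER_DEVICE)
            x[sl] = available[name]
    return x

# Only a watch reading is available; phone and earbud slots stay zero.
x = assemble_input({"watch": np.ones(CHANNELS_PER_DEVICE, dtype=np.float32)})
```

Zero-filling is the simplest masking scheme; a model trained with randomly dropped devices learns to produce a best-guess pose from whichever slots are populated.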
Related papers
- PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision [7.896850422430362]
Inertial Measurement Units (IMUs) embedded in personal devices have enabled significant applications in health and wellness.
While labeled IMU data is scarce, we can collect unlabeled or weakly labeled IMU data to model human motions.
For video or text modalities, the "pretrain and adapt" approach utilizes large volumes of unlabeled or weakly labeled data for pretraining, building a strong feature extractor, followed by adaptation to specific tasks using limited labeled data.
This approach has not been widely adopted in the IMU domain for two reasons: (1) pretraining methods are poorly understood in the context of IMU, and
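The "pretrain and adapt" recipe described above, a large frozen feature extractor followed by a lightweight adapter fit on scarce labels, can be sketched as a linear probe. The random-projection encoder and synthetic labels below are stand-ins for a real pretrained IMU encoder and dataset; none of this reflects PRIMUS's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained IMU encoder (frozen): a fixed projection
# from raw 128-dim IMU windows to 64-dim feature vectors.
W_enc = rng.standard_normal((64, 128)).astype(np.float32)

def encode(x: np.ndarray) -> np.ndarray:
    return np.tanh(x @ W_enc.T)

# Adaptation step: fit only a linear head on the few labeled windows,
# leaving the encoder untouched (the simplest form of adaptation).
X = rng.standard_normal((200, 128)).astype(np.float32)  # 200 labeled windows
y = rng.integers(0, 5, size=200)                        # 5 activity classes
feats = encode(X)
Y = np.eye(5, dtype=np.float32)[y]                      # one-hot targets
W_head, *_ = np.linalg.lstsq(feats, Y, rcond=None)      # least-squares fit
preds = np.argmax(feats @ W_head, axis=1)
```

In practice the head would be trained with a classification loss and the encoder pretrained with self-supervision on unlabeled IMU streams, but the division of labor, heavy frozen encoder plus tiny task-specific head, is the same.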
arXiv Detail & Related papers (2024-11-22T18:46:30Z)
- Suite-IN: Aggregating Motion Features from Apple Suite for Robust Inertial Navigation [10.634236058278722]
Motion data captured by sensors on different body parts contains both local and global motion information.
We propose a multi-device deep learning framework named Suite-IN, aggregating motion data from Apple Suite for inertial navigation.
arXiv Detail & Related papers (2024-11-12T14:23:52Z)
- EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs [17.864281586189392]
Egocentric human pose estimation (HPE) using wearable sensors is essential for VR/AR applications.
Most methods rely solely on either egocentric-view images or sparse Inertial Measurement Unit (IMU) signals.
We propose EMHI, a multimodal Egocentric human Motion dataset with Head-Mounted Display (HMD) and body-worn IMUs.
arXiv Detail & Related papers (2024-08-30T10:12:13Z)
- Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition [24.217068565936117]
We present a novel method for action recognition that integrates motion data from body-worn IMUs with egocentric video.
To model the complex relation of multiple IMU devices placed across the body, we exploit the collaborative dynamics in multiple IMU devices.
Experiments show our method can achieve state-of-the-art performance on multiple public datasets.
arXiv Detail & Related papers (2024-07-09T07:53:16Z)
- AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents [50.39555842254652]
We introduce the Android Multi-annotation EXpo (AMEX) to advance research on AI agents in mobile scenarios.
AMEX comprises over 104K high-resolution screenshots from 110 popular mobile applications, which are annotated at multiple levels.
AMEX includes three levels of annotations: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions.
arXiv Detail & Related papers (2024-07-03T17:59:58Z)
- MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases [81.70591346986582]
We introduce MobileAIBench, a benchmarking framework for evaluating Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices.
MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices.
arXiv Detail & Related papers (2024-06-12T22:58:12Z)
- 3D Human Pose Perception from Egocentric Stereo Videos [67.9563319914377]
We propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation.
Our method is able to accurately estimate human poses even in challenging scenarios, such as crouching and sitting.
We will release UnrealEgo2, UnrealEgo-RW, and trained models on our project page.
arXiv Detail & Related papers (2023-12-30T21:21:54Z)
- SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data [1.494051815405093]
We introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from sparse data.
Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses.
We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices.
arXiv Detail & Related papers (2023-11-03T18:48:01Z)
- Transformer Inertial Poser: Attention-based Real-time Human Motion Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z)
- SensiX: A Platform for Collaborative Machine Learning on the Edge [69.1412199244903]
We present SensiX, a personal edge platform that stays between sensor data and sensing models.
We demonstrate its efficacy in developing motion and audio-based multi-device sensing systems.
Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3mW power overhead.
arXiv Detail & Related papers (2020-12-04T23:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.