IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and
Earbuds
- URL: http://arxiv.org/abs/2304.12518v1
- Date: Tue, 25 Apr 2023 02:13:24 GMT
- Title: IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and
Earbuds
- Authors: Vimal Mollyn, Riku Arakawa, Mayank Goel, Chris Harrison, Karan Ahuja
- Abstract summary: We explore the feasibility of estimating body pose using IMUs already in devices that many users own.
Our pipeline receives whatever subset of IMU data is available, potentially from just a single device, and produces a best-guess pose.
- Score: 41.8359507387665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tracking body pose on-the-go could have powerful uses in fitness, mobile
gaming, context-aware virtual assistants, and rehabilitation. However, users
are unlikely to buy and wear special suits or sensor arrays to achieve this
end. Instead, in this work, we explore the feasibility of estimating body pose
using IMUs already in devices that many users own -- namely smartphones,
smartwatches, and earbuds. This approach has several challenges, including
noisy data from low-cost commodity IMUs, and the fact that the number of
instrumentation points on a user's body is both sparse and in flux. Our pipeline
receives whatever subset of IMU data is available, potentially from just a
single device, and produces a best-guess pose. To evaluate our model, we
created the IMUPoser Dataset, collected from 10 participants wearing or holding
off-the-shelf consumer devices and across a variety of activity contexts. We
provide a comprehensive evaluation of our system, benchmarking it on both our
own and existing IMU datasets.
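The abstract's key design point is a pipeline that accepts whatever subset of device IMUs happens to be available. A minimal sketch of one common way to do this, zero-filling the channels of missing devices so a single fixed-size model input works for any subset, is below. The device names, channel count, and layout are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical device order and channel layout for this sketch;
# the real IMUPoser pipeline may organize its input differently.
DEVICES = ["phone", "watch", "earbuds"]
CHANNELS_PER_DEVICE = 12  # e.g. flattened 3x3 orientation + 3-axis acceleration

def assemble_input(available: dict) -> np.ndarray:
    """Build a fixed-size input vector from whatever IMU data is present.

    Missing devices are zero-filled, so the same downstream model can
    consume any subset of devices, from a single one up to all three.
    """
    x = np.zeros(len(DEVICES) * CHANNELS_PER_DEVICE, dtype=np.float32)
    for i, name in enumerate(DEVICES):
        if name in available:
            sl = slice(i * CHANNELS_PER_DEVICE, (i + 1) * CHANNELS_PER_DEVICE)
            x[sl] = available[name]
    return x

# Only a watch reading is available; phone and earbud slots stay zero.
x = assemble_input({"watch": np.ones(CHANNELS_PER_DEVICE, dtype=np.float32)})
```

Zero-filling is the simplest masking scheme; a model trained with randomly dropped devices learns to produce a best-guess pose from whichever slots are populated.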
Related papers
- PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision [7.896850422430362]
Inertial Measurement Units (IMUs) embedded in personal devices have enabled significant applications in health and wellness.
While labeled IMU data is scarce, we can collect unlabeled or weakly labeled IMU data to model human motions.
For video or text modalities, the "pretrain and adapt" approach utilizes large volumes of unlabeled or weakly labeled data for pretraining, building a strong feature extractor, followed by adaptation to specific tasks using limited labeled data.
This approach has not been widely adopted in the IMU domain for two reasons: (1) pretraining methods are poorly understood in the context of IMU, and
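The "pretrain and adapt" recipe described above, a large frozen feature extractor followed by a lightweight adapter fit on scarce labels, can be sketched as a linear probe. The random-projection encoder and synthetic labels below are stand-ins for a real pretrained IMU encoder and dataset; none of this reflects PRIMUS's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained IMU encoder (frozen): a fixed projection
# from raw 128-dim IMU windows to 64-dim feature vectors.
W_enc = rng.standard_normal((64, 128)).astype(np.float32)

def encode(x: np.ndarray) -> np.ndarray:
    return np.tanh(x @ W_enc.T)

# Adaptation step: fit only a linear head on the few labeled windows,
# leaving the encoder untouched (the simplest form of adaptation).
X = rng.standard_normal((200, 128)).astype(np.float32)  # 200 labeled windows
y = rng.integers(0, 5, size=200)                        # 5 activity classes
feats = encode(X)
Y = np.eye(5, dtype=np.float32)[y]                      # one-hot targets
W_head, *_ = np.linalg.lstsq(feats, Y, rcond=None)      # least-squares fit
preds = np.argmax(feats @ W_head, axis=1)
```

In practice the head would be trained with a classification loss and the encoder pretrained with self-supervision on unlabeled IMU streams, but the division of labor, heavy frozen encoder plus tiny task-specific head, is the same.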
arXiv Detail & Related papers (2024-11-22T18:46:30Z)
- Suite-IN: Aggregating Motion Features from Apple Suite for Robust Inertial Navigation [10.634236058278722]
Motion data captured by sensors on different body parts contains both local and global motion information.
We propose a multi-device deep learning framework named Suite-IN, aggregating motion data from Apple Suite for inertial navigation.
arXiv Detail & Related papers (2024-11-12T14:23:52Z)
- EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs [17.864281586189392]
Egocentric human pose estimation (HPE) using wearable sensors is essential for VR/AR applications.
Most methods rely solely on either egocentric-view images or sparse Inertial Measurement Unit (IMU) signals.
We propose EMHI, a multimodal Egocentric human Motion dataset with Head-Mounted Display (HMD) and body-worn IMUs.
arXiv Detail & Related papers (2024-08-30T10:12:13Z)
- Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition [24.217068565936117]
We present a novel method for action recognition that integrates motion data from body-worn IMUs with egocentric video.
To model the complex relation of multiple IMU devices placed across the body, we exploit the collaborative dynamics in multiple IMU devices.
Experiments show our method can achieve state-of-the-art performance on multiple public datasets.
arXiv Detail & Related papers (2024-07-09T07:53:16Z)
- AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents [50.39555842254652]
We introduce the Android Multi-annotation EXpo (AMEX) to advance research on AI agents in mobile scenarios.
AMEX comprises over 104K high-resolution screenshots from 110 popular mobile applications, which are annotated at multiple levels.
AMEX includes three levels of annotations: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions.
arXiv Detail & Related papers (2024-07-03T17:59:58Z)
- MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases [81.70591346986582]
We introduce MobileAIBench, a benchmarking framework for evaluating Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices.
MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices.
arXiv Detail & Related papers (2024-06-12T22:58:12Z)
- 3D Human Pose Perception from Egocentric Stereo Videos [67.9563319914377]
We propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation.
Our method is able to accurately estimate human poses even in challenging scenarios, such as crouching and sitting.
We will release UnrealEgo2, UnrealEgo-RW, and trained models on our project page.
arXiv Detail & Related papers (2023-12-30T21:21:54Z)
- SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data [1.494051815405093]
We introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from sparse data.
Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses.
We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices.
arXiv Detail & Related papers (2023-11-03T18:48:01Z)
- Transformer Inertial Poser: Attention-based Real-time Human Motion Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z)
- SensiX: A Platform for Collaborative Machine Learning on the Edge [69.1412199244903]
We present SensiX, a personal edge platform that stays between sensor data and sensing models.
We demonstrate its efficacy in developing motion and audio-based multi-device sensing systems.
Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3mW power overhead.
arXiv Detail & Related papers (2020-12-04T23:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.