PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture
- URL: http://arxiv.org/abs/2409.14101v1
- Date: Sat, 21 Sep 2024 10:51:16 GMT
- Title: PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture
- Authors: Zhuojun Li, Chun Yu, Chen Liang, Yuanchun Shi,
- Abstract summary: We propose PoseAugment, a novel pipeline incorporating VAE-based pose generation and physical optimization.
Given a pose sequence, the VAE module generates infinite poses with both high fidelity and diversity, while keeping the data distribution.
High-quality IMU data are then synthesized from the augmented poses for training motion capture models.
- Score: 40.765438433729344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The data scarcity problem is a crucial factor that hampers the model performance of IMU-based human motion capture. However, effective data augmentation for IMU-based motion capture is challenging, since it has to capture the physical relations and constraints of the human body, while maintaining the data distribution and quality. We propose PoseAugment, a novel pipeline incorporating VAE-based pose generation and physical optimization. Given a pose sequence, the VAE module generates infinite poses with both high fidelity and diversity, while keeping the data distribution. The physical module optimizes poses to satisfy physical constraints with minimal motion restrictions. High-quality IMU data are then synthesized from the augmented poses for training motion capture models. Experiments show that PoseAugment outperforms previous data augmentation and pose generation methods in terms of motion capture accuracy, revealing a strong potential of our method to alleviate the data collection burden for IMU-based motion capture and related tasks driven by human poses.
Related papers
- DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior [82.9526308672547]
We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses.<n>Our approach unifies various pose-centric tasks as inverse problems, solving them through variational diffusion sampling.<n>Our model consistently outperforms state-of-the-art alternatives, establishing a new benchmark for whole-body human pose prior modeling.
arXiv Detail & Related papers (2025-08-01T12:56:39Z) - Human Motion Capture from Loose and Sparse Inertial Sensors with Garment-aware Diffusion Models [25.20942802233326]
We present a new task of full-body human pose estimation using sparse, loosely attached IMU sensors.<n>We developed transformer-based diffusion models to synthesize loose IMU data and estimate human poses based on this challenging loose IMU data.
arXiv Detail & Related papers (2025-06-18T09:16:36Z) - PhysiInter: Integrating Physical Mapping for High-Fidelity Human Interaction Generation [35.563978243352764]
We introduce physical mapping, integrated throughout the human interaction generation pipeline.<n>Specifically, motion imitation within a physics-based simulation environment is used to project target motions into a physically valid space.<n>Experiments show our method achieves impressive results in generated human motion quality, with a 3%-89% improvement in physical fidelity.
arXiv Detail & Related papers (2025-06-09T06:04:49Z) - Human Motion Prediction via Test-domain-aware Adaptation with Easily-available Human Motions Estimated from Videos [12.363185535693276]
In 3D Human Motion Prediction (HMP), conventional methods train HMP models with expensive motion capture data.<n>This paper proposes to enhance HMP with additional learning using estimated poses from easily available videos.
arXiv Detail & Related papers (2025-05-12T07:45:57Z) - GENMO: A GENeralist Model for Human MOtion [64.16188966024542]
We present GENMO, a unified Generalist Model for Human Motion that bridges motion estimation and generation in a single framework.<n>Our key insight is to reformulate motion estimation as constrained motion generation, where the output motion must precisely satisfy observed conditioning signals.<n>Our novel architecture handles variable-length motions and mixed multimodal conditions (text, audio, video) at different time intervals, offering flexible control.
arXiv Detail & Related papers (2025-05-02T17:59:55Z) - A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions [56.709280823844374]
We introduce a mask-based motion correction module (MCM) that leverages motion context and video mask to repair flawed motions.
We also propose a physics-based motion transfer module (PTM), which employs a pretrain and adapt approach for motion imitation.
Our approach is designed as a plug-and-play module to physically refine the video motion capture results, including high-difficulty in-the-wild motions.
arXiv Detail & Related papers (2024-12-23T08:26:00Z) - One-shot Human Motion Transfer via Occlusion-Robust Flow Prediction and Neural Texturing [21.613055849276385]
We propose a unified framework that combines multi-scale feature warping and neural texture mapping to recover better 2D appearance and 2.5D geometry.
Our model takes advantage of multiple modalities by jointly training and fusing them, which allows it to robust neural texture features that cope with geometric errors.
arXiv Detail & Related papers (2024-12-09T03:14:40Z) - Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models [70.78051873517285]
We present MotionBase, the first million-level motion generation benchmark.
By leveraging this vast dataset, our large motion model demonstrates strong performance across a broad range of motions.
We introduce a novel 2D lookup-free approach for motion tokenization, which preserves motion information and expands codebook capacity.
arXiv Detail & Related papers (2024-10-04T10:48:54Z) - Shape Conditioned Human Motion Generation with Diffusion Model [0.0]
We propose a Shape-conditioned Motion Diffusion model (SMD), which enables the generation of motion sequences directly in mesh format.
We also propose a Spectral-Temporal Autoencoder (STAE) to leverage cross-temporal dependencies within the spectral domain.
arXiv Detail & Related papers (2024-05-10T19:06:41Z) - Scaling Up Dynamic Human-Scene Interaction Modeling [58.032368564071895]
TRUMANS is the most comprehensive motion-captured HSI dataset currently available.
It intricately captures whole-body human motions and part-level object dynamics.
We devise a diffusion-based autoregressive model that efficiently generates HSI sequences of any length.
arXiv Detail & Related papers (2024-03-13T15:45:04Z) - LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free
Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We propose a huge human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z) - IMUGPT 2.0: Language-Based Cross Modality Transfer for Sensor-Based
Human Activity Recognition [0.19791587637442667]
Cross modality transfer approaches convert existing datasets from a source modality, such as video, to a target modality (IMU)
We introduce two new extensions for IMUGPT that enhance its use for practical HAR application scenarios.
We demonstrate that our diversity metrics can reduce the effort needed for the generation of virtual IMU data by at least 50%.
arXiv Detail & Related papers (2024-02-01T22:37:33Z) - Motion Matters: Neural Motion Transfer for Better Camera Physiological
Measurement [25.27559386977351]
Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video.
We adapt a neural video synthesis approach to augment videos for the task of remote photoplethys.
We demonstrate a 47% improvement over existing inter-dataset results using various state-of-the-art methods.
arXiv Detail & Related papers (2023-03-21T17:51:23Z) - Transformer Inertial Poser: Attention-based Real-time Human Motion
Reconstruction from Sparse IMUs [79.72586714047199]
We propose an attention-based deep learning method to reconstruct full-body motion from six IMU sensors in real-time.
Our method achieves new state-of-the-art results both quantitatively and qualitatively, while being simple to implement and smaller in size.
arXiv Detail & Related papers (2022-03-29T16:24:52Z) - HuMoR: 3D Human Motion Model for Robust Pose Estimation [100.55369985297797]
HuMoR is a 3D Human Motion Model for Robust Estimation of temporal pose and shape.
We introduce a conditional variational autoencoder, which learns a distribution of the change in pose at each step of a motion sequence.
We demonstrate that our model generalizes to diverse motions and body shapes after training on a large motion capture dataset.
arXiv Detail & Related papers (2021-05-10T21:04:55Z) - Graph-based Normalizing Flow for Human Motion Generation and
Reconstruction [20.454140530081183]
We propose a probabilistic generative model to synthesize and reconstruct long horizon motion sequences conditioned on past information and control signals.
We evaluate the models on a mixture of motion capture datasets of human locomotion with foot-step and bone-length analysis.
arXiv Detail & Related papers (2021-04-07T09:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.