Related papers: Human Motion Prediction via Test-domain-aware Adaptation with Easily-available Human Motions Estimated from Videos

URL: http://arxiv.org/abs/2505.07301v2
Date: Tue, 13 May 2025 11:34:56 GMT
Title: Human Motion Prediction via Test-domain-aware Adaptation with Easily-available Human Motions Estimated from Videos
Authors: Katsuki Shimbo, Hiromu Taketsugu, Norimichi Ukita
Abstract summary: In 3D Human Motion Prediction (HMP), conventional methods train HMP models with expensive motion capture data. This paper proposes to enhance HMP with additional learning using estimated poses from easily available videos.
Score: 12.363185535693276
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In 3D Human Motion Prediction (HMP), conventional methods train HMP models with expensive motion capture data. However, the data collection cost of such motion capture data limits the data diversity, which leads to poor generalizability to unseen motions or subjects. To address this issue, this paper proposes to enhance HMP with additional learning using estimated poses from easily available videos. The 2D poses estimated from the monocular videos are carefully transformed into motion capture-style 3D motions through our pipeline. By additional learning with the obtained motions, the HMP model is adapted to the test domain. The experimental results demonstrate the quantitative and qualitative impact of our method.

Related papers

Diffusion Model-based Activity Completion for AI Motion Capture from Videos [2.9271399793140076]
Current AI motion capture methods rely entirely on observed video sequences, similar to conventional motion capture. We propose a diffusion-model-based action completion technique that generates complementary human motion sequences. By introducing a gate module and a position-time embedding module, our approach achieves competitive results on the Human3.6M dataset.
arXiv Detail & Related papers (2025-05-27T05:04:50Z)
A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions [56.709280823844374]
We introduce a mask-based motion correction module (MCM) that leverages motion context and video mask to repair flawed motions. We also propose a physics-based motion transfer module (PTM), which employs a pretrain and adapt approach for motion imitation. Our approach is designed as a plug-and-play module to physically refine the video motion capture results, including high-difficulty in-the-wild motions.
arXiv Detail & Related papers (2024-12-23T08:26:00Z)
MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting [56.785233997533794]
We propose a novel deformable 3D Gaussian splatting framework called MotionGS. MotionGS explores explicit motion priors to guide the deformation of 3D Gaussians. Experiments in the monocular dynamic scenes validate that MotionGS surpasses state-of-the-art methods.
arXiv Detail & Related papers (2024-10-10T08:19:47Z)
COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation [98.05046790227561]
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions. COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos. Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion. Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific for hands, trained on the AMASS dataset which features diverse and high-quality hand motions. Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios. We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z)
DiffMesh: A Motion-aware Diffusion Framework for Human Mesh Recovery from Videos [20.895221536570627]
Human mesh recovery (HMR) provides rich human body information for various real-world applications. Video-based approaches leverage temporal information to mitigate this issue. We present DiffMesh, an innovative motion-aware Diffusion-like framework for video-based HMR.
arXiv Detail & Related papers (2023-03-23T16:15:18Z)
Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement [25.27559386977351]
Body motion is one of the most significant sources of noise when attempting to recover the subtle cardiac pulse from a video. We adapt a neural video synthesis approach to augment videos for the task of remote photoplethys. We demonstrate a 47% improvement over existing inter-dataset results using various state-of-the-art methods.
arXiv Detail & Related papers (2023-03-21T17:51:23Z)
HuMoR: 3D Human Motion Model for Robust Pose Estimation [100.55369985297797]
HuMoR is a 3D Human Motion Model for Robust Estimation of temporal pose and shape. We introduce a conditional variational autoencoder, which learns a distribution of the change in pose at each step of a motion sequence. We demonstrate that our model generalizes to diverse motions and body shapes after training on a large motion capture dataset.
arXiv Detail & Related papers (2021-05-10T21:04:55Z)
Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space Multi-Person Video Motion Capture in the Wild [3.0015034534260665]
We propose a markerless motion capture method with accuracy and smoothness from multiple cameras. The proposed method predicts each person's 3D pose and determines the bounding box of multi-camera images. We evaluated the proposed method using various datasets and a real sports field.
arXiv Detail & Related papers (2020-01-16T02:14:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.