Related papers: QuaMo: Quaternion Motions for Vision-based 3D Human Kinematics Capture

URL: http://arxiv.org/abs/2601.19580v1
Date: Tue, 27 Jan 2026 13:12:08 GMT
Title: QuaMo: Quaternion Motions for Vision-based 3D Human Kinematics Capture
Authors: Cuong Le, Pavlo Melnyk, Urs Waldmann, Mårten Wadenbäck, Bastian Wandt,
Abstract summary: Vision-based 3D human motion capture from videos remains a challenge in computer vision. Traditional 3D pose estimation approaches often ignore the temporal consistency between frames, causing implausible and jittery motion. We propose QuaMo, a novel Quaternion Motions method using quaternion differential equations (QDE) for human kinematics capture.
Score: 14.741577220592161
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vision-based 3D human motion capture from videos remains a challenge in computer vision. Traditional 3D pose estimation approaches often ignore the temporal consistency between frames, causing implausible and jittery motion. The emerging field of kinematics-based 3D motion capture addresses these issues by estimating the temporal transitioning between poses instead. A major drawback in current kinematics approaches is their reliance on Euler angles. Despite their simplicity, Euler angles suffer from discontinuity that leads to unstable motion reconstructions, especially in online settings where trajectory refinement is unavailable. Contrarily, quaternions have no discontinuity and can produce continuous transitions between poses. In this paper, we propose QuaMo, a novel Quaternion Motions method using quaternion differential equations (QDE) for human kinematics capture. We utilize the state-space model, an effective system for describing real-time kinematics estimations, with quaternion state and the QDE describing quaternion velocity. The corresponding angular acceleration is computed from a meta-PD controller with a novel acceleration enhancement that adaptively regulates the control signals as the human quickly changes to a new pose. Unlike previous work, our QDE is solved under the quaternion unit-sphere constraint that results in more accurate estimations. Experimental results show that our novel formulation of the QDE with acceleration enhancement accurately estimates 3D human kinematics with no discontinuity and minimal implausibilities. QuaMo outperforms comparable state-of-the-art methods on multiple datasets, namely Human3.6M, Fit3D, SportsPose and AIST. The code is available at https://github.com/cuongle1206/QuaMo
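As a rough illustration of the ideas named in the abstract (a quaternion state, a quaternion differential equation driven by angular velocity, a PD-style controller producing angular acceleration, and integration that stays on the unit sphere), here is a minimal sketch. It is not the authors' implementation: the function names, the gains `kp` and `kd`, the time step, and the use of a plain PD rule in place of the paper's meta-PD controller and acceleration enhancement are all assumptions made for illustration.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conj(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    return (q[0], -q[1], -q[2], -q[3])

def quat_exp(v):
    """Exponential of the pure quaternion (0, v); the result has unit norm."""
    angle = math.sqrt(sum(c * c for c in v))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle) / angle
    return (math.cos(angle), v[0] * s, v[1] * s, v[2] * s)

def rotation_vector(q):
    """Axis-angle rotation vector (2 * log q) of a unit quaternion."""
    if q[0] < 0.0:  # q and -q encode the same rotation; take the short path
        q = tuple(-c for c in q)
    n = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    if n < 1e-12:
        return (0.0, 0.0, 0.0)
    scale = 2.0 * math.atan2(n, q[0]) / n
    return (q[1] * scale, q[2] * scale, q[3] * scale)

def track_orientation(q0, q_target, kp=100.0, kd=20.0, dt=0.01, steps=500):
    """Drive q0 toward q_target with a plain PD law on body-frame angular velocity.

    Hypothetical gains and step size; kd = 2 * sqrt(kp) gives critical damping.
    """
    q = q0
    omega = (0.0, 0.0, 0.0)
    for _ in range(steps):
        # Orientation error expressed in the body frame as a rotation vector.
        err = rotation_vector(quat_mul(quat_conj(q), q_target))
        # PD control signal: angular acceleration.
        alpha = tuple(kp * e - kd * w for e, w in zip(err, omega))
        omega = tuple(w + a * dt for w, a in zip(omega, alpha))
        # Integrate q_dot = 0.5 * q * (0, omega) with the exponential map,
        # so the update is a product of unit quaternions.
        q = quat_mul(q, quat_exp(tuple(0.5 * w * dt for w in omega)))
    return q, omega
```

Calling `track_orientation((1.0, 0.0, 0.0, 0.0), q_target)` with `q_target` set to a 90 degree rotation about the z-axis should converge to that target. Because each step multiplies by a unit quaternion rather than adding a velocity term and renormalising, the state leaves the unit sphere only through floating-point rounding.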

Related papers

Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis [53.48281548500864]
Motion 3-to-4 is a feed-forward framework for synthesising high-quality 4D dynamic objects from a single monocular video. Our model learns a compact motion latent representation and predicts per-frame trajectories to recover complete, robust, temporally coherent geometry.
arXiv Detail & Related papers (2026-01-20T18:59:48Z)
DragMesh: Interactive 3D Generation Made Easy [12.832539752284466]
DragMesh is a robust framework for real-time interactive 3D articulation. Our core contribution is a novel decoupled kinematic reasoning and motion generation framework.
arXiv Detail & Related papers (2025-12-06T13:10:44Z)
Geometric Neural Distance Fields for Learning Human Motion Priors [51.99890740169883]
We introduce a novel 3D generative human motion prior that enables robust, temporally consistent, and physically plausible 3D motion recovery. Our experiments show significant and consistent gains: trained on the AMASS dataset, NRMF remarkably generalizes across multiple input modalities.
arXiv Detail & Related papers (2025-09-11T17:58:18Z)
EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation [59.33052312107478]
Event cameras offer possibilities for 3D motion estimation through continuous adaptive pixel-level responses to scene changes. This paper presents EMoTive, a novel event-based framework that models non-uniform trajectories via event-guided parametric curves. For motion representation, we introduce a density-aware adaptation mechanism to fuse spatial and temporal features under event guidance. The final 3D motion estimation is achieved through multi-temporal sampling of parametric trajectories, flows and depth motion fields.
arXiv Detail & Related papers (2025-03-14T13:15:54Z)
CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion-Blurred Images [14.738528284246545]
CRiM-GS is a Continuous Rigid Motion-aware Gaussian Splatting method. It reconstructs precise 3D scenes from motion-blurred images while maintaining real-time rendering speed.
arXiv Detail & Related papers (2024-07-04T13:37:04Z)
Kinematics and Dynamics Modeling of 7 Degrees of Freedom Human Lower Limb Using Dual Quaternions Algebra [0.0]
This paper exploits dual quaternion theory to provide a fast and accurate solution for the forward and inverse kinematics and the Newton-Euler dynamics algorithm.
arXiv Detail & Related papers (2023-02-22T19:02:47Z)
Towards Single Camera Human 3D-Kinematics [15.559206592078425]
We propose a novel approach for direct 3D human kinematic estimation (D3KE) from videos using deep neural networks. Our experiments demonstrate that the proposed end-to-end training is robust and outperforms 2D and 3D markerless motion capture based kinematic estimation pipelines.
arXiv Detail & Related papers (2023-01-13T08:44:09Z)
MotionBERT: A Unified Perspective on Learning Human Motion Representations [46.67364057245364]
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources. We propose a pretraining stage in which a motion encoder is trained to recover the underlying 3D motion from noisy partial 2D observations. We implement the motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer) neural network.
arXiv Detail & Related papers (2022-10-12T19:46:25Z)
D&D: Learning Human Dynamics from Dynamic Camera [55.60512353465175]
We present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from in-the-wild videos with a moving camera. Our approach is entirely neural-based and runs without offline optimization or simulation in physics engines.
arXiv Detail & Related papers (2022-09-19T06:51:02Z)
Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video. Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z)
Neural Monocular 3D Human Motion Capture with Physical Awareness [76.55971509794598]
We present a new trainable system for physically plausible markerless 3D human motion capture. Unlike most neural methods for human motion capture, our approach is aware of physical and environmental constraints. It produces smooth and physically principled 3D motions at an interactive frame rate in a wide variety of challenging scenes.
arXiv Detail & Related papers (2021-05-03T17:57:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.