HumanCM: One Step Human Motion Prediction
- URL: http://arxiv.org/abs/2510.16709v2
- Date: Thu, 23 Oct 2025 12:49:30 GMT
- Title: HumanCM: One Step Human Motion Prediction
- Authors: Liu Haojie, Gao Suixiang
- Abstract summary: We present HumanCM, a one-step human motion prediction framework built upon consistency models. HumanCM performs efficient single-step generation by learning a self-consistent mapping between noisy and clean motion states.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We present HumanCM, a one-step human motion prediction framework built upon consistency models. Instead of relying on multi-step denoising as in diffusion-based methods, HumanCM performs efficient single-step generation by learning a self-consistent mapping between noisy and clean motion states. The framework adopts a Transformer-based spatiotemporal architecture with temporal embeddings to model long-range dependencies and preserve motion coherence. Experiments on Human3.6M and HumanEva-I demonstrate that HumanCM achieves comparable or superior accuracy to state-of-the-art diffusion models while reducing inference steps by up to two orders of magnitude.
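The one-step mapping described in the abstract can be sketched with the standard consistency-model parameterization. This is a minimal illustration, not HumanCM's released code: the function names, the `SIGMA_DATA` value, and the toy network standing in for the Transformer denoiser are all assumptions.

```python
import math

# Assumed data standard deviation used to scale the skip/output branches,
# following the common consistency-model formulation (an assumption here).
SIGMA_DATA = 0.5

def c_skip(t: float) -> float:
    # Weight on the identity (skip) branch; equals 1 at t = 0.
    return SIGMA_DATA ** 2 / (t ** 2 + SIGMA_DATA ** 2)

def c_out(t: float) -> float:
    # Weight on the learned branch; equals 0 at t = 0.
    return SIGMA_DATA * t / math.sqrt(t ** 2 + SIGMA_DATA ** 2)

def consistency_step(x_noisy, t, network):
    # f(x, t) = c_skip(t) * x + c_out(t) * F(x, t).
    # The boundary condition f(x, 0) = x holds by construction, so a single
    # evaluation at the chosen noise level maps a noisy motion state directly
    # to a clean-motion estimate.
    return [c_skip(t) * xi + c_out(t) * fi
            for xi, fi in zip(x_noisy, network(x_noisy, t))]

# Toy "network" standing in for the Transformer denoiser (hypothetical).
toy_net = lambda x, t: [0.0 for _ in x]

# At t = 0 the map is the identity, as the self-consistency property requires.
print(consistency_step([1.0, -2.0, 3.0], 0.0, toy_net))  # [1.0, -2.0, 3.0]
```

This parameterization is what allows inference in a single network call, in contrast to the iterative denoising loop of diffusion-based predictors.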
Related papers
- Semantic Belief-State World Model for 3D Human Motion Prediction
We propose a Semantic Belief-State World Model (SBWM) that reframes human motion prediction as latent dynamical simulation on the human body manifold. Inspired by belief-state world models developed for model-based reinforcement learning, SBWM adapts latent transitions and rollout-centric training to the domain of human motion.
arXiv Detail & Related papers (2026-01-07T02:06:26Z)
- FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
FlowMo is a training-free guidance method for enhancing motion coherence in text-to-video models. It estimates motion coherence by measuring the patch-wise variance across the temporal dimension and guides the model to reduce this variance dynamically during sampling.
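The coherence signal FlowMo guides on can be illustrated with a small sketch. The names and shapes here are illustrative assumptions, not FlowMo's actual API: each spatial patch is scored by its variance across time, and guidance pushes sampling toward lower variance.

```python
def temporal_patch_variance(frames):
    """frames: list of T frames, each a flat list of per-patch values.
    Returns the variance of each patch across the temporal dimension."""
    T = len(frames)
    n_patches = len(frames[0])
    variances = []
    for p in range(n_patches):
        vals = [frames[t][p] for t in range(T)]
        mean = sum(vals) / T
        variances.append(sum((v - mean) ** 2 for v in vals) / T)
    return variances

# A static clip has zero temporal variance; a flickering one does not.
static = [[1.0, 2.0]] * 4
flicker = [[1.0, 2.0], [3.0, 0.0], [1.0, 2.0], [3.0, 0.0]]
print(temporal_patch_variance(static))   # [0.0, 0.0]
print(temporal_patch_variance(flicker))  # [1.0, 1.0]
```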
arXiv Detail & Related papers (2025-06-01T19:55:33Z)
- REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
We present REWIND, a one-step diffusion model for real-time, high-fidelity human motion estimation from egocentric image inputs. We introduce cascaded body-hand denoising diffusion, which effectively models the correlation between egocentric body and hand motions. We also propose a novel identity conditioning method based on a small set of pose exemplars of the target identity, which further enhances motion estimation quality.
arXiv Detail & Related papers (2025-04-07T11:44:11Z)
- MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty
This paper introduces a Multi-modal Diffusion model for Motion Prediction (MDMP). It integrates skeletal data and textual descriptions of actions to generate refined long-term motion predictions with quantifiable uncertainty. The model consistently outperforms existing generative techniques in accurately predicting long-term motions.
arXiv Detail & Related papers (2024-10-04T18:49:00Z)
- Bayesian-Optimized One-Step Diffusion Model with Knowledge Distillation for Real-Time 3D Human Motion Prediction
We propose training a one-step multi-layer perceptron-based (MLP-based) diffusion model for motion prediction using knowledge distillation and Bayesian optimization.
Our model can significantly improve the inference speed, achieving real-time prediction without noticeable degradation in performance.
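The distillation objective summarized above can be sketched in a few lines. Everything here is a hypothetical toy, not the paper's implementation: a one-step student is trained to match the output a multi-step teacher reaches after its full denoising loop.

```python
def mse(a, b):
    # Mean squared error between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def teacher_multi_step(x, steps, denoise):
    # Iterative denoising: the teacher applies its denoiser repeatedly.
    for _ in range(steps):
        x = denoise(x)
    return x

def distillation_loss(x_noisy, student, teacher_steps, denoise):
    # The student's single call is regressed onto the teacher's endpoint.
    target = teacher_multi_step(list(x_noisy), teacher_steps, denoise)
    return mse(student(x_noisy), target)

# Toy setting: the denoiser halves the signal toward zero each step, and the
# ideal one-step student collapses four teacher steps into one call.
halve = lambda x: [0.5 * v for v in x]
student_exact = lambda x: [v * 0.5 ** 4 for v in x]

loss = distillation_loss([8.0, -8.0], student_exact, 4, halve)
print(loss)  # 0.0 -- the student reproduces the teacher in a single call
```

Matching the teacher's endpoint rather than its intermediate states is what lets the distilled model skip the iterative loop at inference time.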
arXiv Detail & Related papers (2024-09-19T04:36:40Z)
- MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds
We present MoManifold, a novel human motion prior, which models plausible human motion in continuous high-dimensional motion space.
Specifically, we propose novel decoupled joint acceleration to model human dynamics from existing limited motion data.
Extensive experiments demonstrate that MoManifold outperforms existing SOTAs as a prior in several downstream tasks.
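The joint-acceleration quantity the prior is built on can be computed from a position sequence with second-order finite differences, a(t) = x(t+1) - 2*x(t) + x(t-1). This is an illustrative sketch, not MoManifold's code, and treats a single joint in one dimension.

```python
def joint_accelerations(positions):
    """positions: list of per-frame coordinates for one joint (1-D here).
    Returns the second-order finite-difference acceleration at each
    interior frame."""
    return [positions[t + 1] - 2.0 * positions[t] + positions[t - 1]
            for t in range(1, len(positions) - 1)]

# Constant-velocity motion has zero acceleration; an abrupt stop does not.
print(joint_accelerations([0.0, 1.0, 2.0, 3.0]))  # [0.0, 0.0]
print(joint_accelerations([0.0, 1.0, 1.0, 1.0]))  # [-1.0, 0.0]
```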
arXiv Detail & Related papers (2024-09-01T15:00:16Z)
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- DPoser: Diffusion Model as Robust 3D Human Pose Prior
We introduce DPoser, a robust and versatile human pose prior built upon diffusion models.
DPoser regards various pose-centric tasks as inverse problems and employs variational diffusion sampling for efficient solving.
Our approach demonstrates considerable enhancements over common uniform scheduling used in image domains, boasting improvements of 5.4%, 17.2%, and 3.8% across human mesh recovery, pose completion, and motion denoising, respectively.
arXiv Detail & Related papers (2023-12-09T11:18:45Z)
- TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that utilize extra modules like cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model.
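The "everything is a token" design noted above can be sketched as follows. The shapes and names are assumptions for illustration, not TransFusion's actual code: condition vectors are simply concatenated with the motion tokens along the sequence axis, so plain self-attention conditions the model with no extra modules.

```python
def build_token_sequence(motion_tokens, condition_tokens, diffusion_step_token):
    """Each token is a list of floats of equal dimension; the Transformer
    then attends over the combined sequence, with no cross-attention or
    adaptive layer normalization needed."""
    dim = len(diffusion_step_token)
    assert all(len(t) == dim for t in motion_tokens + condition_tokens)
    return [diffusion_step_token] + condition_tokens + motion_tokens

motion = [[0.1, 0.2], [0.3, 0.4]]  # pose tokens to denoise (hypothetical)
cond = [[0.9, 0.9]]                # e.g. embedded past-motion context
step = [0.0, 1.0]                  # embedded diffusion timestep
seq = build_token_sequence(motion, cond, step)
print(len(seq))  # 4 -- one step token, one condition token, two motion tokens
```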
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
- Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction
Humans are highly adaptable, swiftly switching between different modes to handle different tasks, situations and contexts.
In human-object interaction (HOI) activities, these modes can be attributed to two mechanisms: (1) the large-scale consistent plan for the whole activity and (2) the small-scale child interactive actions that start and end along the timeline.
This work proposes to model two concurrent mechanisms that jointly control human motion.
arXiv Detail & Related papers (2023-07-24T12:21:33Z)
- Executing your Commands via Motion Diffusion in Latent Space
We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over the state-of-the-art methods among extensive human motion generation tasks.
arXiv Detail & Related papers (2022-12-08T03:07:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.