HumanCM: One Step Human Motion Prediction
- URL: http://arxiv.org/abs/2510.16709v2
- Date: Thu, 23 Oct 2025 12:49:30 GMT
- Title: HumanCM: One Step Human Motion Prediction
- Authors: Liu Haojie, Gao Suixiang
- Abstract summary: We present HumanCM, a one-step human motion prediction framework built upon consistency models. HumanCM performs efficient single-step generation by learning a self-consistent mapping between noisy and clean motion states.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We present HumanCM, a one-step human motion prediction framework built upon consistency models. Instead of relying on multi-step denoising as in diffusion-based methods, HumanCM performs efficient single-step generation by learning a self-consistent mapping between noisy and clean motion states. The framework adopts a Transformer-based spatiotemporal architecture with temporal embeddings to model long-range dependencies and preserve motion coherence. Experiments on Human3.6M and HumanEva-I demonstrate that HumanCM achieves comparable or superior accuracy to state-of-the-art diffusion models while reducing inference steps by up to two orders of magnitude.
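The one-step mapping described in the abstract can be sketched with the standard consistency-model parameterization. This is a minimal illustration, not HumanCM's released code: the function names, the `SIGMA_DATA` value, and the toy network standing in for the Transformer denoiser are all assumptions.

```python
import math

# Assumed data standard deviation used to scale the skip/output branches,
# following the common consistency-model formulation (an assumption here).
SIGMA_DATA = 0.5

def c_skip(t: float) -> float:
    # Weight on the identity (skip) branch; equals 1 at t = 0.
    return SIGMA_DATA ** 2 / (t ** 2 + SIGMA_DATA ** 2)

def c_out(t: float) -> float:
    # Weight on the learned branch; equals 0 at t = 0.
    return SIGMA_DATA * t / math.sqrt(t ** 2 + SIGMA_DATA ** 2)

def consistency_step(x_noisy, t, network):
    # f(x, t) = c_skip(t) * x + c_out(t) * F(x, t).
    # The boundary condition f(x, 0) = x holds by construction, so a single
    # evaluation at the chosen noise level maps a noisy motion state directly
    # to a clean-motion estimate.
    return [c_skip(t) * xi + c_out(t) * fi
            for xi, fi in zip(x_noisy, network(x_noisy, t))]

# Toy "network" standing in for the Transformer denoiser (hypothetical).
toy_net = lambda x, t: [0.0 for _ in x]

# At t = 0 the map is the identity, as the self-consistency property requires.
print(consistency_step([1.0, -2.0, 3.0], 0.0, toy_net))  # [1.0, -2.0, 3.0]
```

This parameterization is what allows inference in a single network call, in contrast to the iterative denoising loop of diffusion-based predictors.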
Related papers
- Semantic Belief-State World Model for 3D Human Motion Prediction
We propose a Semantic Belief-State World Model (SBWM) that reframes human motion prediction as latent dynamical simulation on the human body manifold. Inspired by belief-state world models developed for model-based reinforcement learning, SBWM adapts latent transitions and rollout-centric training to the domain of human motion.
arXiv Detail & Related papers (2026-01-07T02:06:26Z)
- FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
FlowMo is a training-free guidance method for enhancing motion coherence in text-to-video models. It estimates motion coherence by measuring the patch-wise variance across the temporal dimension and guides the model to reduce this variance dynamically during sampling.
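The coherence signal FlowMo guides on can be illustrated with a small sketch. The names and shapes here are illustrative assumptions, not FlowMo's actual API: each spatial patch is scored by its variance across time, and guidance pushes sampling toward lower variance.

```python
def temporal_patch_variance(frames):
    """frames: list of T frames, each a flat list of per-patch values.
    Returns the variance of each patch across the temporal dimension."""
    T = len(frames)
    n_patches = len(frames[0])
    variances = []
    for p in range(n_patches):
        vals = [frames[t][p] for t in range(T)]
        mean = sum(vals) / T
        variances.append(sum((v - mean) ** 2 for v in vals) / T)
    return variances

# A static clip has zero temporal variance; a flickering one does not.
static = [[1.0, 2.0]] * 4
flicker = [[1.0, 2.0], [3.0, 0.0], [1.0, 2.0], [3.0, 0.0]]
print(temporal_patch_variance(static))   # [0.0, 0.0]
print(temporal_patch_variance(flicker))  # [1.0, 1.0]
```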
arXiv Detail & Related papers (2025-06-01T19:55:33Z)
- REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
We present REWIND, a one-step diffusion model for real-time, high-fidelity human motion estimation from egocentric image inputs. We introduce cascaded body-hand denoising diffusion, which effectively models the correlation between egocentric body and hand motions. We also propose a novel identity conditioning method based on a small set of pose exemplars of the target identity, which further enhances motion estimation quality.
arXiv Detail & Related papers (2025-04-07T11:44:11Z)
- MDMP: Multi-modal Diffusion for supervised Motion Predictions with uncertainty
This paper introduces a Multi-modal Diffusion model for Motion Prediction (MDMP). It integrates skeletal data and textual descriptions of actions to generate refined long-term motion predictions with quantifiable uncertainty. The model consistently outperforms existing generative techniques in accurately predicting long-term motions.
arXiv Detail & Related papers (2024-10-04T18:49:00Z)
- Bayesian-Optimized One-Step Diffusion Model with Knowledge Distillation for Real-Time 3D Human Motion Prediction
We propose training a one-step multi-layer perceptron-based (MLP-based) diffusion model for motion prediction using knowledge distillation and Bayesian optimization.
Our model can significantly improve the inference speed, achieving real-time prediction without noticeable degradation in performance.
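The distillation objective summarized above can be sketched in a few lines. Everything here is a hypothetical toy, not the paper's implementation: a one-step student is trained to match the output a multi-step teacher reaches after its full denoising loop.

```python
def mse(a, b):
    # Mean squared error between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def teacher_multi_step(x, steps, denoise):
    # Iterative denoising: the teacher applies its denoiser repeatedly.
    for _ in range(steps):
        x = denoise(x)
    return x

def distillation_loss(x_noisy, student, teacher_steps, denoise):
    # The student's single call is regressed onto the teacher's endpoint.
    target = teacher_multi_step(list(x_noisy), teacher_steps, denoise)
    return mse(student(x_noisy), target)

# Toy setting: the denoiser halves the signal toward zero each step, and the
# ideal one-step student collapses four teacher steps into one call.
halve = lambda x: [0.5 * v for v in x]
student_exact = lambda x: [v * 0.5 ** 4 for v in x]

loss = distillation_loss([8.0, -8.0], student_exact, 4, halve)
print(loss)  # 0.0 -- the student reproduces the teacher in a single call
```

Matching the teacher's endpoint rather than its intermediate states is what lets the distilled model skip the iterative loop at inference time.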
arXiv Detail & Related papers (2024-09-19T04:36:40Z)
- MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds
We present MoManifold, a novel human motion prior, which models plausible human motion in continuous high-dimensional motion space.
Specifically, we propose novel decoupled joint acceleration to model human dynamics from existing limited motion data.
Extensive experiments demonstrate that MoManifold outperforms existing SOTAs as a prior in several downstream tasks.
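The joint-acceleration quantity the prior is built on can be computed from a position sequence with second-order finite differences, a(t) = x(t+1) - 2*x(t) + x(t-1). This is an illustrative sketch, not MoManifold's code, and treats a single joint in one dimension.

```python
def joint_accelerations(positions):
    """positions: list of per-frame coordinates for one joint (1-D here).
    Returns the second-order finite-difference acceleration at each
    interior frame."""
    return [positions[t + 1] - 2.0 * positions[t] + positions[t - 1]
            for t in range(1, len(positions) - 1)]

# Constant-velocity motion has zero acceleration; an abrupt stop does not.
print(joint_accelerations([0.0, 1.0, 2.0, 3.0]))  # [0.0, 0.0]
print(joint_accelerations([0.0, 1.0, 1.0, 1.0]))  # [-1.0, 0.0]
```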
arXiv Detail & Related papers (2024-09-01T15:00:16Z)
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
COIN is a control-inpainting motion diffusion prior that enables fine-grained control to disentangle human and camera motions.
COIN outperforms the state-of-the-art methods in terms of global human motion estimation and camera motion estimation.
arXiv Detail & Related papers (2024-08-29T10:36:29Z)
- DPoser: Diffusion Model as Robust 3D Human Pose Prior
We introduce DPoser, a robust and versatile human pose prior built upon diffusion models.
DPoser regards various pose-centric tasks as inverse problems and employs variational diffusion sampling for efficient solving.
Our approach demonstrates considerable enhancements over common uniform scheduling used in image domains, boasting improvements of 5.4%, 17.2%, and 3.8% across human mesh recovery, pose completion, and motion denoising, respectively.
arXiv Detail & Related papers (2023-12-09T11:18:45Z)
- TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction
We propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction.
Our model leverages Transformer as the backbone with long skip connections between shallow and deep layers.
In contrast to prior diffusion-based models that utilize extra modules like cross-attention and adaptive layer normalization, we treat all inputs, including conditions, as tokens to create a more lightweight model.
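The "everything is a token" design noted above can be sketched as follows. The shapes and names are assumptions for illustration, not TransFusion's actual code: condition vectors are simply concatenated with the motion tokens along the sequence axis, so plain self-attention conditions the model with no extra modules.

```python
def build_token_sequence(motion_tokens, condition_tokens, diffusion_step_token):
    """Each token is a list of floats of equal dimension; the Transformer
    then attends over the combined sequence, with no cross-attention or
    adaptive layer normalization needed."""
    dim = len(diffusion_step_token)
    assert all(len(t) == dim for t in motion_tokens + condition_tokens)
    return [diffusion_step_token] + condition_tokens + motion_tokens

motion = [[0.1, 0.2], [0.3, 0.4]]  # pose tokens to denoise (hypothetical)
cond = [[0.9, 0.9]]                # e.g. embedded past-motion context
step = [0.0, 1.0]                  # embedded diffusion timestep
seq = build_token_sequence(motion, cond, step)
print(len(seq))  # 4 -- one step token, one condition token, two motion tokens
```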
arXiv Detail & Related papers (2023-07-30T01:52:07Z)
- Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction
Humans are highly adaptable, swiftly switching between different modes to handle different tasks, situations and contexts.
In human-object interaction (HOI) activities, these modes can be attributed to two mechanisms: (1) the large-scale consistent plan for the whole activity and (2) the small-scale child interactive actions that start and end along the timeline.
This work proposes to model two concurrent mechanisms that jointly control human motion.
arXiv Detail & Related papers (2023-07-24T12:21:33Z)
- Executing your Commands via Motion Diffusion in Latent Space
We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over the state-of-the-art methods among extensive human motion generation tasks.
arXiv Detail & Related papers (2022-12-08T03:07:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.