Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation
- URL: http://arxiv.org/abs/2602.13778v1
- Date: Sat, 14 Feb 2026 13:48:13 GMT
- Title: Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation
- Authors: Jidong Jia, Youjian Zhang, Huan Fu, Dacheng Tao
- Abstract summary: Motions that look plausible as joint trajectories often exhibit body self-penetration and Foot-Ground Contact (FGC) anomalies when visualized with a human body mesh. We address this skeleton-to-mesh gap by deriving physics-based rewards from the body mesh. Our method can significantly improve the physical plausibility of generated motions, yielding more realistic and aesthetically pleasing dances.
- Score: 49.50118203284611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite advances in dance generation, most methods are trained in the skeletal domain and ignore mesh-level physical constraints. As a result, motions that look plausible as joint trajectories often exhibit body self-penetration and Foot-Ground Contact (FGC) anomalies when visualized with a human body mesh, reducing the aesthetic appeal of generated dances and limiting their real-world applications. We address this skeleton-to-mesh gap by deriving physics-based rewards from the body mesh and applying Reinforcement Learning Fine-Tuning (RLFT) to steer the diffusion model toward physically plausible motion synthesis under mesh visualization. Our reward design combines (i) an imitation reward that measures a motion's general plausibility by its imitability in a physical simulator (penalizing penetration and foot skating), and (ii) a Foot-Ground Deviation (FGD) reward with test-time FGD guidance to better capture the dynamic foot-ground interaction in dance. However, we find that the physics-based rewards tend to push the model toward near-frozen motions, which trivially incur fewer physical anomalies and are easier to imitate. To mitigate this, we propose an anti-freezing reward that preserves motion dynamics while maintaining physical plausibility. Experiments on multiple dance datasets consistently demonstrate that our method significantly improves the physical plausibility of generated motions, yielding more realistic and aesthetically pleasing dances. The project page is available at: https://jjd1123.github.io/Skeleton2Stage/
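For concreteness, here is a minimal Python sketch of how the three reward terms above might be combined; the weights, the simulator interface, and the exact form of each term are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def imitation_reward(motion, simulator):
    """High when a physics simulator can closely track the motion
    (tracking failures flag penetration, foot skating, etc.)."""
    tracked = simulator.imitate(motion)        # hypothetical tracker API
    return -np.mean((tracked - motion) ** 2)   # negative tracking error

def fgd_reward(foot_heights, contact_mask, ground=0.0):
    """Foot-Ground Deviation: penalize feet hovering above or sinking
    below the ground on frames labeled as in contact."""
    dev = np.abs(foot_heights[contact_mask] - ground)
    return -dev.mean() if dev.size else 0.0

def anti_freezing_reward(joints, eps=1e-3):
    """Reward average joint speed so fine-tuning cannot collapse the
    model onto near-static ("frozen") motions."""
    vel = np.diff(joints, axis=0)              # (T-1, J, 3) frame-to-frame velocity
    return float(np.log(eps + np.linalg.norm(vel, axis=-1).mean()))

def total_reward(motion, feet, contacts, joints, sim, w=(1.0, 1.0, 0.1)):
    return (w[0] * imitation_reward(motion, sim)
            + w[1] * fgd_reward(feet, contacts)
            + w[2] * anti_freezing_reward(joints))
```

In this reading, the anti-freezing term works against the other two: a frozen pose trivially maximizes imitability and foot-ground consistency, so some reward for motion energy is needed to keep the dance dynamic.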
Related papers
- PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation [53.06495362038348]
Existing generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. We introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos.
arXiv Detail & Related papers (2025-09-24T17:58:04Z)
- Half-Physics: Enabling Kinematic 3D Human Model with Physical Interactions [89.88331682333198]
We introduce a novel approach that embeds SMPL-X into a tangible entity capable of dynamic physical interactions with its surroundings. Our approach maintains kinematic control over inherent SMPL-X poses while ensuring physically plausible interactions with scenes and objects. Unlike reinforcement learning-based methods, which demand extensive and complex training, our half-physics method is learning-free and generalizes to any body shape and motion.
arXiv Detail & Related papers (2025-07-31T17:58:33Z)
- PhysiInter: Integrating Physical Mapping for High-Fidelity Human Interaction Generation [35.563978243352764]
We introduce physical mapping, integrated throughout the human interaction generation pipeline. Specifically, motion imitation within a physics-based simulation environment is used to project target motions into a physically valid space. Experiments show our method achieves impressive results in generated human motion quality, with a 3%-89% improvement in physical fidelity.
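As a rough illustration of this projection step, the sketch below tracks a kinematic target motion with a simulated character and keeps the simulated result; the `sim` and `controller` interfaces are hypothetical placeholders.

```python
# Minimal sketch of "physical mapping": imitate a kinematic target motion
# inside a physics simulator and keep the simulated, physically valid result.
# The `sim` and `controller` APIs are hypothetical.
def project_to_physical(target_motion, sim, controller):
    sim.reset(target_motion[0])                   # start from the first target pose
    projected = []
    for target_pose in target_motion:
        torques = controller.track(sim.state(), target_pose)  # PD-style tracking
        sim.step(torques)                         # simulator enforces contact, no penetration
        projected.append(sim.current_pose())      # physically valid pose
    return projected
```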
arXiv Detail & Related papers (2025-06-09T06:04:49Z)
- PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation [51.2555550979386]
Plausibility-Aware Motion Diffusion (PAMD) is a framework for generating dances that are both musically aligned and physically realistic. To provide more effective guidance during generation, we incorporate Prior Motion Guidance (PMG). Experiments show that PAMD significantly improves musical alignment and enhances the physical plausibility of generated motions.
arXiv Detail & Related papers (2025-05-26T14:44:09Z)
- FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance [6.0892117964531955]
FinePhys is a fine-grained human action generation framework that incorporates physics to obtain effective skeletal guidance. FinePhys first estimates 2D poses in an online manner and then performs 2D-to-3D dimension lifting via in-context learning. To mitigate the instability and limited interpretability of purely data-driven 3D poses, we introduce a physics-based motion re-estimation module.
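Read as a pipeline, the three stages might compose as in this schematic (every function here is a placeholder, not FinePhys's actual API):

```python
# Schematic of the described three-stage pipeline; all names are placeholders.
def fine_phys_pipeline(frames, pose2d_model, lifter, physics_module):
    poses2d = [pose2d_model(f) for f in frames]   # online 2D pose estimation
    poses3d = lifter(poses2d)                     # 2D-to-3D dimension lifting
    return physics_module(poses3d)                # physics-based re-estimation
```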
arXiv Detail & Related papers (2025-05-19T17:58:11Z)
- Morph: A Motion-free Physics Optimization Framework for Human Motion Generation [28.009524143770076]
Current motion generation approaches disregard physics constraints, resulting in physically implausible motions. We propose Morph, a framework for training an effective motion physics module with noisy motion data. Our framework achieves state-of-the-art motion quality while drastically improving physical plausibility.
arXiv Detail & Related papers (2024-11-22T14:09:56Z)
- ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model [9.525806425270428]
We present ReinDiffuse, which combines reinforcement learning with a motion diffusion model to generate physically credible human motions.
Our method adapts Motion Diffusion Model to output a parameterized distribution of actions, making them compatible with reinforcement learning paradigms.
Our approach outperforms existing state-of-the-art models on two major datasets, HumanML3D and KIT-ML.
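One way to picture this design: each reverse-diffusion step outputs a Gaussian over the next state, so the denoising trajectory becomes a stochastic policy that admits a REINFORCE-style update against a physics reward. The sketch below is an illustration under that assumption, not the paper's code.

```python
import torch

def rl_finetune_step(model, x_T, timesteps, reward_fn, optimizer):
    """Treat each denoising step as a Gaussian 'action' and apply REINFORCE.
    `model(x, t)` is assumed to return (mean, std) of x_{t-1}."""
    x, log_prob = x_T, 0.0
    for t in timesteps:                         # reverse diffusion as a trajectory
        mean, std = model(x, t)
        dist = torch.distributions.Normal(mean, std)
        x = dist.sample()                       # stochastic denoising action
        log_prob = log_prob + dist.log_prob(x).sum()
    reward = reward_fn(x)                       # physics-based score of final motion
    loss = -reward * log_prob                   # REINFORCE gradient estimator
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return reward
```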
arXiv Detail & Related papers (2024-10-09T16:24:11Z)
- Physics-Guided Human Motion Capture with Pose Probability Modeling [35.159506668475565]
Existing solutions typically adopt kinematic results as reference motions and treat physics as a post-processing module.
We employ physics as denoising guidance in the reverse diffusion process to reconstruct human motion from a modeled pose probability distribution.
With several iterations, the physics-based tracking and kinematic denoising promote each other to generate a physically plausible human motion.
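A compressed sketch of that alternation, with placeholder `model` and `simulator` interfaces:

```python
# Interleave kinematic denoising with physics-based tracking so the two
# refine each other; both interfaces below are illustrative placeholders.
def physics_guided_denoise(model, x_t, timesteps, simulator):
    x = x_t
    for t in timesteps:
        x = model.denoise(x, t)       # kinematic denoising step
        x = simulator.track(x)        # physics-based tracking as guidance
    return x                          # physically plausible reconstruction
```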
arXiv Detail & Related papers (2023-08-19T05:28:03Z)
- PhysDiff: Physics-Guided Human Motion Diffusion Model [101.1823574561535]
Existing motion diffusion models largely disregard the laws of physics in the diffusion process.
PhysDiff incorporates physical constraints into the diffusion process.
Our approach achieves state-of-the-art motion quality and improves physical plausibility drastically.
arXiv Detail & Related papers (2022-12-05T18:59:52Z)
- Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture [12.631678059354593]
We exploit a high-precision, non-differentiable physics simulator to incorporate dynamical constraints into motion capture. Our key idea is to use real physical supervision to train a target pose distribution prior for sampling-based motion control.
Results show that we can obtain physically plausible human motion with complex terrain interactions, human shape variations, and diverse behaviors.
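The sampling-based control loop might look roughly like the following; the prior and simulator interfaces are assumptions for illustration, since the real simulator is non-differentiable and is queried rather than backpropagated through.

```python
import numpy as np

def control_step(prior, sim, reference_pose, n_samples=64):
    """Draw candidate target poses from the learned prior, roll each out in
    the (non-differentiable) simulator, and keep the best candidate."""
    candidates = prior.sample(reference_pose, n_samples)   # learned pose distribution
    costs = [np.sum((sim.rollout(pose) - reference_pose) ** 2)
             for pose in candidates]                       # physical tracking cost
    return candidates[int(np.argmin(costs))]
```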
arXiv Detail & Related papers (2022-03-26T12:48:41Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict approximately accurate 2D and 3D kinematic poses from video, but these poses still contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
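A toy version of such physics-aware refinement, assuming hypothetical foot-joint indices and a flat ground plane at y = 0, could optimize the initial estimate against contact and smoothness terms:

```python
import torch

def refine_motion(initial_motion, contact_frames, foot_ids=(10, 11),
                  steps=200, lr=1e-2):
    """initial_motion: (T, J, 3) kinematic estimate. `foot_ids` and the
    flat ground at y = 0 are illustrative assumptions."""
    motion = initial_motion.clone().requires_grad_(True)
    opt = torch.optim.Adam([motion], lr=lr)
    for _ in range(steps):
        data_term = ((motion - initial_motion) ** 2).mean()    # stay near the estimate
        feet_y = motion[contact_frames][:, list(foot_ids), 1]  # foot height on contact frames
        contact_term = (feet_y ** 2).mean()                    # planted feet stay on the ground
        accel = motion[2:] - 2 * motion[1:-1] + motion[:-2]    # finite-difference acceleration
        smooth_term = (accel ** 2).mean()                      # dynamics-style smoothness
        loss = data_term + contact_term + 0.1 * smooth_term
        opt.zero_grad(); loss.backward(); opt.step()
    return motion.detach()
```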
arXiv Detail & Related papers (2020-07-22T21:09:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site. This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.