Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
- URL: http://arxiv.org/abs/2503.11801v1
- Date: Fri, 14 Mar 2025 18:42:29 GMT
- Title: Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control
- Authors: Xiaoyu Huang, Takara Truong, Yunbo Zhang, Fangzhou Yu, Jean Pierre Sleiman, Jessica Hodgins, Koushil Sreenath, Farbod Farshidian
- Abstract summary: We present Diffuse-CLoC, a guided diffusion framework for physics-based look-ahead control. It enables intuitive, steerable, and physically realistic motion generation.
- Score: 16.319698848279966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Diffuse-CLoC, a guided diffusion framework for physics-based look-ahead control that enables intuitive, steerable, and physically realistic motion generation. While existing kinematic motion generation methods based on diffusion models offer intuitive steering through inference-time conditioning, they often fail to produce physically viable motions. In contrast, recent diffusion-based control policies have shown promise in generating physically realizable motion sequences, but their lack of kinematic prediction limits their steerability. Diffuse-CLoC addresses these challenges through a key insight: modeling the joint distribution of states and actions within a single diffusion model makes action generation steerable by conditioning it on the predicted states. This approach allows us to leverage established conditioning techniques from kinematic motion generation while producing physically realistic motions. As a result, we achieve planning capabilities without the need for a high-level planner. Our method handles a diverse set of unseen long-horizon downstream tasks through a single pre-trained model, including static and dynamic obstacle avoidance, motion in-betweening, and task-space control. Experimental results show that our method significantly outperforms the traditional hierarchical framework of high-level motion diffusion and low-level tracking.
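The key insight lends itself to a compact illustration. Below is a minimal, hypothetical sketch (not the authors' code) of guided diffusion over a joint state-action trajectory: a stand-in denoiser predicts the clean trajectory from a noisy one, a task cost is evaluated only on the predicted states, and the resulting gradient steers the reverse-diffusion mean, so the co-generated actions stay consistent with the steered states. The network, noise schedule, dimensions, and `goal_cost` are all assumed placeholders, not values from the paper.

```python
# Hypothetical sketch of state-guided sampling over a joint state-action
# trajectory (the core idea of Diffuse-CLoC); all names and numbers are assumptions.
import torch

HORIZON, STATE_DIM, ACTION_DIM = 32, 16, 8
T = 50                                             # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class JointDenoiser(torch.nn.Module):
    """Stand-in for a pretrained network that predicts the clean
    state-action trajectory x0 from a noisy trajectory x_t."""
    def __init__(self):
        super().__init__()
        d = STATE_DIM + ACTION_DIM
        self.net = torch.nn.Sequential(
            torch.nn.Linear(d + 1, 128), torch.nn.SiLU(), torch.nn.Linear(128, d))

    def forward(self, x_t, t):
        t_feat = torch.full((*x_t.shape[:-1], 1), float(t) / T)
        return self.net(torch.cat([x_t, t_feat], dim=-1))

def goal_cost(states, goal):
    """Hypothetical task cost defined on predicted STATES only,
    e.g. reach a target root position at the end of the look-ahead window."""
    return ((states[-1, :2] - goal) ** 2).sum()

@torch.no_grad()
def guided_sample(model, goal, guidance_scale=1.0):
    x = torch.randn(HORIZON, STATE_DIM + ACTION_DIM)           # x_T: pure noise
    for t in reversed(range(T)):
        with torch.enable_grad():
            x_in = x.detach().requires_grad_(True)
            x0_hat = model(x_in, t)                            # predicted clean trajectory
            cost = goal_cost(x0_hat[:, :STATE_DIM], goal)      # cost reads states only ...
            grad = torch.autograd.grad(cost, x_in)[0]          # ... its gradient steers all of x
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else torch.tensor(1.0)
        # Standard DDPM posterior mean from the x0 prediction, nudged by the cost gradient.
        mean = (ab_prev.sqrt() * betas[t] / (1 - ab_t)) * x0_hat \
             + (alphas[t].sqrt() * (1 - ab_prev) / (1 - ab_t)) * x
        mean = mean - guidance_scale * grad
        sigma = ((1 - ab_prev) / (1 - ab_t) * betas[t]).sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + sigma * noise
    states, actions = x[:, :STATE_DIM], x[:, STATE_DIM:]
    return states, actions                                     # actions would go to the physics simulator

states, actions = guided_sample(JointDenoiser(), goal=torch.tensor([1.0, 0.0]))
print(states.shape, actions.shape)                             # (32, 16) (32, 8)
```

Because states and actions are denoised jointly, any conditioning technique that operates on the kinematic prediction (obstacle costs, in-painted keyframes, task-space targets) indirectly shapes the action channels as well, which is what makes the actions steerable without a separate high-level planner.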
Related papers
- PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning [38.004463823796286]
We propose PRIMAL, an autoregressive diffusion model that is learned with a two-stage paradigm.
In the pretraining stage, the model learns motion dynamics from a large number of sub-second motion segments.
In the adaptation phase, we employ a ControlNet-like adaptor to fine-tune the motor control for semantic action generation and spatial target reaching.
arXiv Detail & Related papers (2025-03-21T21:27:57Z) - Controllable Motion Generation via Diffusion Modal Coupling [14.004287903552534]
We propose a novel framework that enhances controllability in diffusion models by leveraging multi-modal prior distributions.
We evaluate our approach on a motion prediction dataset and on multi-task control in Maze2D environments.
arXiv Detail & Related papers (2025-03-04T07:22:34Z) - Motion-Aware Generative Frame Interpolation [23.380470636851022]
Flow-based frame interpolation methods ensure motion stability through estimated intermediate flow but often introduce severe artifacts in complex motion regions.
Recent generative approaches, boosted by large-scale pre-trained video generation models, show promise in handling intricate scenes.
We propose Motion-aware Generative frame interpolation (MoG), which combines intermediate flow guidance with generative capacity to enhance fidelity.
arXiv Detail & Related papers (2025-01-07T11:03:43Z) - A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions [56.709280823844374]
We introduce a mask-based motion correction module (MCM) that leverages motion context and a video mask to repair flawed motions. We also propose a physics-based motion transfer module (PTM), which employs a pretrain-and-adapt approach for motion imitation. Our approach is designed as a plug-and-play module to physically refine video motion capture results, including high-difficulty in-the-wild motions.
arXiv Detail & Related papers (2024-12-23T08:26:00Z) - Mojito: Motion Trajectory and Intensity Control for Video Generation [79.85687620761186]
This paper introduces Mojito, a diffusion model that incorporates both motion trajectory and intensity control for text-to-video generation. Experiments demonstrate Mojito's effectiveness in achieving precise trajectory and intensity control with high computational efficiency.
arXiv Detail & Related papers (2024-12-12T05:26:43Z) - Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation [10.5019872575418]
We propose Motion-Zero, a novel zero-shot moving-object trajectory control framework that adds bounding-box-trajectory control to text-to-video diffusion models. Our method can be flexibly applied to various state-of-the-art video diffusion models without any training.
arXiv Detail & Related papers (2024-01-18T17:22:37Z) - TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models [75.20168902300166]
We propose TrackDiffusion, a novel video generation framework affording fine-grained trajectory-conditioned motion control.
A pivotal component of TrackDiffusion is the instance enhancer, which explicitly ensures inter-frame consistency of multiple objects.
The video sequences generated by TrackDiffusion can be used as training data for visual perception models.
arXiv Detail & Related papers (2023-12-01T15:24:38Z) - Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
arXiv Detail & Related papers (2023-08-29T07:15:57Z) - Interactive Character Control with Auto-Regressive Motion Diffusion Models [18.727066177880708]
We propose A-MDM (Auto-regressive Motion Diffusion Model) for real-time motion synthesis.
Our conditional diffusion model takes an initial pose as input and auto-regressively generates successive motion frames, each conditioned on the previous frame.
We introduce a suite of techniques for incorporating interactive controls into A-MDM, such as task-oriented sampling, in-painting, and hierarchical reinforcement learning (a minimal sketch of this auto-regressive rollout appears after this list).
arXiv Detail & Related papers (2023-06-01T07:48:34Z) - Executing your Commands via Motion Diffusion in Latent Space [51.64652463205012]
We propose a Motion Latent-based Diffusion model (MLD) to produce vivid motion sequences conforming to the given conditional inputs.
Our MLD achieves significant improvements over state-of-the-art methods across a wide range of human motion generation tasks.
arXiv Detail & Related papers (2022-12-08T03:07:00Z) - PhysDiff: Physics-Guided Human Motion Diffusion Model [101.1823574561535]
Existing motion diffusion models largely disregard the laws of physics in the diffusion process.
PhysDiff incorporates physical constraints into the diffusion process.
Our approach achieves state-of-the-art motion quality and improves physical plausibility drastically.
arXiv Detail & Related papers (2022-12-05T18:59:52Z) - Learning to Jump from Pixels [23.17535989519855]
We present Depth-based Impulse Control (DIC), a method for synthesizing highly agile visually-guided behaviors.
DIC affords the flexibility of model-free learning but regularizes behavior through explicit model-based optimization of ground reaction forces.
We evaluate the proposed method both in simulation and in the real world.
arXiv Detail & Related papers (2021-10-28T17:53:06Z)
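For the auto-regressive alternative described in the A-MDM entry above, the following minimal sketch contrasts per-frame generation with the trajectory-level approach of Diffuse-CLoC: each frame is produced by running a short diffusion conditioned on the previous pose, and in-painting-style control is obtained by overwriting chosen channels of the x0 prediction at every denoising step. Everything here (network, schedule, dimensions, the pinned root-xy channels) is an assumed placeholder, not code from the paper.

```python
# Hypothetical sketch of an auto-regressive per-frame diffusion rollout with
# in-painting-style control; all names and numbers are assumptions.
import torch

POSE_DIM, T = 32, 20
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

class FrameDenoiser(torch.nn.Module):
    """Stand-in for a conditional denoiser that predicts the clean next pose
    from a noisy next pose and the previous pose."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * POSE_DIM + 1, 128), torch.nn.SiLU(),
            torch.nn.Linear(128, POSE_DIM))

    def forward(self, x_t, prev_pose, t):
        t_feat = torch.tensor([float(t) / T])
        return self.net(torch.cat([x_t, prev_pose, t_feat]))

@torch.no_grad()
def rollout(model, init_pose, n_frames, target_xy=None):
    poses, prev = [init_pose], init_pose
    for _ in range(n_frames):
        x = torch.randn(POSE_DIM)                       # start each frame from noise
        for t in reversed(range(T)):
            x0_hat = model(x, prev, t)
            if target_xy is not None:                   # in-painting-style control:
                x0_hat[:2] = target_xy                  # pin the (assumed) root-xy channels
            ab_t = alpha_bars[t]
            ab_prev = alpha_bars[t - 1] if t > 0 else torch.tensor(1.0)
            eps = (x - ab_t.sqrt() * x0_hat) / (1 - ab_t).sqrt()       # implied noise
            x = ab_prev.sqrt() * x0_hat + (1 - ab_prev).sqrt() * eps   # DDIM step (eta = 0)
        prev = x
        poses.append(prev)
    return torch.stack(poses)

motion = rollout(FrameDenoiser(), init_pose=torch.zeros(POSE_DIM),
                 n_frames=8, target_xy=torch.tensor([0.5, 0.0]))
print(motion.shape)                                     # (9, 32)
```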