Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
- URL: http://arxiv.org/abs/2508.02106v1
- Date: Mon, 04 Aug 2025 06:35:48 GMT
- Title: Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis
- Authors: Kaiyang Ji, Ye Shi, Zichen Jin, Kangyi Chen, Lan Xu, Yuexin Ma, Jingyi Yu, Jingya Wang
- Abstract summary: Human-X is a novel framework designed to enable immersive and physically plausible human interactions across diverse entities. Our method jointly predicts actions and reactions in real time using an auto-regressive reaction diffusion planner. Our framework is validated in real-world applications, including a virtual reality interface for human-robot interaction.
- Score: 51.95817740348585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time synthesis of physically plausible human interactions remains a critical challenge for immersive VR/AR systems and humanoid robotics. While existing methods demonstrate progress in kinematic motion generation, they often fail to address the fundamental tension between real-time responsiveness, physical feasibility, and safety requirements in dynamic human-machine interactions. We introduce Human-X, a novel framework designed to enable immersive and physically plausible human interactions across diverse entities, including human-avatar, human-humanoid, and human-robot systems. Unlike existing approaches that focus on post-hoc alignment or simplified physics, our method jointly predicts actions and reactions in real time using an auto-regressive reaction diffusion planner, ensuring seamless synchronization and context-aware responses. To enhance physical realism and safety, we integrate an actor-aware motion tracking policy trained with reinforcement learning, which dynamically adapts to interaction partners' movements while avoiding artifacts like foot sliding and penetration. Extensive experiments on the Inter-X and InterHuman datasets demonstrate significant improvements in motion quality, interaction continuity, and physical plausibility over state-of-the-art methods. Our framework is validated in real-world applications, including a virtual reality interface for human-robot interaction, showcasing its potential for advancing human-robot collaboration.
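As a rough illustration of the planner-plus-tracker split the abstract describes, the sketch below shows how an auto-regressive reaction planner and a physics-tracking policy could interleave each frame. Every class name, shape, and update rule here is a placeholder assumption, not the authors' implementation.

```python
# Hypothetical per-frame loop for a planner-plus-tracker system like the one
# described above; all names and update rules are illustrative placeholders.
import numpy as np

class ReactionDiffusionPlanner:
    """Stand-in for the auto-regressive reaction diffusion planner: given the
    partner's motion history and our own, propose the next kinematic pose."""
    def plan_next(self, partner_hist: np.ndarray, self_hist: np.ndarray) -> np.ndarray:
        # Real system: iterative denoising conditioned on both histories.
        # Placeholder: continue our last pose, nudged toward the partner.
        return self_hist[-1] + 0.1 * (partner_hist[-1] - self_hist[-1])

class ActorAwareTrackingPolicy:
    """Stand-in for the RL tracking policy: turns a kinematic target into a
    physically tracked pose (in the paper, via torques in a simulator)."""
    def track(self, target: np.ndarray, sim_state: np.ndarray) -> np.ndarray:
        return sim_state + np.clip(target - sim_state, -0.05, 0.05)

def interaction_loop(partner_poses, planner, policy, init_pose):
    """Closed loop: plan a kinematic reaction, then physically track it."""
    partner_hist, self_hist = [], [init_pose]
    sim_state = init_pose
    for partner_pose in partner_poses:               # streamed partner motion
        partner_hist.append(partner_pose)
        target = planner.plan_next(np.asarray(partner_hist),
                                   np.asarray(self_hist))
        sim_state = policy.track(target, sim_state)  # physics-corrected pose
        self_hist.append(sim_state)
    return np.stack(self_hist)

# Toy usage with zeroed poses, just to show the call pattern:
poses = interaction_loop(np.zeros((10, 69)), ReactionDiffusionPlanner(),
                         ActorAwareTrackingPolicy(), np.zeros(69))
```

The key design point the abstract stresses is that planning and tracking run jointly per frame, rather than generating a full kinematic clip and physically correcting it afterwards.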
Related papers
- Half-Physics: Enabling Kinematic 3D Human Model with Physical Interactions [88.01918532202716]
We introduce a novel approach that embeds SMPL-X into a tangible entity capable of dynamic physical interactions with its surroundings.
Our approach maintains kinematic control over inherent SMPL-X poses while ensuring physically plausible interactions with scenes and objects.
Unlike reinforcement learning-based methods, which demand extensive and complex training, our half-physics method is learning-free and generalizes to any body shape and motion.
arXiv Detail & Related papers (2025-07-31T17:58:33Z)
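The "half-physics" idea summarized in this entry, kinematic SMPL-X control combined with learning-free physical interaction, can be pictured as PD tracking inside a physics simulator. The sketch below is speculative; the `sim` interface and the gains are invented for exposition.

```python
# Speculative illustration of a learning-free "half-physics" step: PD-track
# the kinematic SMPL-X pose inside a physics simulator and let the simulator's
# contact handling supply physical plausibility. The `sim` interface and the
# gains below are hypothetical, not the paper's code.
import numpy as np

KP, KD = 200.0, 20.0   # assumed PD gains

def pd_torques(q, qdot, q_target):
    """Learning-free PD control toward the kinematic target pose."""
    return KP * (q_target - q) - KD * qdot

def step_half_physics(sim, q_target):
    """One control step: the SMPL-X pose stays the control target, while
    scene/object penetration is prevented by the simulator's contact solver
    rather than by a learned policy."""
    q, qdot = sim.joint_state()                # hypothetical simulator API
    sim.apply_torques(pd_torques(q, qdot, q_target))
    sim.advance()                              # contacts resolved here
```

Because nothing in this loop is trained, it would apply to any body shape or motion, which is the property the summary highlights.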
- OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains [66.62502882481373]
Current methods tend to focus either on the body or the hands, which limits their ability to produce cohesive and realistic interactions.
We propose OOD-HOI, a text-driven framework for generating whole-body human-object interactions that generalize well to new objects and actions.
Our approach integrates a dual-branch reciprocal diffusion model to synthesize initial interaction poses, a contact-guided interaction refiner to improve physical accuracy based on predicted contact areas, and a dynamic adaptation mechanism which includes semantic adjustment and geometry deformation to improve robustness.
arXiv Detail & Related papers (2024-11-27T10:13:35Z)
- EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning [10.266351600604612]
This paper introduces EMOTION, a framework for generating expressive motion sequences for humanoid robots.
We conduct online user studies comparing the naturalness and understandability of the motions generated by EMOTION and its human-feedback version, EMOTION++.
arXiv Detail & Related papers (2024-10-30T17:22:45Z)
- Hierarchical Procedural Framework for Low-latency Robot-Assisted Hand-Object Interaction [45.256762954338704]
We propose a hierarchical procedural framework to enable robot-assisted hand-object interaction (HOI).
An open-loop hierarchy leverages RGB-based 3D reconstruction of the human hand, based on which motion primitives are designed to translate hand motions into robotic actions.
A case study of ring-wearing tasks indicates the potential application of this work in assistive technologies such as healthcare and manufacturing.
arXiv Detail & Related papers (2024-05-29T21:20:16Z)
- PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation [19.507619255773125]
We propose a Forward Dynamics Guided 4D Imitation method to generate physically plausible human-like reactions.
The learned policy generates physically plausible and human-like reactions in real time, significantly improving the speed (33×) and quality of reactions.
arXiv Detail & Related papers (2024-04-01T12:21:56Z)
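A hedged sketch of what "forward dynamics guided" imitation could look like: policy actions are rolled through a differentiable forward-dynamics model, and the predicted states are matched to the reference reaction. The networks, sizes, and state layout below are assumptions, not the paper's implementation.

```python
# Illustrative forward-dynamics-guided imitation loss; all dimensions and
# architectures are assumed for the sketch.
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM = 69, 23   # assumed humanoid state/action sizes

policy = nn.Sequential(nn.Linear(2 * STATE_DIM, 256), nn.ReLU(),
                       nn.Linear(256, ACT_DIM))          # reacts to the actor
dynamics = nn.Sequential(nn.Linear(STATE_DIM + ACT_DIM, 256), nn.ReLU(),
                         nn.Linear(256, STATE_DIM))      # learned f(s, a) -> s'

def imitation_loss(s0, actor_states, ref_reaction):
    """Roll the policy through the dynamics model; match the reference motion."""
    s, loss = s0, 0.0
    for actor_s, ref_s in zip(actor_states, ref_reaction):
        a = policy(torch.cat([s, actor_s], dim=-1))      # choose an action
        s = dynamics(torch.cat([s, a], dim=-1))          # predicted next state
        loss = loss + ((s - ref_s) ** 2).mean()          # imitation term
    return loss / len(actor_states)
```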
- ReGenNet: Towards Human Action-Reaction Synthesis [87.57721371471536]
We analyze the asymmetric, dynamic, synchronous, and detailed nature of human-human interactions.
We propose the first multi-setting human action-reaction benchmark to generate human reactions conditioned on given human actions.
arXiv Detail & Related papers (2024-03-18T15:33:06Z)
- Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction [9.806227900768926]
We propose to model social motion forecasting in a shared human-robot representation space.
ECHO operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios.
We evaluate our model on multi-person and human-robot motion forecasting tasks and outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2024-02-07T11:37:14Z)
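One plausible reading of the shared human-robot representation space in this entry: embodiment-specific encoders and decoders wrapped around a single sequence forecaster, as sketched below. The dimensions and GRU backbone are illustrative choices, not details from the paper.

```python
# Illustrative shared-space forecaster: one sequence model serves both the
# human and the robot embodiment. All sizes are assumptions.
import torch
import torch.nn as nn

class SharedSpaceForecaster(nn.Module):
    def __init__(self, human_dim=63, robot_dim=14, latent=128):  # assumed dims
        super().__init__()
        self.enc = nn.ModuleDict({"human": nn.Linear(human_dim, latent),
                                  "robot": nn.Linear(robot_dim, latent)})
        self.core = nn.GRU(latent, latent, batch_first=True)  # shared forecaster
        self.dec = nn.ModuleDict({"human": nn.Linear(latent, human_dim),
                                  "robot": nn.Linear(latent, robot_dim)})

    def forward(self, poses, kind):               # poses: (B, T, dim)
        z, _ = self.core(self.enc[kind](poses))   # forecast in shared space
        return self.dec[kind](z[:, -1:])          # next-step pose, (B, 1, dim)
```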
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising-diffusion-based model that synthesizes the full-body motion of one person in a two-person interaction scenario.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset for two-person interactions, containing full-body and finger motions.
arXiv Detail & Related papers (2023-11-28T18:59:52Z)
- CG-HOI: Contact-Guided 3D Human-Object Interaction Generation [29.3564427724612]
We propose CG-HOI, the first method to generate dynamic 3D human-object interactions (HOIs) from text.
We model the motion of both human and object in an interdependent fashion, as semantically rich human motion rarely happens in isolation.
We show that our joint contact-based human-object interaction approach generates realistic and physically plausible sequences.
arXiv Detail & Related papers (2023-11-27T18:59:10Z)
- InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce InterControl, a novel controllable motion generation method that encourages synthesized motions to maintain the desired distance between joint pairs.
We demonstrate that the joint-pair distances needed for human interactions can be generated using an off-the-shelf Large Language Model.
arXiv Detail & Related papers (2023-11-27T14:32:33Z)
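Joint-pair distance control of this kind is commonly realized as loss-guided diffusion sampling; the sketch below illustrates that general recipe under assumed shapes and a placeholder denoiser, and should not be read as InterControl's actual algorithm.

```python
# General loss-guided diffusion recipe for a joint-pair distance constraint.
# `denoiser` is a placeholder (any model predicting the clean motion x0 from
# the noisy sample x_t); shapes and the guidance scale are assumptions.
import torch

def joint_distance_loss(motion, i, j, target_dist):
    """motion: (T, J, 3). Penalize deviation from the desired pair distance."""
    d = (motion[:, i] - motion[:, j]).norm(dim=-1)          # (T,)
    return ((d - target_dist) ** 2).mean()

def guided_step(denoiser, x_t, t, i, j, target_dist, scale=1.0):
    """One simplified sampling step: nudge the noisy motion along the gradient
    that pulls joints i and j toward target_dist (a value that, per the entry
    above, could be produced by an off-the-shelf LLM)."""
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, t)                               # clean estimate
    grad = torch.autograd.grad(
        joint_distance_loss(x0_hat, i, j, target_dist), x_t)[0]
    return (x0_hat - scale * grad).detach()                 # guided update
```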
- Scene-aware Generative Network for Human Motion Synthesis [125.21079898942347]
We propose a new framework that takes the interaction between the scene and human motion into account.
Considering the uncertainty of human motion, we formulate this task as a generative one.
We derive a GAN-based learning approach, with discriminators enforcing compatibility between the human motion and the contextual scene.
arXiv Detail & Related papers (2021-05-31T09:05:50Z)
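The discriminator-enforced compatibility described above can be sketched as a conditional GAN objective over (scene, motion) pairs; the architecture and feature sizes below are assumptions for illustration.

```python
# Conditional-GAN sketch of motion/scene compatibility; the MLP discriminator
# and feature sizes are assumed, not taken from the paper.
import torch
import torch.nn as nn

class CompatibilityDiscriminator(nn.Module):
    """Scores whether a motion frame is plausible in the given scene."""
    def __init__(self, scene_dim=256, motion_dim=63, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(scene_dim + motion_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, 1))  # logit

    def forward(self, scene_feat, motion):
        return self.net(torch.cat([scene_feat, motion], dim=-1))

bce = nn.BCEWithLogitsLoss()

def d_loss(D, scene, real_motion, fake_motion):
    """Discriminator separates real from generated (scene, motion) pairs."""
    real, fake = D(scene, real_motion), D(scene, fake_motion.detach())
    return bce(real, torch.ones_like(real)) + bce(fake, torch.zeros_like(fake))

def g_loss(D, scene, fake_motion):
    """Generator is rewarded when its motion looks compatible with the scene."""
    fake = D(scene, fake_motion)
    return bce(fake, torch.ones_like(fake))
```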