Multi-Person Interaction Generation from Two-Person Motion Priors
- URL: http://arxiv.org/abs/2505.17860v2
- Date: Sat, 26 Jul 2025 22:01:05 GMT
- Title: Multi-Person Interaction Generation from Two-Person Motion Priors
- Authors: Wenning Xu, Shiyu Fan, Paul Henderson, Edmond S. L. Ho
- Abstract summary: Graph-driven Interaction Sampling is a method that can generate realistic and diverse multi-person interactions. We decompose the generation task into simultaneous single-person motion generation, with each person conditioned on another person's motion. Our approach consistently outperforms existing methods in reducing artifacts when generating a wide range of two-person and multi-person interactions.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generating realistic human motion with high-level controls is a crucial task for social understanding, robotics, and animation. With high-quality MOCAP data becoming more available recently, a wide range of data-driven approaches have been presented. However, modelling multi-person interactions remains a less explored area. In this paper, we present Graph-driven Interaction Sampling, a method that can generate realistic and diverse multi-person interactions by leveraging existing two-person motion diffusion models as motion priors. Instead of training a new model specific to multi-person interaction synthesis, our key insight is to spatially and temporally separate complex multi-person interactions into a graph structure of two-person interactions, which we name the Pairwise Interaction Graph. We thus decompose the generation task into simultaneous single-person motion generation, with each person conditioned on another person's motion. In addition, to reduce artifacts such as interpenetration of body parts in generated multi-person interactions, we introduce two graph-dependent guidance terms into the diffusion sampling scheme. Unlike previous work, our method can produce varied high-quality multi-person interactions without repetitive individual motions. Extensive experiments demonstrate that our approach consistently outperforms existing methods in reducing artifacts when generating a wide range of two-person and multi-person interactions.
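To make the decomposition concrete, here is a minimal sketch (assuming NumPy) of how a Pairwise Interaction Graph could drive sampling: every person is denoised simultaneously, each conditioned on a graph neighbour's current motion, and a guidance step pushes bodies apart wherever they overlap. All names (`PairwiseInteractionGraph`, `denoise_one`, `anti_penetration_step`) are hypothetical stand-ins, not the paper's actual model, motion representation, or guidance terms.

```python
import numpy as np

class PairwiseInteractionGraph:
    """Persons are nodes; each edge marks a pair whose interaction is
    handled by a pretrained two-person motion prior (hypothetical API)."""
    def __init__(self, num_persons, edges):
        self.num_persons = num_persons
        self.edges = [tuple(e) for e in edges]       # e.g. [(0, 1), (1, 2)]

    def neighbors(self, i):
        return [b if a == i else a for a, b in self.edges if i in (a, b)]

def anti_penetration_step(x, margin=0.3, strength=0.05):
    """Toy stand-in for one graph-dependent guidance term: push root
    trajectories apart wherever two people are closer than `margin`."""
    push = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[0]):
            if i == j:
                continue
            diff = x[i] - x[j]                                # (frames, 3)
            dist = np.linalg.norm(diff, axis=-1, keepdims=True) + 1e-8
            push[i] += strength * (dist < margin) * diff / dist
    return x + push

def sample_multi_person(graph, denoise_one, frames=60, steps=50, seed=0):
    """Simultaneous single-person denoising: at every step, each person is
    denoised conditioned on its graph neighbours' current motions, then the
    guidance step nudges the joint sample away from collisions."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((graph.num_persons, frames, 3))
    for t in reversed(range(steps)):
        x_next = np.zeros_like(x)
        for i in range(graph.num_persons):
            nbrs = graph.neighbors(i) or [i]   # isolated node: self-condition
            preds = [denoise_one(x[i], x[j], t) for j in nbrs]
            x_next[i] = np.mean(preds, axis=0) # average over pairwise priors
        x = anti_penetration_step(x_next)
    return x

# Usage with a dummy denoiser that just pulls noise toward the partner:
dummy = lambda xi, xj, t: 0.9 * xi + 0.1 * xj
graph = PairwiseInteractionGraph(3, edges=[(0, 1), (1, 2)])
motions = sample_multi_person(graph, dummy)      # shape (3, 60, 3)
```

In the paper, each edge would invoke a real two-person diffusion prior and the guidance terms act on full body geometry; the dummy averaging above only shows where those pieces plug in.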
Related papers
- SyncDiff: Synchronized Motion Diffusion for Multi-Body Human-Object Interaction Synthesis [22.14972920585117]
We introduce SyncDiff, a novel method for multi-body interaction synthesis using a synchronized motion diffusion strategy. To enhance motion fidelity, we propose a frequency-domain motion decomposition scheme (a toy frequency-split sketch appears after this list). We also introduce a new set of alignment scores to emphasize the synchronization of different body motions.
arXiv Detail & Related papers (2024-12-28T10:12:12Z) - InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions [67.37790144477503]
We propose InterDance, a large-scale duet dance dataset that significantly enhances motion quality, data scale, and the variety of dance genres. We introduce a diffusion-based framework with an interaction refinement guidance strategy to optimize the realism of interactions progressively.
arXiv Detail & Related papers (2024-12-22T11:53:51Z) - Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer [24.166147954731652]
Multi-person interactive motion generation is a critical yet under-explored domain in computer character animation. Current research often employs separate module branches for individual motions, leading to a loss of interaction information. We propose a novel, unified approach that models multi-person motions and their interactions within a single latent space.
arXiv Detail & Related papers (2024-12-21T15:35:50Z) - in2IN: Leveraging individual Information to Generate Human INteractions [29.495166514135295]
We introduce in2IN, a novel diffusion model for human-human motion generation conditioned on individual descriptions.
We also propose DualMDM, a model composition technique that combines the motions generated with in2IN and the motions generated by a single-person motion prior pre-trained on HumanML3D.
arXiv Detail & Related papers (2024-04-15T17:59:04Z) - ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising diffusion based model that synthesizes the full-body motion of a person in a two-person interaction scenario.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset for two-person interactions containing full-body and finger motions.
arXiv Detail & Related papers (2023-11-28T18:59:52Z) - InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce a novel controllable motion generation method, InterControl, to encourage synthesized motions to maintain the desired distance between joint pairs (a toy distance-guidance sketch appears after this list).
We demonstrate that the joint-pair distances needed for human interactions can be generated using an off-the-shelf Large Language Model.
arXiv Detail & Related papers (2023-11-27T14:32:33Z) - Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction [58.67761673662716]
Humans are highly adaptable, swiftly switching between different modes to handle different tasks, situations and contexts.
In Human-object interaction (HOI) activities, these modes can be attributed to two mechanisms: (1) the large-scale, consistent plan for the whole activity and (2) the small-scale, transient actions that start and end along the timeline.
This work proposes to model two concurrent mechanisms that jointly control human motion.
arXiv Detail & Related papers (2023-07-24T12:21:33Z) - InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions [49.097973114627344]
We present InterGen, an effective diffusion-based approach that incorporates human-to-human interactions into the motion diffusion process.
We first contribute a multimodal dataset, named InterHuman. It consists of about 107M frames for diverse two-person interactions, with accurate skeletal motions and 23,337 natural language descriptions.
We propose a novel representation for motion input in our interaction diffusion model, which explicitly formulates the global relations between the two performers in the world frame.
arXiv Detail & Related papers (2023-04-12T08:12:29Z) - Bipartite Graph Diffusion Model for Human Interaction Generation [11.732108478773196]
We introduce a novel bipartite graph diffusion method (BiGraphDiff) to generate human motion interactions between two persons.
We show that the proposed method achieves new state-of-the-art results on leading benchmarks for the human interaction generation task.
arXiv Detail & Related papers (2023-01-24T16:59:46Z) - Interaction Transformer for Human Reaction Generation [61.22481606720487]
We propose a novel interaction Transformer (InterFormer) consisting of a Transformer network with both temporal and spatial attentions.
Our method is general and can be used to generate more complex and long-term interactions.
arXiv Detail & Related papers (2022-07-04T19:30:41Z)
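As flagged in the SyncDiff entry above, a frequency-domain motion decomposition can be illustrated with a simple low-/high-frequency split. This is a toy sketch assuming NumPy and an rFFT along the time axis; `split_motion_frequency`, the cutoff, and the skeleton shape are illustrative, not SyncDiff's actual scheme.

```python
import numpy as np

def split_motion_frequency(motion, cutoff_hz=2.0, fps=30):
    """Split joint trajectories into a low-frequency component (overall
    movement) and a high-frequency residual (fine, fast detail).
    motion: (frames, joints, 3)."""
    spec = np.fft.rfft(motion, axis=0)                 # spectrum over time
    freqs = np.fft.rfftfreq(motion.shape[0], d=1.0 / fps)
    mask = (freqs <= cutoff_hz)[:, None, None]         # keep slow bands only
    low = np.fft.irfft(spec * mask, n=motion.shape[0], axis=0)
    return low, motion - low

motion = np.random.default_rng(0).standard_normal((120, 22, 3))
low, high = split_motion_frequency(motion)             # low + high == motion
```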
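Likewise, the InterControl entry above describes keeping joint pairs at a desired distance. Below is a minimal sketch of that idea as a guidance step, assuming a quadratic penalty on the distance between one joint of each person; the function name, skeleton layout, and joint indices are hypothetical, not InterControl's implementation. In practice such a step would run inside each diffusion denoising iteration.

```python
import numpy as np

def joint_distance_guidance(x, pairs, step=0.1):
    """Nudge two skeletons so selected joint pairs approach a target
    distance, via one gradient step on 0.5 * (dist - target)^2.
    x: (2, frames, joints, 3); pairs: [(joint_a, joint_b, target_metres)]."""
    x = x.copy()
    for ja, jb, target in pairs:
        diff = x[0, :, ja] - x[1, :, jb]                    # (frames, 3)
        dist = np.linalg.norm(diff, axis=-1, keepdims=True) + 1e-8
        grad = (dist - target) * diff / dist                # d(cost)/d(x[0])
        x[0, :, ja] -= step * grad
        x[1, :, jb] += step * grad
    return x

# e.g. pull the two right hands (hypothetical joint index 21) together:
x = np.random.default_rng(0).standard_normal((2, 60, 22, 3))
x = joint_distance_guidance(x, pairs=[(21, 21, 0.0)])
```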