Fugu-MT Paper Translation (Abstract): Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models

Paper abstract: Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models

arxiv url: http://arxiv.org/abs/2512.19692v1
Date: Mon, 22 Dec 2025 18:59:50 GMT
Status: Translation complete
System update date: 2025-12-23 18:54:32.899638
Title: Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models
Title (reference translation): Interact2Ar: Full-Body Human-Human Interaction Generation with Autoregressive Diffusion Models
Authors: Pablo Ruiz-Ponce, Sergio Escalera, José García-Rodríguez, Jiankang Deng, Rolandos Alexandros Potamias
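Abstract summary: We introduce Interact2Ar, a text-conditioned autoregressive diffusion model. Hand kinematics are incorporated through dedicated parallel branches, enabling high-fidelity full-body generation. Our model enables a series of downstream applications, including temporal motion composition, real-time adaptation to disturbances, and extension from dyadic to multi-person scenarios.
Reference score (attention score computed independently by this site): 80.28579390566298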
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Generating realistic human-human interactions is a challenging task that requires not only high-quality individual body and hand motions, but also coherent coordination among all interactants. Due to limitations in available data and increased learning complexity, previous methods tend to ignore hand motions, limiting the realism and expressivity of the interactions. Additionally, current diffusion-based approaches generate entire motion sequences simultaneously, limiting their ability to capture the reactive and adaptive nature of human interactions. To address these limitations, we introduce Interact2Ar, the first end-to-end text-conditioned autoregressive diffusion model for generating full-body, human-human interactions. Interact2Ar incorporates detailed hand kinematics through dedicated parallel branches, enabling high-fidelity full-body generation. Furthermore, we introduce an autoregressive pipeline coupled with a novel memory technique that facilitates adaptation to the inherent variability of human interactions using efficient large context windows. The adaptability of our model enables a series of downstream applications, including temporal motion composition, real-time adaptation to disturbances, and extension beyond dyadic to multi-person scenarios. To validate the generated motions, we introduce a set of robust evaluators and extended metrics designed specifically for assessing full-body interactions. Through quantitative and qualitative experiments, we demonstrate the state-of-the-art performance of Interact2Ar.
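Abstract (reference translation): Generating realistic human-human interactions is a difficult task that requires not only high-quality body and hand motions but also coherent coordination among all interactants. Because of limited available data and increased learning complexity, previous methods tend to ignore hand motions, which limits the realism and expressivity of the interactions. In addition, current diffusion-based approaches generate the entire motion sequence at once, which limits their ability to capture the reactive and adaptive nature of human interactions. To address these limitations, this paper introduces Interact2Ar, the first end-to-end text-conditioned autoregressive diffusion model for generating full-body human-human interactions. Interact2Ar incorporates detailed hand kinematics through dedicated parallel branches, enabling high-fidelity full-body generation. Furthermore, we propose an autoregressive pipeline combined with a novel memory technique that facilitates adaptation to the inherent variability of human interactions using efficient large context windows. The adaptability of the model enables a series of downstream applications, including temporal motion composition, real-time adaptation to disturbances, and extension from dyadic to multi-person scenarios. To validate the generated motions, we introduce robust evaluators and extended metrics designed specifically for assessing full-body interactions. Through quantitative and qualitative experiments, we demonstrate the state-of-the-art performance of Interact2Ar.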

Related papers