Fugu-MT Paper Translation (Abstract): Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents

Paper summary: Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents

arxiv url: http://arxiv.org/abs/2510.04637v1
Date: Mon, 06 Oct 2025 09:41:37 GMT
Status: Translation complete
Last updated in system: 2025-10-07 16:52:59.785922
Title: Social Agent: Mastering Dyadic Nonverbal Behavior Generation via Conversational LLM Agents
Title (reference translation): Social Agent: Mastering Nonverbal Behavior Generation with Conversational LLM Agents
Authors: Zeyi Zhang, Yanju Zhou, Heyuan Yao, Tenglong Ao, Xiaohang Zhan, Libin Liu
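Abstract summary: Social Agent is a novel framework for synthesizing realistic and contextually appropriate nonverbal behaviors in dyadic conversations. The work builds an agentic system driven by a Large Language Model (LLM) that directs the conversation flow and determines appropriate interactive behaviors for both participants. It also proposes a novel dual-person gesture generation model, based on an auto-regressive diffusion model, that synthesizes coordinated motions from speech signals.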
Reference score (independently computed attention score): 13.902411927285328
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present Social Agent, a novel framework for synthesizing realistic and contextually appropriate co-speech nonverbal behaviors in dyadic conversations. In this framework, we develop an agentic system driven by a Large Language Model (LLM) to direct the conversation flow and determine appropriate interactive behaviors for both participants. Additionally, we propose a novel dual-person gesture generation model based on an auto-regressive diffusion model, which synthesizes coordinated motions from speech signals. The output of the agentic system is translated into high-level guidance for the gesture generator, resulting in realistic movement at both the behavioral and motion levels. Furthermore, the agentic system periodically examines the movements of interlocutors and infers their intentions, forming a continuous feedback loop that enables dynamic and responsive interactions between the two participants. User studies and quantitative evaluations show that our model significantly improves the quality of dyadic interactions, producing natural, synchronized nonverbal behaviors.
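Abstract (reference translation): This paper proposes Social Agent, a novel framework for synthesizing realistic and contextually appropriate nonverbal behaviors in dyadic conversations. We build an agentic system driven by a Large Language Model (LLM) that directs the conversation flow and determines appropriate interactive behaviors for both participants. We further propose a novel dual-person gesture generation model, based on an auto-regressive diffusion model, that synthesizes coordinated motions from speech signals. The output of the agentic system is converted into high-level guidance for the gesture generator, yielding realistic movement at both the behavioral and motion levels. In addition, the agentic system periodically examines the interlocutors' movements and infers their intentions, forming a continuous feedback loop that enables dynamic and responsive interaction between the two participants. User studies and quantitative evaluations show that the model significantly improves the quality of dyadic interactions and produces natural, synchronized nonverbal behaviors.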
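
Illustrative sketch (not from the paper): the abstract describes a loop in which an agentic system issues high-level guidance, a gesture generator turns guidance plus speech into motion for both participants, and the agent periodically re-examines the motion to infer intentions. The Python below is only a toy rendering of that control flow under stated assumptions. Every name in it (`plan_behaviors`, `generate_motion`, `infer_intents`, `run_dialogue`, `review_every`) is hypothetical, and the string-based stubs stand in for the LLM and the auto-regressive diffusion model, whose actual interfaces the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Guidance:
    """High-level guidance for one participant (hypothetical structure)."""
    speaker: str
    behavior: str


def plan_behaviors(intents):
    """Stub for the LLM-driven agentic system: one guidance item per participant."""
    return [Guidance(s, f"respond to: {i}") for s, i in sorted(intents.items())]


def generate_motion(guidance, speech_chunk):
    """Stub for the dual-person gesture generator, conditioned on speech and guidance."""
    return {g.speaker: f"{g.behavior} | {speech_chunk}" for g in guidance}


def infer_intents(motions):
    """Stub for the agent examining each participant's movement."""
    return {speaker: f"seen({motion})" for speaker, motion in motions.items()}


def run_dialogue(speech_chunks, review_every=2):
    """Generate motion chunk by chunk, refreshing guidance every `review_every` steps."""
    intents = {"A": "unknown", "B": "unknown"}
    guidance = plan_behaviors(intents)
    history = []
    for step, chunk in enumerate(speech_chunks, start=1):
        motions = generate_motion(guidance, chunk)
        history.append(motions)
        if step % review_every == 0:
            # Feedback: observed motion updates intentions, which update guidance.
            intents = infer_intents(motions)
            guidance = plan_behaviors(intents)
    return history
```

Calling `run_dialogue(["hi", "how are you", "fine", "bye"])` returns one motion record per speech chunk for participants "A" and "B". Guidance stays fixed between reviews and changes only after a review step, which is the feedback behavior the abstract describes. The review interval of 2 is an arbitrary choice for illustration.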
