Fugu-MT Paper Translation (Summary): HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs

Paper summary: HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs

arxiv url: http://arxiv.org/abs/2601.19839v1
Date: Tue, 27 Jan 2026 17:45:04 GMT
Status: Translation complete
Last updated in system: 2026-01-28 15:26:51.424818
Title: HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs
Title (reference translation): HARMONI: Multimodal Personalization of Multi-User Human-Robot Interaction Using LLMs
Authors: Jeanne Malécot, Hamed Rahimi, Jeanne Cattoni, Marie Samson, Mouad Abrini, Mahdi Khoramshahi, Maribel Pino, Mohamed Chetouani
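Abstract summary: This paper proposes HARMONI, a multimodal personalization framework that enables socially assistive robots to manage long-term multi-user interactions. It consists of (i) a perception module that identifies active speakers and extracts multimodal input, (ii) a world modeling module that maintains a representation of the environment, (iii) a user modeling module that updates long-term speaker-specific profiles, and (iv) a generation module that produces contextually grounded and ethically informed responses.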
Reference score (independently computed attention level): 1.4755786263360526
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing human-robot interaction systems often lack mechanisms for sustained personalization and dynamic adaptation in multi-user environments, limiting their effectiveness in real-world deployments. We present HARMONI, a multimodal personalization framework that leverages large language models to enable socially assistive robots to manage long-term multi-user interactions. The framework integrates four key modules: (i) a perception module that identifies active speakers and extracts multimodal input; (ii) a world modeling module that maintains representations of the environment and short-term conversational context; (iii) a user modeling module that updates long-term speaker-specific profiles; and (iv) a generation module that produces contextually grounded and ethically informed responses. Through extensive evaluation and ablation studies on four datasets, as well as a real-world scenario-driven user-study in a nursing home environment, we demonstrate that HARMONI supports robust speaker identification, online memory updating, and ethically aligned personalization, outperforming baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction.
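Abstract (reference translation): Existing human-robot interaction systems lack mechanisms for sustained personalization and dynamic adaptation, which limits their effectiveness in real-world deployments. This paper describes HARMONI, a multimodal personalization framework that leverages large language models. The framework integrates four main modules: (i) a perception module that identifies active speakers and extracts multimodal input; (ii) a world modeling module that maintains representations of the environment and short-term conversational context; (iii) a user modeling module that updates long-term speaker-specific profiles; and (iv) a generation module that produces contextually grounded and ethically informed responses. Through extensive evaluation and ablation studies on four datasets, together with a real-world scenario-driven user study in a nursing home environment, we show that HARMONI achieves robust speaker identification, online memory updating, and ethically aligned personalization, and outperforms baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction.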
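
The four-module structure named in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: every class and function name here (`Observation`, `WorldModel`, `UserModel`, `generate`, `step`) is hypothetical, perception is assumed to have already produced a speaker id and a transcript, a toy string rule stands in for LLM-based profile extraction, and the ethical-alignment and multimodal parts are not modeled at all.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Assumed output of a perception step: who spoke and what they said."""
    speaker_id: str
    text: str


@dataclass
class WorldModel:
    """Short-term conversational context, kept as a bounded window."""
    context: deque = field(default_factory=lambda: deque(maxlen=5))

    def update(self, obs: Observation) -> None:
        self.context.append((obs.speaker_id, obs.text))


@dataclass
class UserModel:
    """Long-term, speaker-specific profiles keyed by speaker id."""
    profiles: dict = field(default_factory=dict)

    def update(self, obs: Observation) -> None:
        # Toy rule standing in for LLM-based profile extraction (assumption).
        marker = "i like "
        lowered = obs.text.lower()
        if marker in lowered:
            preference = lowered.split(marker, 1)[1].strip(" .!")
            self.profiles.setdefault(obs.speaker_id, []).append(preference)


def generate(obs: Observation, world: WorldModel, users: UserModel) -> str:
    """Placeholder for the generation module; a real system would prompt an LLM."""
    likes = users.profiles.get(obs.speaker_id, [])
    if likes:
        return f"{obs.speaker_id}, I remember you like {likes[-1]}."
    return f"Nice to hear from you, {obs.speaker_id}."


def step(obs: Observation, world: WorldModel, users: UserModel) -> str:
    """One interaction turn: update both memories, then respond."""
    world.update(obs)
    users.update(obs)
    return generate(obs, world, users)
```

Calling `step` once per utterance with shared `WorldModel` and `UserModel` instances keeps each speaker's profile separate while the short-term context window is shared across speakers, which is the multi-user separation the abstract describes at a high level.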

Related papers