Fugu-MT Paper Translation (Summary): LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction

Paper summary: LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction

arxiv url: http://arxiv.org/abs/2509.07403v1
Date: Tue, 09 Sep 2025 05:32:45 GMT
Status: Translation complete
Last updated in system: 2025-09-10 14:38:27.188554
Title: LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
Authors: Weichu Liu, Jing Xiong, Yuxuan Hu, Zixuan Li, Minghuan Tan, Ningning Mao, Chenyang Zhao, Zhongwei Wan, Chaofan Tao, Wendong Xu, Hui Shen, Chengming Li, Lingpeng Kong, Ngai Wong
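Abstract summary: LongEmotion is a benchmark designed specifically for long-context Emotional Intelligence (EI) tasks. It covers a diverse set of tasks, including Emotion Classification, Emotion Detection, Emotion QA, Emotion Conversation, Emotion Summary, and Emotion Expression. To enhance performance under realistic constraints, the paper incorporates Retrieval-Augmented Generation (RAG) and Collaborative Emotional Modeling (CoEM).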
Reference score (independently computed attention score): 72.19473883287948
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language models (LLMs) make significant progress in Emotional Intelligence (EI) and long-context understanding. However, existing benchmarks tend to overlook certain aspects of EI in long-context scenarios, especially under realistic, practical settings where interactions are lengthy, diverse, and often noisy. To move towards such realistic settings, we present LongEmotion, a benchmark specifically designed for long-context EI tasks. It covers a diverse set of tasks, including Emotion Classification, Emotion Detection, Emotion QA, Emotion Conversation, Emotion Summary, and Emotion Expression. On average, the input length for these tasks reaches 8,777 tokens, with long-form generation required for Emotion Expression. To enhance performance under realistic constraints, we incorporate Retrieval-Augmented Generation (RAG) and Collaborative Emotional Modeling (CoEM), and compare them with standard prompt-based methods. Unlike conventional approaches, our RAG method leverages both the conversation context and the large language model itself as retrieval sources, avoiding reliance on external knowledge bases. The CoEM method further improves performance by decomposing the task into five stages, integrating both retrieval augmentation and limited knowledge injection. Experimental results show that both RAG and CoEM consistently enhance EI-related performance across most long-context tasks, advancing LLMs toward more practical and real-world EI applications. Furthermore, we conducted a comparative case study experiment on the GPT series to demonstrate the differences among various models in terms of EI. Code is available on GitHub at https://github.com/LongEmotion/LongEmotion, and the project page can be found at https://longemotion.github.io/.
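The abstract says the RAG variant retrieves from the conversation context (and from the model itself) rather than from an external knowledge base. The following is a minimal sketch of the conversation-context side only, under stated assumptions: it swaps in plain bag-of-words cosine similarity as the retriever, and every name in it (`chunk_turns`, `retrieve`, `build_prompt`, the chunk size, the prompt layout) is hypothetical and not taken from the paper or its repository.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real tokenizer (assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chunk_turns(turns, turns_per_chunk=2):
    """Group consecutive conversation turns into fixed-size chunks."""
    return [
        " ".join(turns[i:i + turns_per_chunk])
        for i in range(0, len(turns), turns_per_chunk)
    ]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(turns, query, top_k=1, turns_per_chunk=2):
    """Return the top_k chunks of the conversation most similar to the query."""
    chunks = chunk_turns(turns, turns_per_chunk)
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(Counter(tokenize(chunk)), query_vec), index, chunk)
        for index, chunk in enumerate(chunks)
    ]
    # Highest score first; ties keep the original conversation order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for _, _, chunk in scored[:top_k]]


def build_prompt(turns, query, top_k=1):
    """Assemble a prompt from retrieved chunks (hypothetical layout)."""
    context = "\n".join(retrieve(turns, query, top_k))
    return f"Relevant context:\n{context}\n\nQuestion: {query}"
```

In a long interaction, the intent of such a step would be to pass the model only the few chunks that bear on the current question instead of the full context, which averages 8,777 tokens in this benchmark. A real system would likely use learned embeddings rather than word overlap.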

Related papers