Fugu-MT Paper Translation (Abstract): EmoBench: Evaluating the Emotional Intelligence of Large Language Models

Paper overview: EmoBench: Evaluating the Emotional Intelligence of Large Language Models

arxiv url: http://arxiv.org/abs/2402.12071v2
Date: Fri, 7 Jun 2024 07:43:45 GMT
Status: Translation complete
Last updated in system: 2024-06-10 19:57:35.454494
Title: EmoBench: Evaluating the Emotional Intelligence of Large Language Models
Authors: Sahand Sabour, Siyang Liu, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna S. Sunaryo, Juanzi Li, Tatia M. C. Lee, Rada Mihalcea, Minlie Huang
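Abstract summary: EmoBench is a benchmark that draws on established psychological theories and proposes a comprehensive definition of machine emotional intelligence (EI). It includes 400 hand-crafted questions in English and Chinese. The results reveal a considerable gap between the EI of existing large language models and that of the average human, highlighting a promising direction for future research.
Reference score (attention score computed by this site): 73.60839120040887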
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in Large Language Models (LLMs) have highlighted the need for robust, comprehensive, and challenging benchmarks. Yet, research on evaluating their Emotional Intelligence (EI) is considerably limited. Existing benchmarks have two major shortcomings: first, they mainly focus on emotion recognition, neglecting essential EI capabilities such as emotion regulation and thought facilitation through emotion understanding; second, they are primarily constructed from existing datasets, which include frequent patterns, explicit information, and annotation errors, leading to unreliable evaluation. We propose EmoBench, a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine EI, including Emotional Understanding and Emotional Application. EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding. Our findings reveal a considerable gap between the EI of existing LLMs and the average human, highlighting a promising direction for future research. Our code and data are publicly available at https://github.com/Sahandfer/EmoBench.

Related papers

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition [18.8101367995391]
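EmoNet-Face is a comprehensive benchmark suite for developing and evaluating AI systems. Its novel 40-category emotion taxonomy captures finer details of human emotional experience. It includes three large-scale AI-generated datasets with explicit, full facial expressions. EmpathicInsight-Face is a model that achieves human-level performance on our benchmark.
Paper reference translation (metadata) (2025-05-26T14:19:58Z)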
AI with Emotions: Exploring Emotional Expressions in Large Language Models [0.0]
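Large language models (LLMs) role-play as agents that answer questions in specified emotional states. Russell's Circumplex model characterizes emotions along the arousal (sleepy to activated) and valence (displeasure to pleasure) axes. The evaluation showed that the emotional states of the generated answers matched the specifications.
Paper reference translation (metadata) (2025-04-20T18:49:25Z)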
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models [35.24458725308099]
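We propose Emotion Interpretation (EI), which focuses on the causal factors that drive emotional responses. Unlike traditional emotion recognition, the EI task requires reasoning about triggers rather than mere labeling. EIBench is a large-scale benchmark containing 1,615 basic EI samples and 50 complex EI samples.
Paper reference translation (metadata) (2025-04-10T07:33:49Z)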
EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models [27.195518991292488]
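EmoBench-M is a new benchmark designed to evaluate the emotional intelligence (EI) capabilities of multimodal large language models (MLLMs). Evaluations of both open-source and closed-source MLLMs on EmoBench-M show a substantial performance gap between them and humans.
Paper reference translation (metadata) (2025-02-06T18:13:35Z)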
MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis [53.012111671763776]
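We introduce MEMO-Bench, a comprehensive benchmark consisting of 7,145 portraits. The results suggest that existing T2I models are more effective at generating positive emotions than negative ones. MLLMs show some effectiveness in distinguishing and recognizing human emotions, but fall short of human-level accuracy.
Paper reference translation (metadata) (2024-11-18T02:09:48Z)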
Expansion Quantization Network: An Efficient Micro-emotion Annotation and Detection Framework [2.0209172586699173]
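This paper proposes an all-labels and training-set label regression method that maps label values to energy intensity levels. This establishes the Emotion Quantization Network (EQN) framework for micro-emotion detection and annotation. The EQN framework is the first to achieve automatic micro-emotion annotation with energy-level scores.
Paper reference translation (metadata) (2024-11-09T12:09:26Z)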
EmoLLM: Multimodal Emotional Understanding Meets Large Language Models [61.179731667080326]
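Multimodal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks. However, their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored. EmoLLM is a novel model for multimodal emotional understanding that incorporates two core techniques.
Paper reference translation (metadata) (2024-06-24T08:33:02Z)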
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
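Large language models (LLMs) have shown growing capability in problem solving and decision making. This paper proposes MR-Ben, a process-based benchmark that requires meta-reasoning skills. The meta-reasoning paradigm is particularly suited to System-2 slow thinking.
Paper reference translation (metadata) (2024-06-20T03:50:23Z)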
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence [41.711534277034374]
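Emotional intelligence (EI) plays a key role in improving the user interaction experience of current conversational general-purpose AI assistants based on large language models (LLMs). Previous work has mainly focused on improving emotion perception by fine-tuning on EI-related classification or regression tasks. We introduce EiBench, a large-scale collection of EI-related tasks in a task-instructed text-to-text format. Modular Emotional
Paper reference translation (metadata) (2024-02-15T16:36:04Z)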
Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
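Large language models (LLMs) have shown remarkable performance on a variety of emotion recognition tasks. This study proposes the Emotional Chain-of-Thought (ECoT) to improve LLM performance on emotional generation tasks.
Paper reference translation (metadata) (2024-01-12T16:42:10Z)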
Emotional Intelligence of Large Language Models [9.834823298632374]
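Large language models (LLMs) have demonstrated remarkable abilities across many fields. However, their alignment with human emotions and values, which is critical for real-world applications, has not been systematically evaluated. We therefore assessed the emotional intelligence (EI) of LLMs, covering emotion recognition, interpretation, and understanding.
Paper reference translation (metadata) (2023-07-18T07:49:38Z)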
Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
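This paper describes four characteristics of intelligent agents. We argue that active engagement with real-world objects provides more robust signals for forming conceptual representations. We conclude by outlining promising research directions in the field of artificial intelligence.
Paper reference translation (metadata) (2023-07-07T13:58:16Z)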

The related papers list is generated automatically from the titles and abstracts of papers on this site.
