EmoBench: Evaluating the Emotional Intelligence of Large Language Models
- URL: http://arxiv.org/abs/2402.12071v3
- Date: Wed, 17 Jul 2024 05:30:58 GMT
- Title: EmoBench: Evaluating the Emotional Intelligence of Large Language Models
- Authors: Sahand Sabour, Siyang Liu, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna S. Sunaryo, Juanzi Li, Tatia M. C. Lee, Rada Mihalcea, Minlie Huang
- Abstract summary: EmoBench is a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine Emotional Intelligence (EI).
EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding.
Our findings reveal a considerable gap between the EI of existing Large Language Models and the average human, highlighting a promising direction for future research.
- Score: 73.60839120040887
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in Large Language Models (LLMs) have highlighted the need for robust, comprehensive, and challenging benchmarks. Yet, research on evaluating their Emotional Intelligence (EI) is considerably limited. Existing benchmarks have two major shortcomings: first, they mainly focus on emotion recognition, neglecting essential EI capabilities such as emotion regulation and thought facilitation through emotion understanding; second, they are primarily constructed from existing datasets, which include frequent patterns, explicit information, and annotation errors, leading to unreliable evaluation. We propose EmoBench, a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine EI, including Emotional Understanding and Emotional Application. EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding. Our findings reveal a considerable gap between the EI of existing LLMs and the average human, highlighting a promising direction for future research. Our code and data are publicly available at https://github.com/Sahandfer/EmoBench.
Related papers
- EmoLLM: Multimodal Emotional Understanding Meets Large Language Models [61.179731667080326]
Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks.
But their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored.
EmoLLM is a novel model for multimodal emotional understanding that incorporates two core techniques.
arXiv Detail & Related papers (2024-06-24T08:33:02Z)
- Think out Loud: Emotion Deducing Explanation in Dialogues [57.90554323226896]
We propose a new task, "Emotion Deducing Explanation in Dialogues" (EDEN).
EDEN recognizes emotions and their causes through explicit reasoning.
It can help Large Language Models (LLMs) achieve better recognition of emotions and causes.
arXiv Detail & Related papers (2024-06-07T08:58:29Z)
- Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence [41.711534277034374]
Emotional Intelligence (EI) plays a critical role in improving the user interaction experience of current large language model (LLM)-based conversational general AI assistants.
Previous work mainly focuses on improving their emotion perception ability via naive fine-tuning on EI-related classification or regression tasks.
We introduce EiBench, a large-scale collection of EI-related tasks in text-to-text format with task instructions.
A novel Modular Emotional ...
arXiv Detail & Related papers (2024-02-15T16:36:04Z)
- Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [53.1230874584344]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks.
We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- Emotional Intelligence of Large Language Models [9.834823298632374]
Large Language Models (LLMs) have demonstrated remarkable abilities across numerous disciplines.
However, their alignment with human emotions and values, which is critical for real-world applications, has not been systematically evaluated.
Here, we assessed LLMs' Emotional Intelligence (EI), encompassing emotion recognition, interpretation, and understanding.
arXiv Detail & Related papers (2023-07-18T07:49:38Z)
- Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
We propose four characteristics of generally intelligent agents.
We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations.
We conclude by outlining promising future research directions in the field of artificial general intelligence.
arXiv Detail & Related papers (2023-07-07T13:58:16Z)
- HICEM: A High-Coverage Emotion Model for Artificial Emotional Intelligence [9.153146173929935]
Next-generation artificial emotional intelligence (AEI) is taking center stage to address users' desire for deeper, more meaningful human-machine interaction.
Unlike theories of emotion, which have been the historical focus in psychology, emotion models are descriptive tools.
This work has broad implications in social robotics, human-machine interaction, mental healthcare, and computational psychology.
arXiv Detail & Related papers (2022-06-15T15:21:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.