Related papers: EmoBench: Evaluating the Emotional Intelligence of Large Language Models

URL: http://arxiv.org/abs/2402.12071v3
Date: Wed, 17 Jul 2024 05:30:58 GMT
Title: EmoBench: Evaluating the Emotional Intelligence of Large Language Models
Authors: Sahand Sabour, Siyang Liu, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna S. Sunaryo, Juanzi Li, Tatia M. C. Lee, Rada Mihalcea, Minlie Huang
Abstract summary: EmoBench is a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine Emotional Intelligence (EI). EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding. Our findings reveal a considerable gap between the EI of existing Large Language Models and the average human, highlighting a promising direction for future research.
Score: 73.60839120040887
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in Large Language Models (LLMs) have highlighted the need for robust, comprehensive, and challenging benchmarks. Yet, research on evaluating their Emotional Intelligence (EI) is considerably limited. Existing benchmarks have two major shortcomings: first, they mainly focus on emotion recognition, neglecting essential EI capabilities such as emotion regulation and thought facilitation through emotion understanding; second, they are primarily constructed from existing datasets, which include frequent patterns, explicit information, and annotation errors, leading to unreliable evaluation. We propose EmoBench, a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine EI, including Emotional Understanding and Emotional Application. EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding. Our findings reveal a considerable gap between the EI of existing LLMs and the average human, highlighting a promising direction for future research. Our code and data are publicly available at https://github.com/Sahandfer/EmoBench.

Related papers

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition [18.8101367995391]
EmoNet-Face is a comprehensive benchmark suite for developing and evaluating AI systems. A novel 40-category emotion taxonomy captures finer details of human emotional experiences. Three large-scale, AI-generated datasets with explicit, full-face expressions. EmpathicInsight-Face is a model achieving human-expert-level performance on our benchmark.
arXiv Detail & Related papers (2025-05-26T14:19:58Z)
AI with Emotions: Exploring Emotional Expressions in Large Language Models [0.0]
Large Language Models (LLMs) role-play as agents answering questions with specified emotional states. Russell's Circumplex model characterizes emotions along the sleepy-activated (arousal) and pleasure-displeasure (valence) axes. Evaluation showed that the emotional states of the generated answers were consistent with the specifications.
arXiv Detail & Related papers (2025-04-20T18:49:25Z)
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models [35.24458725308099]
We propose Emotion Interpretation (EI), focusing on causal factors that drive emotional responses. Unlike traditional emotion recognition, EI tasks require reasoning about triggers instead of mere labeling. We present EIBench, a large-scale benchmark encompassing 1,615 basic EI samples and 50 complex EI samples.
arXiv Detail & Related papers (2025-04-10T07:33:49Z)
EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models [27.195518991292488]
EmoBench-M is a novel benchmark designed to evaluate the emotional intelligence (EI) capability of multimodal large language models (MLLMs). Evaluations of both open-source and closed-source MLLMs on EmoBench-M reveal a significant performance gap between them and humans.
arXiv Detail & Related papers (2025-02-06T18:13:35Z)
MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis [53.012111671763776]
This study introduces MEMO-Bench, a comprehensive benchmark consisting of 7,145 portraits, each depicting one of six different emotions. Results demonstrate that existing T2I models are more effective at generating positive emotions than negative ones. Although MLLMs show a certain degree of effectiveness in distinguishing and recognizing human emotions, they fall short of human-level accuracy.
arXiv Detail & Related papers (2024-11-18T02:09:48Z)
Expansion Quantization Network: An Efficient Micro-emotion Annotation and Detection Framework [2.0209172586699173]
We propose an all-labels and training-set label regression method to map label values to energy intensity levels. This led to the establishment of the Emotion Quantization Network (EQN) framework for micro-emotion detection and annotation. The EQN framework is the first to achieve automatic micro-emotion annotation with energy-level scores.
arXiv Detail & Related papers (2024-11-09T12:09:26Z)
EmoLLM: Multimodal Emotional Understanding Meets Large Language Models [61.179731667080326]
Multi-modal large language models (MLLMs) have achieved remarkable performance on objective multimodal perception tasks. But their ability to interpret subjective, emotionally nuanced multimodal content remains largely unexplored. EmoLLM is a novel model for multimodal emotional understanding, incorporating two core techniques.
arXiv Detail & Related papers (2024-06-24T08:33:02Z)
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making. We present a process-based benchmark MR-Ben that demands a meta-reasoning skill. Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence [41.711534277034374]
Emotional Intelligence (EI) plays critical roles in improving user interaction experience for the current large language model (LLM) based conversational general AI assistants. Previous works mainly focus on raising their emotion perception ability via naive fine-tuning on EI-related classification or regression tasks. We introduce EiBench, a large-scale collection of EI-related tasks in the text-to-text formation with task instructions. A novel Modular Emotional ...
arXiv Detail & Related papers (2024-02-15T16:36:04Z)
Enhancing Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought [50.13429055093534]
Large Language Models (LLMs) have shown remarkable performance in various emotion recognition tasks. We propose the Emotional Chain-of-Thought (ECoT) to enhance the performance of LLMs on various emotional generation tasks.
arXiv Detail & Related papers (2024-01-12T16:42:10Z)
Emotional Intelligence of Large Language Models [9.834823298632374]
Large Language Models (LLMs) have demonstrated remarkable abilities across numerous disciplines. However, their alignment with human emotions and values, which is critical for real-world applications, has not been systematically evaluated. Here, we assessed LLMs' Emotional Intelligence (EI), encompassing emotion recognition, interpretation, and understanding.
arXiv Detail & Related papers (2023-07-18T07:49:38Z)
Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
We propose four characteristics of generally intelligent agents. We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations. We conclude by outlining promising future research directions in the field of artificial general intelligence.
arXiv Detail & Related papers (2023-07-07T13:58:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.