InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context
- URL: http://arxiv.org/abs/2406.12203v3
- Date: Sun, 03 Nov 2024 16:15:22 GMT
- Title: InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context
- Authors: Ziyi Liu, Abhishek Anand, Pei Zhou, Jen-tse Huang, Jieyu Zhao
- Abstract summary: Large language models (LLMs) have demonstrated the potential to mimic human social intelligence.
We develop a novel framework, InterIntent, to assess LLMs' social intelligence by mapping their ability to understand and manage intentions in a game setting.
- Score: 27.740204336800687
- Abstract: Large language models (LLMs) have demonstrated the potential to mimic human social intelligence. However, most studies focus on simplistic and static self-report or performance-based tests, which limits the depth and validity of the analysis. In this paper, we developed a novel framework, InterIntent, to assess LLMs' social intelligence by mapping their ability to understand and manage intentions in a game setting. We focus on four dimensions of social intelligence: situational awareness, self-regulation, self-awareness, and theory of mind. Each dimension is linked to a specific game task: intention selection, intention following, intention summarization, and intention guessing. Our findings indicate that while LLMs exhibit high proficiency in selecting intentions, achieving an accuracy of 88%, their ability to infer the intentions of others is significantly weaker, trailing human performance by 20%. Additionally, game performance correlates with intention understanding, highlighting the importance of the four components towards success in this game. These findings underline the crucial role of intention understanding in evaluating LLMs' social intelligence and highlight the potential of using social deduction games as a complex testbed to enhance LLM evaluation. InterIntent contributes a structured approach to bridging the evaluation gap in social intelligence within multiplayer games.
Related papers
- Entering Real Social World! Benchmarking the Theory of Mind and Socialization Capabilities of LLMs from a First-person Perspective [22.30892836263764]
In the era of artificial intelligence (AI), especially with the development of large language models (LLMs), we raise an intriguing question: how do LLMs perform in terms of theory of mind (ToM) and socialization capabilities?
We introduce EgoSocialArena, a novel framework designed to evaluate and investigate the ToM and socialization capabilities of LLMs from a first-person perspective.
arXiv Detail & Related papers (2024-10-08T16:55:51Z)
- Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models [57.518784855080334]
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants.
This paper presents a framework for investigating psychological dimensions in LLMs, including psychological identification, assessment dataset curation, and assessment with results validation.
We introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
- LLM Theory of Mind and Alignment: Opportunities and Risks [0.0]
There is growing interest in whether large language models (LLMs) have theory of mind (ToM).
This paper identifies key areas in which LLM ToM will show up in human:LLM interactions at individual and group levels.
It lays out a broad spectrum of potential implications and suggests the most pressing areas for future research.
arXiv Detail & Related papers (2024-05-13T19:52:16Z)
- SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents [73.35393511272791]
We propose an interactive learning method, SOTOPIA-$π$, improving the social intelligence of language agents.
This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings.
arXiv Detail & Related papers (2024-03-13T17:17:48Z)
- Academically intelligent LLMs are not necessarily socially intelligent [56.452845189961444]
The academic intelligence of large language models (LLMs) has made remarkable progress in recent times, but their social intelligence performance remains unclear.
Inspired by established human social intelligence frameworks, we have developed a standardized social intelligence test based on real-world social scenarios.
arXiv Detail & Related papers (2024-03-11T10:35:53Z)
- PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents [68.50571379012621]
Psychological measurement is essential for mental health, self-understanding, and personal development.
PsychoGAT (Psychological Game AgenTs) achieves statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity.
arXiv Detail & Related papers (2024-02-19T18:00:30Z)
- I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench [20.909504977779978]
We introduce AwareBench, a benchmark designed to evaluate awareness in large language models (LLMs).
We categorize awareness in LLMs into five dimensions, including capability, mission, emotion, culture, and perspective.
Our experiments, conducted on 13 LLMs, reveal that the majority of them struggle to fully recognize their capabilities and missions while demonstrating decent social intelligence.
arXiv Detail & Related papers (2024-01-31T14:41:23Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- Emotional Intelligence of Large Language Models [9.834823298632374]
Large Language Models (LLMs) have demonstrated remarkable abilities across numerous disciplines.
However, their alignment with human emotions and values, which is critical for real-world applications, has not been systematically evaluated.
Here, we assessed LLMs' Emotional Intelligence (EI), encompassing emotion recognition, interpretation, and understanding.
arXiv Detail & Related papers (2023-07-18T07:49:38Z)
- Large Language Models Understand and Can be Enhanced by Emotional Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.