Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
- URL: http://arxiv.org/abs/2505.00049v1
- Date: Wed, 30 Apr 2025 06:09:40 GMT
- Title: Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
- Authors: Wenhan Dong, Yuemeng Zhao, Zhen Sun, Yule Liu, Zifan Peng, Jingyi Zheng, Zongmin Zhang, Ziyi Zhang, Jun Wu, Ruiming Wang, Shengmin Xu, Xinyi Huang, Xinlei He
- Abstract summary: Large language models (LLMs) are increasingly used in human-centered tasks. Assessing their psychological traits is crucial for understanding their social impact and ensuring trustworthy AI alignment. This study aims to propose future directions for developing more interpretable, robust, and generalizable psychological assessment frameworks for LLMs.
- Score: 25.38031971196831
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) are increasingly used in human-centered tasks, assessing their psychological traits is crucial for understanding their social impact and ensuring trustworthy AI alignment. While existing reviews have covered some aspects of related research, several important areas have not been systematically discussed, including detailed discussions of diverse psychological tests, LLM-specific psychological datasets, and the applications of LLMs with psychological traits. To address this gap, we systematically review six key dimensions of applying psychological theories to LLMs: (1) assessment tools; (2) LLM-specific datasets; (3) evaluation metrics (consistency and stability); (4) empirical findings; (5) personality simulation methods; and (6) LLM-based behavior simulation. Our analysis highlights both the strengths and limitations of current methods. While some LLMs exhibit reproducible personality patterns under specific prompting schemes, significant variability remains across tasks and settings. Recognizing methodological challenges such as mismatches between psychological tools and LLMs' capabilities, as well as inconsistencies in evaluation practices, this study aims to propose future directions for developing more interpretable, robust, and generalizable psychological assessment frameworks for LLMs.
Related papers
- Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A Survey [2.572335031488049]
This survey explores the development of meta-thinking capabilities in Large Language Models (LLMs) from a Multi-Agent Reinforcement Learning (MARL) perspective.
By exploring reward mechanisms, self-play, and continuous learning methods in MARL, this survey gives a comprehensive roadmap to building introspective, adaptive, and trustworthy LLMs.
arXiv Detail & Related papers (2025-04-20T07:34:26Z)
- Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs [8.920202114368843]
We present an investigative study on how mental sets influence the reasoning capabilities of LLMs. A mental set refers to the tendency to persist with previously successful strategies, even when they become inefficient. We compare the performance of LLMs such as Llama-3.1-8B-Instruct, Llama-3.1-70B-Instruct, and GPT-4o in the presence of mental sets.
arXiv Detail & Related papers (2025-01-21T02:29:15Z)
- MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs [97.94579295913606]
Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.
In the development process, evaluation is critical since it provides intuitive feedback and guidance on improving models.
This work aims to offer researchers an easy grasp of how to effectively evaluate MLLMs according to different needs and to inspire better evaluation methods.
arXiv Detail & Related papers (2024-11-22T18:59:54Z)
- A Survey on Human-Centric LLMs [11.49752599240738]
Large language models (LLMs) can simulate human cognition and behavior.
This survey focuses on their performance in both individual tasks and collective tasks.
arXiv Detail & Related papers (2024-11-20T12:34:44Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models [57.518784855080334]
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants.
This paper presents a framework for investigating psychological dimensions in LLMs, including psychological identification, assessment dataset curation, and assessment with results validation.
We introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence.
arXiv Detail & Related papers (2024-06-25T16:09:08Z)
- Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z)
- Is Self-knowledge and Action Consistent or Not: Investigating Large Language Model's Personality [11.660251022962141]
We investigate the validity of conventional personality questionnaires in capturing the human-like personality traits of Large Language Models (LLMs). Our objective is to assess the congruence between the personality traits LLMs claim to possess and their demonstrated tendencies in real-world scenarios.
arXiv Detail & Related papers (2024-02-22T16:32:08Z)
- Personality Traits in Large Language Models [42.31355340867784]
Personality is a key factor determining the effectiveness of communication. We present a novel, comprehensive, and psychometrically valid and reliable methodology for administering and validating personality tests on widely-used large language models. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.
arXiv Detail & Related papers (2023-07-01T00:58:51Z)
- Revisiting the Reliability of Psychological Scales on Large Language Models [62.57981196992073]
This study aims to determine the reliability of applying personality assessments to Large Language Models.
Analysis of 2,500 settings per model, including GPT-3.5, GPT-4, Gemini-Pro, and LLaMA-3.1, reveals that various LLMs show consistency in responses to the Big Five Inventory.
arXiv Detail & Related papers (2023-05-31T15:03:28Z)
- Evaluating and Inducing Personality in Pre-trained Language Models [78.19379997967191]
We draw inspiration from psychometric studies by leveraging human personality theory as a tool for studying machine behaviors.
To this end, we introduce the Machine Personality Inventory (MPI) tool for studying machine behaviors.
MPI follows standardized personality tests, built upon the Big Five Personality Factors (Big Five) theory and personality assessment inventories.
We devise a Personality Prompting (P2) method to induce specific personalities in LLMs in a controllable way.
arXiv Detail & Related papers (2022-05-20T07:32:57Z)