Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits
- URL: http://arxiv.org/abs/2412.12510v1
- Date: Tue, 17 Dec 2024 03:46:51 GMT
- Title: Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits
- Authors: Bohan Li, Jiannan Guan, Longxu Dou, Yunlong Feng, Dingzirui Wang, Yang Xu, Enbo Wang, Qiguang Chen, Bichen Wang, Xiao Xu, Yimeng Zhang, Libo Qin, Yanyan Zhao, Qingfu Zhu, Wanxiang Che,
- Abstract summary: We optimize the task by constructing MBTIBench, the first manually annotated high-quality MBTI personality detection dataset with soft labels.
As for the first challenge, MBTIBench effectively solves the incorrect labeling issues, which account for 29.58% of the data.
The obtained soft labels confirm that there are more people with non-extreme personality traits.
- Score: 57.183688699002836
- License:
- Abstract: The Myers-Briggs Type Indicator (MBTI) is one of the most influential personality theories reflecting individual differences in thinking, feeling, and behaving. MBTI personality detection has garnered considerable research interest and has evolved significantly over the years. However, this task tends to be overly optimistic, as it currently does not align well with the natural distribution of population personality traits. Specifically, (1) the self-reported labels in existing datasets result in incorrect labeling issues, and (2) the hard labels fail to capture the full range of population personality distributions. In this paper, we optimize the task by constructing MBTIBench, the first manually annotated high-quality MBTI personality detection dataset with soft labels, under the guidance of psychologists. As for the first challenge, MBTIBench effectively solves the incorrect labeling issues, which account for 29.58% of the data. As for the second challenge, we estimate soft labels by deriving the polarity tendency of samples. The obtained soft labels confirm that there are more people with non-extreme personality traits. Experimental results not only highlight the polarized predictions and biases in LLMs as key directions for future research, but also confirm that soft labels can provide more benefits to other psychological tasks than hard labels. The code and data are available at https://github.com/Personality-NLP/MbtiBench.
Related papers
- A Chinese Multi-label Affective Computing Dataset Based on Social Media Network Users [2.0209172586699173]
This study collected data from the major social media platform Weibo, screening 11,338 valid users from over 50,000 individuals with diverse MBTI personality labels.
We compiled a multi-label Chinese affective computing dataset that integrates the same user's personality traits with six emotions and micro-emotions, each annotated with intensity levels.
This dataset is designed to advance machine recognition of complex human emotions and provide data support for research in psychology, education, marketing, finance, and politics.
arXiv Detail & Related papers (2024-11-13T05:38:55Z) - Neuron-based Personality Trait Induction in Large Language Models [115.08894603023712]
Large language models (LLMs) have become increasingly proficient at simulating various personality traits.
We present a neuron-based approach for personality trait induction in LLMs.
arXiv Detail & Related papers (2024-10-16T07:47:45Z) - Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues [63.936654900356004]
Personality recognition aims to identify the personality traits implied in user data such as dialogues and social media posts.
We propose a novel task named Explainable Personality Recognition, aiming to reveal the reasoning process as supporting evidence of the personality trait.
arXiv Detail & Related papers (2024-09-29T14:41:43Z) - EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection [19.98674724777821]
We propose a new personality detection method called EERPD.
This method introduces the use of emotion regulation, a psychological concept highly correlated with personality, for personality prediction.
Experimental results demonstrate that EERPD significantly enhances the accuracy and robustness of personality detection.
arXiv Detail & Related papers (2024-06-23T11:18:55Z) - LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced
Personality Detection Model [58.887561071010985]
Personality detection aims to detect one's personality traits underlying in social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
arXiv Detail & Related papers (2024-03-12T12:10:18Z) - PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for
Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z) - Do LLMs Possess a Personality? Making the MBTI Test an Amazing
Evaluation for Large Language Models [2.918940961856197]
We aim to investigate the feasibility of using the Myers-Briggs Type Indicator (MBTI), a widespread human personality assessment tool, as an evaluation metric for large language models (LLMs)
Specifically, experiments will be conducted to explore: 1) the personality types of different LLMs, 2) the possibility of changing the personality types by prompt engineering, and 3) How does the training dataset affect the model's personality.
arXiv Detail & Related papers (2023-07-30T09:34:35Z) - Exploring Personality and Online Social Engagement: An Investigation of
MBTI Users on Twitter [0.0]
We investigate 3848 profiles from Twitter with self-labeled Myers-Briggs personality traits (MBTI)
We leverage BERT, a state-of-the-art NLP architecture based on deep learning, to analyze various sources of text that hold most predictive power for our task.
We find that biographies, statuses, and liked tweets contain significant predictive power for all dimensions of the MBTI system.
arXiv Detail & Related papers (2021-09-14T02:26:30Z) - Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia
for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework, called "PERS"
Our experimental results demonstrate the PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging on the significantly different data arriving from diverse social multimedia sources.
arXiv Detail & Related papers (2021-06-20T10:48:49Z) - Matching Theory and Data with Personal-ITY: What a Corpus of Italian
YouTube Comments Reveals About Personality [11.38723572165938]
We create a novel corpus of YouTube comments in Italian, where authors are labelled with personality traits.
The traits are derived from one of the mainstream personality theories in psychology research, named MBTI.
We study the task of personality prediction in itself on our corpus as well as on TwiSty.
arXiv Detail & Related papers (2020-11-11T12:45:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.