Is Personality Prediction Possible Based on Reddit Comments?
- URL: http://arxiv.org/abs/2408.16089v1
- Date: Wed, 28 Aug 2024 18:43:07 GMT
- Title: Is Personality Prediction Possible Based on Reddit Comments?
- Authors: Robert Deimann, Till Preidt, Shaptarshi Roy, Jan Stanicki,
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this assignment, we examine whether there is a correlation between the personality type of a person and the texts they wrote. In order to do this, we aggregated datasets of Reddit comments labeled with the Myers-Briggs Type Indicator (MBTI) of the author and built different supervised classifiers based on BERT to try to predict the personality of an author given a text. Despite experiencing issues with the unfiltered character of the dataset, we can observe potential in the classification.
Related papers
- Personality Style Recognition via Machine Learning: Identifying
Anaclitic and Introjective Personality Styles from Patients' Speech [6.3042597209752715]
We use natural language processing (NLP) and machine learning tools for classification.
We test this on a dataset of recorded clinical diagnostic interviews (CDI) on a sample of 79 patients diagnosed with major depressive disorder (MDD).
We find that automated classification with language-derived features (i.e., based on LIWC) significantly outperforms questionnaire-based classification models.
arXiv Detail & Related papers (2023-11-07T15:56:19Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for
Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Editing Personality for Large Language Models [73.59001811199823]
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs).
We construct PersonalityEdit, a new benchmark dataset to address this task.
arXiv Detail & Related papers (2023-10-03T16:02:36Z)
- Personality Detection and Analysis using Twitter Data [7.584657555037871]
We release the largest automatically curated dataset for the research community.
This dataset has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task.
We show how our intriguing analysis results often follow natural intuition.
arXiv Detail & Related papers (2023-09-11T14:39:04Z)
- Personality Understanding of Fictional Characters during Book Reading [81.68515671674301]
We present the first labeled dataset PersoNet for this problem.
Our novel annotation strategy involves annotating user notes from online reading apps as a proxy for the original books.
Experiments and human studies indicate that our dataset construction is both efficient and accurate.
arXiv Detail & Related papers (2023-05-17T12:19:11Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, registry, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- Exploring Personality and Online Social Engagement: An Investigation of
MBTI Users on Twitter [0.0]
We investigate 3848 profiles from Twitter with self-labeled Myers-Briggs personality traits (MBTI).
We leverage BERT, a state-of-the-art NLP architecture based on deep learning, to analyze various sources of text that hold most predictive power for our task.
We find that biographies, statuses, and liked tweets contain significant predictive power for all dimensions of the MBTI system.
arXiv Detail & Related papers (2021-09-14T02:26:30Z)
- Matching Theory and Data with Personal-ITY: What a Corpus of Italian
YouTube Comments Reveals About Personality [11.38723572165938]
We create a novel corpus of YouTube comments in Italian, where authors are labelled with personality traits.
The traits are derived from one of the mainstream personality theories in psychology research, named MBTI.
We study the task of personality prediction in itself on our corpus as well as on TwiSty.
arXiv Detail & Related papers (2020-11-11T12:45:33Z)
- FIND: Human-in-the-Loop Debugging Deep Text Classifiers [55.135620983922564]
We propose FIND -- a framework which enables humans to debug deep learning text classifiers by disabling irrelevant hidden features.
Experiments show that by using FIND, humans can improve CNN text classifiers which were trained under different types of imperfect datasets.
arXiv Detail & Related papers (2020-10-10T12:52:53Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset
for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, like income, cultural orientation, amongst several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.