Personality Detection and Analysis using Twitter Data
- URL: http://arxiv.org/abs/2309.05497v1
- Date: Mon, 11 Sep 2023 14:39:04 GMT
- Title: Personality Detection and Analysis using Twitter Data
- Authors: Abhilash Datta, Souvic Chakraborty, Animesh Mukherjee
- Abstract summary: We release the largest automatically curated dataset for the research community.
This dataset has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task.
We show how our intriguing analysis results often follow natural intuition.
- Score: 7.584657555037871
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Personality types are important in various fields as they hold relevant
information about the characteristics of a human being in an explainable
format. They are often good predictors of a person's behaviors in a particular
environment and have applications ranging from candidate selection to marketing
and mental health. Recently, automatic detection of personality traits from
texts has gained significant attention in computational linguistics. Most
personality detection and analysis methods have focused on small datasets,
making their experimental observations often limited. To bridge this gap, we
focus on collecting and releasing the largest automatically curated dataset for
the research community which has 152 million tweets and 56 thousand data points
for the Myers-Briggs personality type (MBTI) prediction task. We perform a
series of extensive qualitative and quantitative studies on our dataset to
analyze the data patterns in a better way and infer conclusions. We show how
our intriguing analysis results often follow natural intuition. We also perform
a series of ablation studies to show how the baselines perform for our dataset.
Related papers
- LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model [58.887561071010985]
Personality detection aims to detect the personality traits underlying one's social media posts.
Most existing methods learn post features directly by fine-tuning the pre-trained language models.
We propose a large language model (LLM) based text augmentation enhanced personality detection model.
(2024-03-12T12:10:18Z)
- Personality Trait Inference Via Mobile Phone Sensors: A Machine Learning Approach [0.0]
This study provides evidence that personality can be reliably predicted from activity data collected through mobile phone sensors.
We were able to predict users' personality with an F1 score of up to 0.78 on a two-class problem.
We show how a combination of rich behavioral data obtained with smartphone sensing and the use of machine learning techniques can help to advance personality research.
(2024-01-18T13:18:51Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
(2023-10-31T08:23:33Z)
- Dataset Bias in Human Activity Recognition [57.91018542715725]
This contribution statistically curates the training data to assess to what degree the physical characteristics of humans influence HAR performance.
We evaluate the performance of a state-of-the-art convolutional neural network on two HAR datasets that vary in the sensors, activities, and recording for time-series HAR.
(2023-01-19T12:33:50Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
(2022-07-20T13:37:57Z)
- Exploring Personality and Online Social Engagement: An Investigation of MBTI Users on Twitter [0.0]
We investigate 3848 Twitter profiles with self-labeled Myers-Briggs personality traits (MBTI).
We leverage BERT, a state-of-the-art NLP architecture based on deep learning, to analyze various sources of text that hold most predictive power for our task.
We find that biographies, statuses, and liked tweets contain significant predictive power for all dimensions of the MBTI system.
(2021-09-14T02:26:30Z)
- Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework called "PERS".
Our experimental results demonstrate PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging the significantly different data arriving from diverse social multimedia sources.
(2021-06-20T10:48:49Z)
- Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles [10.425280599592865]
We present a novel deep learning-based approach for automated personality detection from text.
We leverage state-of-the-art advances in natural language understanding, namely the BERT language model, to extract contextualized word embeddings.
Our model outperforms the previous state of the art by 1.04% and, at the same time, is significantly more computationally efficient to train.
(2020-10-03T09:25:51Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
(2020-08-31T17:44:28Z)
- Jointly Predicting Job Performance, Personality, Cognitive Ability, Affect, and Well-Being [42.67003631848889]
We create a benchmark for predictive analysis of individuals from a perspective that integrates physical and physiological behavior, psychological states and traits, and job performance.
We design data mining techniques as a benchmark and use real, noisy, and incomplete data derived from wearable sensors to predict 19 constructs based on 12 standardized well-validated tests.
(2020-06-10T14:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.