Personality Profiling: How informative are social media profiles in
predicting personal information?
- URL: http://arxiv.org/abs/2309.13065v1
- Date: Fri, 15 Sep 2023 03:09:43 GMT
- Title: Personality Profiling: How informative are social media profiles in
predicting personal information?
- Authors: Joshua Watt, Jonathan Tuke and Lewis Mitchell
- Abstract summary: Personality profiling has been utilised by companies for targeted advertising, political campaigns and vaccine campaigns.
We aim to explore the extent to which people's online digital footprints can be used to profile their Myers-Briggs personality type.
- Score: 0.046040036610482664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personality profiling has been utilised by companies for targeted
advertising, political campaigns and vaccine campaigns. However, the accuracy
and versatility of such models remain relatively unknown. Consequently,
we aim to explore the extent to which people's online digital footprints can be
used to profile their Myers-Briggs personality type. We analyse and compare the
results of four models: logistic regression, naive Bayes, support vector
machines (SVMs) and random forests. We discover that an SVM model achieves the
best accuracy of 20.95% for predicting someone's complete personality type.
However, logistic regression models perform only marginally worse and are
significantly faster to train and to make predictions. We discover that many
labelled datasets present substantial class imbalances of personal
characteristics on social media, including our own. As a result, we highlight
the need for attentive consideration when reporting model performance on these
datasets and compare a number of methods for fixing the class-imbalance
problems. Moreover, we develop a statistical framework for assessing the
importance of different sets of features in our models. We discover some
features to be more informative than others in the Intuitive/Sensory (p =
0.032) and Thinking/Feeling (p = 0.019) models. While we apply these methods to
Myers-Briggs personality profiling, they could be more generally used for any
labelling of individuals on social media.
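The abstract's caution about reporting performance on class-imbalanced datasets can be illustrated with a minimal sketch. The label counts below are hypothetical and are not the paper's data. The random oversampling helper is one commonly used rebalancing technique and is an assumption here, not necessarily one of the methods the authors compare. The sketch shows that a classifier which always predicts the majority class can score a high plain accuracy while its balanced accuracy (mean per-class recall) stays at chance level.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = Counter(y_true)
    correct = Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def random_oversample(labels, rng):
    """Return indices that resample every class up to the majority-class size."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    target = max(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        out.extend(idx)  # keep every original example
        out.extend(rng.choices(idx, k=target - len(idx)))  # duplicate at random
    return out


# Hypothetical imbalanced labels for one binary dimension (not the paper's data).
y_true = ["N"] * 85 + ["S"] * 15

# Baseline that always predicts the most frequent class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

print(accuracy(y_true, y_pred))           # 0.85
print(balanced_accuracy(y_true, y_pred))  # 0.5

# After oversampling, both classes have the same number of examples.
resampled = random_oversample(y_true, random.Random(0))
print(Counter(y_true[i] for i in resampled))
```

On data like this, a reported accuracy is easier to interpret when it is shown next to the majority-class baseline or replaced by a metric that weights each class equally.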
Related papers
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Large Language Models Can Infer Psychological Dispositions of Social Media Users [1.0923877073891446]
We test whether GPT-3.5 and GPT-4 can derive the Big Five personality traits from users' Facebook status updates in a zero-shot learning scenario.
Our results show an average correlation of r = .29 (range = [.22, .33]) between LLM-inferred and self-reported trait scores.
Predictions were found to be more accurate for women and younger individuals on several traits, suggesting a potential bias stemming from the underlying training data or differences in online self-expression.
arXiv Detail & Related papers (2023-09-13T01:27:48Z)
- Personality Detection and Analysis using Twitter Data [7.584657555037871]
We release the largest automatically curated dataset for the research community.
This dataset has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task.
We show how our intriguing analysis results often follow natural intuition.
arXiv Detail & Related papers (2023-09-11T14:39:04Z)
- On the Connection between Pre-training Data Diversity and Fine-tuning Robustness [66.30369048726145]
We find that the primary factor influencing downstream effective robustness is data quantity.
We demonstrate our findings on pre-training distributions drawn from various natural and synthetic data sources.
arXiv Detail & Related papers (2023-07-24T05:36:19Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Metrics for Dataset Demographic Bias: A Case Study on Facial Expression Recognition [4.336779198334903]
One of the most prominent types of demographic bias is statistical imbalance in the representation of demographic groups in the datasets.
We develop a taxonomy for the classification of these metrics, providing a practical guide for the selection of appropriate metrics.
The paper provides valuable insights for researchers in AI and related fields to mitigate dataset bias and improve the fairness and accuracy of AI models.
arXiv Detail & Related papers (2023-03-28T11:04:18Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
We are given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework, called "PERS".
Our experimental results demonstrate PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging the significantly different data arriving from diverse social multimedia sources.
arXiv Detail & Related papers (2021-06-20T10:48:49Z)
- My tweets bring all the traits to the yard: Predicting personality and relational traits in Online Social Networks [4.095574580512599]
This study aims to provide a prediction model for holistic personality profiling in Online Social Networks (OSNs).
We first designed a feature engineering methodology that extracts a wide range of features from OSN accounts of users.
Then, we designed a machine learning model that predicts scores for the psychological traits of the users based on the extracted features.
arXiv Detail & Related papers (2020-09-22T20:30:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.