Personality Profiling: How informative are social media profiles in predicting personal information?
- URL: http://arxiv.org/abs/2309.13065v2
- Date: Sun, 24 Nov 2024 06:39:21 GMT
- Title: Personality Profiling: How informative are social media profiles in predicting personal information?
- Authors: Joshua Watt, Lewis Mitchell, Jonathan Tuke
- Abstract summary: We explore the extent to which people's online digital footprints can be used to profile their Myers-Briggs personality type.
We compare four models: logistic regression, naive Bayes, support vector machines (SVMs) and random forests.
An SVM model achieves the best accuracy of 20.95% for predicting a complete personality type.
- Score: 0.04096453902709291
- Abstract: Personality profiling has been utilised by companies for targeted advertising, political campaigns and public health campaigns. However, the accuracy and versatility of such models remain relatively unknown. Here we explore the extent to which people's online digital footprints can be used to profile their Myers-Briggs personality type. We analyse and compare four models: logistic regression, naive Bayes, support vector machines (SVMs) and random forests. We discover that an SVM model achieves the best accuracy of 20.95% for predicting a complete personality type. However, logistic regression models perform only marginally worse and are significantly faster to train and to make predictions. Moreover, we develop a statistical framework for assessing the importance of different sets of features in our models. We discover some features to be more informative than others in the Intuitive/Sensory (p = 0.032) and Thinking/Feeling (p = 0.019) models. Many labelled datasets present substantial class imbalances of personal characteristics on social media, including our own. We therefore highlight the need for attentive consideration when reporting model performance on such datasets and compare a number of methods to fix class-imbalance problems.
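The abstract's caution about reporting performance on imbalanced datasets can be illustrated with a minimal sketch. The labels below are made-up numbers (an 80/20 split for one binary trait), not the paper's data, and the helper functions `accuracy` and `balanced_accuracy` are hypothetical names written for this illustration. The sketch shows that a model which always predicts the majority class can look strong on plain accuracy while doing no better than chance on balanced accuracy (the mean of per-class recall).

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical labels for one binary trait with an 80/20 class split.
y_true = ["N"] * 80 + ["S"] * 20

# Baseline "model": always predict the most common class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
# prints: accuracy=0.80, balanced accuracy=0.50
```

Reporting a majority-class baseline or a class-balanced metric alongside raw accuracy is one way to make such results easier to interpret; which methods the paper itself compares is not stated in this summary.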
Related papers
- When Machine Learning Gets Personal: Understanding Fairness of Personalized Models [5.002195711989324]
Personalization in machine learning involves tailoring models to individual users by incorporating personal attributes such as demographic or medical data.
While personalization can improve prediction accuracy, it may also amplify biases and reduce explainability.
This work introduces a unified framework to evaluate the impact of personalization on both prediction accuracy and explanation quality.
arXiv Detail & Related papers (2025-02-05T00:17:33Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- On the Connection between Pre-training Data Diversity and Fine-tuning Robustness [66.30369048726145]
We find that the primary factor influencing downstream effective robustness is data quantity.
We demonstrate our findings on pre-training distributions drawn from various natural and synthetic data sources.
arXiv Detail & Related papers (2023-07-24T05:36:19Z)
- Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition [99.7047087527422]
In this work, we demonstrate that competition can fundamentally alter the behavior of machine learning scaling trends.
We find many settings where improving data representation quality decreases the overall predictive accuracy across users.
At a conceptual level, our work suggests that favorable scaling trends for individual model-providers need not translate to downstream improvements in social welfare.
arXiv Detail & Related papers (2023-06-26T13:06:34Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, we are given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- In the Eye of the Beholder: Robust Prediction with Causal User Modeling [27.294341513692164]
We propose a learning framework for relevance prediction that is robust to changes in the data distribution.
Our key observation is that robustness can be obtained by accounting for how users causally perceive the environment.
arXiv Detail & Related papers (2022-06-01T11:33:57Z)
- Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework called "PERS".
Our experimental results demonstrate PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging the significantly different data arriving from diverse social multimedia sources.
arXiv Detail & Related papers (2021-06-20T10:48:49Z)
- My tweets bring all the traits to the yard: Predicting personality and relational traits in Online Social Networks [4.095574580512599]
This study aims to provide a prediction model for holistic personality profiling in Online Social Networks (OSNs).
We first designed a feature engineering methodology that extracts a wide range of features from OSN accounts of users.
Then, we designed a machine learning model that predicts scores for the psychological traits of the users based on the extracted features.
arXiv Detail & Related papers (2020-09-22T20:30:56Z)