Related papers: Personality Profiling: How informative are social media profiles in predicting personal information?

Personality Profiling: How informative are social media profiles in predicting personal information?

URL: http://arxiv.org/abs/2309.13065v2
Date: Sun, 24 Nov 2024 06:39:21 GMT
Title: Personality Profiling: How informative are social media profiles in predicting personal information?
Authors: Joshua Watt, Lewis Mitchell, Jonathan Tuke,
Abstract summary: We explore the extent to which peoples' online digital footprints can be used to profile their Myers-Briggs personality type. We compare four models: logistic regression, naive Bayes, support vector machines (SVMs) and random forests. A SVM model achieves the best accuracy of 20.95% for predicting a complete personality type.
Score: 0.04096453902709291
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Personality profiling has been utilised by companies for targeted advertising, political campaigns and public health campaigns. However, the accuracy and versatility of such models remains relatively unknown. Here we explore the extent to which peoples' online digital footprints can be used to profile their Myers-Briggs personality type. We analyse and compare four models: logistic regression, naive Bayes, support vector machines (SVMs) and random forests. We discover that a SVM model achieves the best accuracy of 20.95% for predicting a complete personality type. However, logistic regression models perform only marginally worse and are significantly faster to train and perform predictions. Moreover, we develop a statistical framework for assessing the importance of different sets of features in our models. We discover some features to be more informative than others in the Intuitive/Sensory (p = 0.032) and Thinking/Feeling (p = 0.019) models. Many labelled datasets present substantial class imbalances of personal characteristics on social media, including our own. We therefore highlight the need for attentive consideration when reporting model performance on such datasets and compare a number of methods to fix class-imbalance problems.

Related papers

When Machine Learning Gets Personal: Understanding Fairness of Personalized Models [5.002195711989324]
Personalization in machine learning involves tailoring models to individual users by incorporating personal attributes such as demographic or medical data. While personalization can improve prediction accuracy, it may also amplify biases and reduce explainability. This work introduces a unified framework to evaluate the impact of personalization on both prediction accuracy and explanation quality.
arXiv Detail & Related papers (2025-02-05T00:17:33Z)
Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data. Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data. We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
On the Connection between Pre-training Data Diversity and Fine-tuning Robustness [66.30369048726145]
We find that the primary factor influencing downstream effective robustness is data quantity. We demonstrate our findings on pre-training distributions drawn from various natural and synthetic data sources.
arXiv Detail & Related papers (2023-07-24T05:36:19Z)
Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition [99.7047087527422]
In this work, we demonstrate that competition can fundamentally alter the behavior of machine learning scaling trends. We find many settings where improving data representation quality decreases the overall predictive accuracy across users. At a conceptual level, our work suggests that favorable scaling trends for individual model-providers need not translate to downstream improvements in social welfare.
arXiv Detail & Related papers (2023-06-26T13:06:34Z)
ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain. Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples. In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data. Give access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
In the Eye of the Beholder: Robust Prediction with Causal User Modeling [27.294341513692164]
We propose a learning framework for relevance prediction that is robust to changes in the data distribution. Our key observation is that robustness can be obtained by accounting for how users causally perceive the environment.
arXiv Detail & Related papers (2022-06-01T11:33:57Z)
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models. We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin. We also found that deterministic models are on par, in fact they consistently (although non-significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z)
Two-Faced Humans on Twitter and Facebook: Harvesting Social Multimedia for Human Personality Profiling [74.83957286553924]
We infer the Myers-Briggs Personality Type indicators by applying a novel multi-view fusion framework, called "PERS" Our experimental results demonstrate the PERS's ability to learn from multi-view data for personality profiling by efficiently leveraging on the significantly different data arriving from diverse social multimedia sources.
arXiv Detail & Related papers (2021-06-20T10:48:49Z)
My tweets bring all the traits to the yard: Predicting personality and relational traits in Online Social Networks [4.095574580512599]
This study aims to provide a prediction model for a holistic personality profiling in Online Social Networks (OSNs) We first designed a feature engineering methodology that extracts a wide range of features from OSN accounts of users. Then, we designed a machine learning model that predicts scores for the psychological traits of the users based on the extracted features.
arXiv Detail & Related papers (2020-09-22T20:30:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.