An Attention-Based Denoising Framework for Personality Detection in
Social Media Texts
- URL: http://arxiv.org/abs/2311.09945v1
- Date: Thu, 16 Nov 2023 14:56:09 GMT
- Title: An Attention-Based Denoising Framework for Personality Detection in
Social Media Texts
- Authors: Qirui Tang, Wenkang Jiang, Yihua Du, Lei Lin
- Abstract summary: Personality detection based on user-generated texts is a universal method that can be used to build user portraits.
We propose an attention-based information extraction mechanism (AIEM) for long texts, which is applied to quickly locate valuable pieces of information.
We obtain an average accuracy improvement of 10.2% on the gold standard Twitter-Myers-Briggs Type Indicator dataset.
- Score: 1.4887196224762684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In social media networks, users continually produce large amounts
of text, giving researchers a valuable source of
personality-related information. Personality detection based on user-generated
texts is a universal method that can be used to build user portraits. The
presence of noise in social media texts hinders personality detection. However,
previous studies have not fully addressed this challenge. Inspired by the
scanning reading technique, we propose an attention-based information
extraction mechanism (AIEM) for long texts, which is applied to quickly locate
valuable pieces of information, and focus more attention on the deep semantics
of key pieces. Then, we provide a novel attention-based denoising framework
(ADF) for personality detection tasks and achieve state-of-the-art performance
on two commonly used datasets. Notably, we obtain an average accuracy
improvement of 10.2% on the gold standard Twitter-Myers-Briggs Type Indicator
(Twitter-MBTI) dataset. We made our code publicly available on GitHub. We shed
light on how AIEM works to magnify personality-related signals.
Related papers
- Learning Robust Named Entity Recognizers From Noisy Data With Retrieval Augmentation [67.89838237013078]
Named entity recognition (NER) models often struggle with noisy inputs.
We propose a more realistic setting in which only noisy text and its NER labels are available.
We employ a multi-view training framework that improves robust NER without retrieving text during inference.
arXiv Detail & Related papers (2024-07-26T07:30:41Z)
- Stellar: Systematic Evaluation of Human-Centric Personalized
Text-to-Image Methods [52.806258774051216]
We focus on text-to-image systems that input a single image of an individual and ground the generation process along with text describing the desired visual context.
We introduce a standardized dataset (Stellar) that contains personalized prompts coupled with images of individuals that is an order of magnitude larger than existing relevant datasets and where rich semantic ground-truth annotations are readily available.
We derive a simple yet efficient, personalized text-to-image baseline that does not require test-time fine-tuning for each subject and which sets quantitatively and in human trials a new SoTA.
arXiv Detail & Related papers (2023-12-11T04:47:39Z)
- Personality Detection and Analysis using Twitter Data [7.584657555037871]
We release the largest automatically curated dataset for the research community.
This dataset has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task.
We show how our intriguing analysis results often follow natural intuition.
arXiv Detail & Related papers (2023-09-11T14:39:04Z)
- Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z)
- Utilizing Social Media Attributes for Enhanced Keyword Detection: An
IDF-LDA Model Applied to Sina Weibo [0.0]
We propose a novel method to address the keyword detection problem in social media.
Our model combines the Inverse Document Frequency (IDF) and Latent Dirichlet Allocation (LDA) models to better cope with the distinct attributes of social media data.
arXiv Detail & Related papers (2023-05-30T08:35:39Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- Depression detection in social media posts using affective and social
norm features [84.12658971655253]
We propose a deep architecture for depression detection from social media posts.
We incorporate profanity and morality features of posts and words in our architecture using a late fusion scheme.
The inclusion of the proposed features yields state-of-the-art results in both settings.
arXiv Detail & Related papers (2023-03-24T21:26:27Z)
- It's Just a Matter of Time: Detecting Depression with Time-Enriched
Multimodal Transformers [24.776445591293186]
We propose a flexible time-enriched multimodal transformer architecture for detecting depression from social media posts.
Our model operates directly at the user-level, and we enrich it with the relative time between posts by using time2vec positional embeddings.
We show that our method, using EmoBERTa and CLIP embeddings, surpasses other methods on two multimodal datasets.
arXiv Detail & Related papers (2023-01-13T09:40:19Z)
- APES: Audiovisual Person Search in Untrimmed Video [87.4124877066541]
We present the Audiovisual Person Search dataset (APES).
APES contains over 1.9K identities labeled along 36 hours of video.
A key property of APES is that it includes dense temporal annotations that link faces to speech segments of the same identity.
arXiv Detail & Related papers (2021-06-03T08:16:42Z)
- Personality Trait Detection Using Bagged SVM over BERT Word Embedding
Ensembles [10.425280599592865]
We present a novel deep learning-based approach for automated personality detection from text.
We leverage state-of-the-art advances in natural language understanding, namely the BERT language model, to extract contextualized word embeddings.
Our model outperforms the previous state of the art by 1.04% and, at the same time, is significantly more computationally efficient to train.
arXiv Detail & Related papers (2020-10-03T09:25:51Z)