Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts
- URL: http://arxiv.org/abs/2307.02640v1
- Date: Wed, 5 Jul 2023 20:16:20 GMT
- Title: Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts
- Authors: Alexandrea K. Ramnarine
- Abstract summary: The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict negative, positive, or neutral user sentiment towards plastic surgery.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The massive collection of user posts across social media platforms is
primarily untapped for artificial intelligence (AI) use cases, owing to the
sheer volume and velocity of textual data. Natural language processing (NLP) is
a subfield of AI that leverages bodies of documents, known as corpora, to train
computers in human-like language understanding. Using a word-ranking method,
term frequency-inverse document frequency (TF-IDF), to create features across
documents, it is possible to perform unsupervised analytics: machine learning
(ML) that groups the documents without a human manually labeling the data.
For large datasets with thousands of features, t-distributed stochastic
neighbor embedding (t-SNE), k-means clustering, and Latent Dirichlet allocation
(LDA) are employed to learn top words and generate topics for a combined
Reddit and Twitter corpus. Using extremely simple deep learning models, this
study demonstrates that the applied results of unsupervised analysis allow a
computer to predict negative, positive, or neutral user sentiment
towards plastic surgery based on a tweet or subreddit post with almost 90%
accuracy. Furthermore, the model is capable of achieving higher accuracy on the
unsupervised sentiment task than on a rudimentary supervised document
classification task. Therefore, unsupervised learning may be considered a
viable option for labeling social media documents for NLP tasks.
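The abstract describes a complete pipeline: TF-IDF features, k-means clustering and LDA to derive sentiment groupings without labels, then a simple neural model trained on those groupings. The sketch below is a minimal, illustrative reconstruction of that flow using scikit-learn; the toy corpus, cluster count, and every hyperparameter are placeholders, not the study's actual data or settings.

```python
# Minimal sketch of the pipeline described above, using scikit-learn.
# The toy corpus and all hyperparameters are illustrative placeholders.
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in for the combined Reddit/Twitter corpus of plastic surgery posts.
docs = [
    "so happy with my rhinoplasty results, worth every penny",
    "regret the procedure completely, recovery has been awful",
    "what is a typical consultation like for a first visit",
    "swelling went down and I love how it looks now",
    "the surgeon rushed me and the scarring is terrible",
    "how long before I can exercise after surgery",
] * 50  # pad the toy corpus so the splits below are non-trivial

# 1) TF-IDF turns each document into a sparse feature vector.
tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
X = tfidf.fit_transform(docs)

# 2) k-means groups documents with no manual labels; with k=3 the clusters
#    can be read as negative / neutral / positive pseudo-labels.
pseudo_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# 3) t-SNE projects the high-dimensional features to 2-D for visual checks.
coords = TSNE(n_components=2, random_state=0).fit_transform(X.toarray())

# 4) LDA over raw counts learns topics; top-weighted words summarize each one.
counts = CountVectorizer(stop_words="english", max_features=5000)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(counts.fit_transform(docs))
vocab = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(f"topic {k}:", [vocab[i] for i in topic.argsort()[::-1][:5]])

# 5) A small feed-forward network stands in for the paper's "extremely
#    simple" deep model, trained on the cluster-derived pseudo-labels.
X_tr, X_te, y_tr, y_te = train_test_split(X, pseudo_labels, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy on pseudo-labels:", clf.score(X_te, y_te))
```

The key design point is step 5: the network never sees human annotations; its targets come entirely from the unsupervised clustering, which is what makes the near-90% figure an unsupervised-labeling result rather than a conventional supervised one.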
Related papers
- Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering [8.20929362102942]
Author profiling is the task of inferring characteristics about individuals by analyzing content they share.
We propose a new method for author profiling that first distinguishes relevant from irrelevant content, and then performs the actual profiling on the relevant data only.
We evaluate our method for Big Five personality trait prediction on two Twitter corpora.
arXiv Detail & Related papers (2024-09-06T08:43:10Z)
- Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs).
We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence.
We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z)
- Curating corpora with classifiers: A case study of clean energy sentiment online [0.0]
Large-scale corpora of social media posts capture broad public opinion.
Surveys can be expensive to run and lag public opinion by days or weeks.
We propose a method for rapidly selecting the best corpus of relevant documents for analysis.
arXiv Detail & Related papers (2023-05-04T18:15:45Z)
- A Multi-Grained Self-Interpretable Symbolic-Neural Model For Single/Multi-Labeled Text Classification [29.075766631810595]
We propose a Symbolic-Neural model that can learn to explicitly predict class labels of text spans from a constituency tree.
As the structured language model learns to predict constituency trees in a self-supervised manner, only raw texts and sentence-level labels are required as training data.
Our experiments demonstrate that our approach could achieve good prediction accuracy in downstream tasks.
arXiv Detail & Related papers (2023-03-06T03:25:43Z)
- Deep Sequence Models for Text Classification Tasks [0.007329200485567826]
Natural Language Processing (NLP) equips machines to understand humans' diverse and complicated languages.
Common text classification applications include information retrieval, news topic modeling, theme extraction, sentiment analysis, and spam detection.
Sequence models such as RNNs, GRUs, and LSTMs are a breakthrough for tasks with long-range dependencies.
Results were excellent, with most models achieving accuracy in the range of 80% to 94%.
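As a generic illustration of the sequence models this entry names (not the cited paper's exact configuration), a minimal Keras LSTM text classifier could look like the sketch below; the vocabulary size, label count, and layer widths are assumed values.

```python
# Minimal Keras LSTM text classifier; all sizes below are assumptions
# for illustration, not the cited paper's configuration.
import tensorflow as tf

VOCAB_SIZE, NUM_CLASSES = 20_000, 3  # assumed vocabulary and label count

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),  # token ids -> dense vectors
    tf.keras.layers.LSTM(64),                   # carries long-range context
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(padded_token_ids, labels, epochs=5)  # hypothetical tokenized inputs
```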
arXiv Detail & Related papers (2022-07-18T18:47:18Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
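As a schematic of the underlying self-training idea (SFLM's actual recipe is prompt-based and more involved), the loop below trains on a small labeled set, pseudo-labels high-confidence unlabeled examples, and retrains; the classifier and threshold are illustrative stand-ins.

```python
# Schematic self-training loop; the classifier and confidence threshold
# are illustrative stand-ins, not SFLM's prompt-based procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=3):
    X, y = X_lab, y_lab
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    for _ in range(rounds):
        probs = clf.predict_proba(X_unlab)
        confident = probs.max(axis=1) >= threshold   # keep confident rows only
        if not confident.any():
            break
        X = np.vstack([X, X_unlab[confident]])       # add pseudo-labeled rows
        y = np.concatenate([y, probs[confident].argmax(axis=1)])
        X_unlab = X_unlab[~confident]
        clf = LogisticRegression(max_iter=1000).fit(X, y)
    return clf
```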
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks.
This study presents an assessment of existing language models' ability to distinguish the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
- DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations [4.36561468436181]
We present DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations.
Our approach closes the performance gap between unsupervised and supervised pretraining for universal sentence encoders.
Our code and pretrained models are publicly available and can be easily adapted to new domains or used to embed unseen text.
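Since the entry says the pretrained models are public, a plausible way to embed unseen text is mean pooling over a Hugging Face checkpoint; the model identifier below is an assumption, so consult the authors' repository for the actual released names.

```python
# Sketch: sentence embeddings from a pretrained DeCLUTR checkpoint via
# Hugging Face Transformers. The model id is an assumption.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "johngiorgi/declutr-small"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

texts = ["A document to embed.", "Another unseen sentence."]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state      # (batch, seq, dim)
mask = batch["attention_mask"].unsqueeze(-1)       # zero out padding tokens
embeddings = (hidden * mask).sum(1) / mask.sum(1)  # mean pooling
print(embeddings.shape)
```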
arXiv Detail & Related papers (2020-06-05T20:00:28Z)
- ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension [53.037401638264235]
We present an evaluation server, ORB, that reports performance on seven diverse reading comprehension datasets.
The evaluation server places no restrictions on how models are trained, so it is a suitable test bed for exploring training paradigms and representation learning.
arXiv Detail & Related papers (2019-12-29T07:27:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.