Detecting value-expressive text posts in Russian social media
- URL: http://arxiv.org/abs/2312.08968v1
- Date: Thu, 14 Dec 2023 14:18:27 GMT
- Authors: Maria Milkova, Maksim Rudnev, Lidia Okolskaya
- Abstract summary: We aimed to find a model that can accurately detect value-expressive posts on the Russian social network VKontakte.
A training dataset of 5,035 posts was annotated by three experts, 304 crowd-workers and ChatGPT.
ChatGPT was more consistent but struggled with spam detection.
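The annotation setup above compares labels from experts, crowd-workers, and ChatGPT, and reports their consistency only qualitatively. As an illustration of how such inter-annotator agreement is commonly quantified (Cohen's kappa is a standard choice; the paper does not state here which statistic it used, and the labels below are toy data), a minimal pure-Python sketch:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Expected agreement if both annotators labeled independently at random
    # according to their own label frequencies.
    expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy labels: 1 = value-expressive, 0 = not
expert = [1, 0, 1, 1, 0, 0, 1, 0]
crowd  = [1, 0, 0, 1, 0, 1, 1, 0]
print(round(cohens_kappa(expert, crowd), 2))  # → 0.5
```

Values around 0.4–0.6 are conventionally read as "moderate" agreement, which matches the abstract's characterization of the expert/crowd-worker comparison.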
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Basic values are concepts or beliefs which pertain to desirable end-states
and transcend specific situations. Studying personal values in social media can
illuminate how and why societal values evolve, especially when stimulus-based
methods, such as surveys, are inefficient, for instance in hard-to-reach
populations. On the other hand, user-generated content is driven by the massive
use of stereotyped, culturally defined speech constructions rather than
authentic expressions of personal values. We aimed to find a model that can
accurately detect value-expressive posts on the Russian social network VKontakte. A
training dataset of 5,035 posts was annotated by three experts, 304
crowd-workers and ChatGPT. Crowd-workers and experts showed only moderate
agreement in categorizing posts. ChatGPT was more consistent but struggled with
spam detection. We applied an ensemble of human- and AI-assisted annotation
involving an active learning approach, subsequently trained several language
models, selected a model based on embeddings from a fine-tuned, pre-trained
rubert-tiny2, and reached high-quality value detection with F1 = 0.75 (F1-macro = 0.80).
This model provides a crucial step toward studying values within and between
Russian social media users.
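The abstract reports F1 = 0.75 and F1-macro = 0.80 for the selected classifier. As a sketch of how these metrics are computed (this is the standard definition, not the authors' evaluation code; the labels below are toy data), a minimal pure-Python macro-averaged F1:

```python
def f1_per_class(y_true, y_pred, cls):
    """F1 for one class, treating `cls` as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # also avoids division by zero when the class is never predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, c) for c in classes) / len(classes)

# Toy labels: 1 = value-expressive, 0 = not
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.667
```

Macro-averaging weights each class equally, which is why the paper can report it alongside the plain (positive-class) F1: when classes are imbalanced, the two can diverge noticeably.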
Related papers
- A Unique Training Strategy to Enhance Language Models Capabilities for
Health Mention Detection from Social Media Content [6.053876125887214]
The extraction of health-related content from social media is useful for the development of diverse types of applications, yet models often fall short on this task.
The primary reason for this shortfall lies in the non-standardized writing style commonly employed by social media users.
The key goal is achieved through the incorporation of random weighted perturbation and contrastive learning strategies.
A meta predictor is proposed that reaps the benefits of 5 different language models for discriminating posts of social media text into non-health and health-related classes.
arXiv Detail & Related papers (2023-10-29T16:08:33Z) - Decoding the Silent Majority: Inducing Belief Augmented Social Graph
with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existent social network, along with graph-based propagation to capture social dynamics.
Our method surpasses existing state-of-the-art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - What does ChatGPT return about human values? Exploring value bias in
ChatGPT using a descriptive value theory [0.0]
We test possible value biases in ChatGPT using a psychological value theory.
We found little evidence of explicit value bias.
We see some merging of socially oriented values, which may suggest that these values are less clearly differentiated at a linguistic level.
arXiv Detail & Related papers (2023-04-07T12:20:13Z) - NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations
On-the-Fly [61.77957329364812]
We introduce a framework for addressing the novel task of conversation-grounded multi-lingual, multi-cultural norm discovery.
NormSAGE elicits knowledge about norms through directed questions representing the norm discovery task and conversation context.
It further addresses the risk of language model hallucination with a self-verification mechanism ensuring that the norms discovered are correct.
arXiv Detail & Related papers (2022-10-16T18:30:05Z) - Enabling Classifiers to Make Judgements Explicitly Aligned with Human
Values [73.82043713141142]
Many NLP classification tasks, such as sexism/racism detection or toxicity detection, are based on human values.
We introduce a framework for value-aligned classification that performs prediction based on explicitly written human values in the command.
arXiv Detail & Related papers (2022-10-14T09:10:49Z) - Personal Attribute Prediction from Conversations [9.208339833472051]
We aim to predict the personal attribute value for the user, which is helpful for the enrichment of personal knowledge bases (PKBs).
We propose a framework based on the pre-trained language model with a noise-robust loss function to predict personal attributes from conversations without requiring any labeled utterances.
Our framework obtains the best performance compared with all the twelve baselines in terms of nDCG and MRR.
arXiv Detail & Related papers (2022-08-29T15:21:53Z) - ValueNet: A New Dataset for Human Value Driven Dialogue System [103.2044265617704]
We present a new large-scale human value dataset called ValueNet, which contains human attitudes on 21,374 text scenarios.
Comprehensive empirical results show that the learned value model could benefit a wide range of dialogue tasks.
ValueNet is the first large-scale text dataset for human value modeling.
arXiv Detail & Related papers (2021-12-12T23:02:52Z) - On Predicting Personal Values of Social Media Users using
Community-Specific Language Features and Personal Value Correlation [14.12186042953335]
This work focuses on analyzing Singapore users' personal values and developing effective models to predict their personal values using their Facebook data.
We incorporate the correlations among personal values into our proposed Stack Model consisting of a task-specific layer of base models and a cross-stitch layer model.
arXiv Detail & Related papers (2020-07-16T04:36:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.