Involvement drives complexity of language in online debates
- URL: http://arxiv.org/abs/2506.22098v1
- Date: Fri, 27 Jun 2025 10:27:54 GMT
- Title: Involvement drives complexity of language in online debates
- Authors: Eleonora Amadori, Daniele Cirulli, Edoardo Di Martino, Jacopo Nudo, Maria Sahakyan, Emanuele Sangiorgio, Arnaldo Santoro, Simon Zollo, Alessandro Galeazzi, Niccolò Di Marco,
- Abstract summary: We examine the linguistic complexity of content produced by influential users on Twitter across three globally significant and contested topics: COVID-19, COP26, and the Russia-Ukraine war. Our analysis reveals significant differences between individuals and organizations, between profiles with sided versus moderate political views, and between those associated with higher versus lower reliability scores. Our findings offer new insights into the sociolinguistic dynamics of digital platforms and contribute to a deeper understanding of how language reflects ideological and social structures in online spaces.
- Score: 32.73124984242397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language is a fundamental aspect of human societies, continuously evolving in response to various stimuli, including societal changes and intercultural interactions. Technological advancements have profoundly transformed communication, with social media emerging as a pivotal force that merges entertainment-driven content with complex social dynamics. As these platforms reshape public discourse, analyzing the linguistic features of user-generated content is essential to understanding their broader societal impact. In this paper, we examine the linguistic complexity of content produced by influential users on Twitter across three globally significant and contested topics: COVID-19, COP26, and the Russia-Ukraine war. By combining multiple measures of textual complexity, we assess how language use varies along four key dimensions: account type, political leaning, content reliability, and sentiment. Our analysis reveals significant differences across all four axes, including variations in language complexity between individuals and organizations, between profiles with sided versus moderate political views, and between those associated with higher versus lower reliability scores. Additionally, profiles producing more negative and offensive content tend to use more complex language, with users sharing similar political stances and reliability levels converging toward a common jargon. Our findings offer new insights into the sociolinguistic dynamics of digital platforms and contribute to a deeper understanding of how language reflects ideological and social structures in online spaces.
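The abstract reports combining multiple measures of textual complexity without naming them. As a rough illustration of the kind of per-account analysis described, the following Python sketch computes a few generic complexity metrics (type-token ratio, mean word length, a crude Gunning fog estimate) over a list of tweets. The metric choices, tokenization, and function names are assumptions made here for illustration, not the authors' actual pipeline.

```python
# Illustrative sketch only: the paper combines "multiple measures of textual
# complexity" without listing them in this abstract. The metrics below
# (type-token ratio, mean word length, a rough Gunning fog index) are common
# generic choices, assumed here purely for illustration.
import re
from statistics import mean


def tokenize(text: str) -> list[str]:
    """Very rough lowercase word tokenizer; ignores URLs, mentions, emoji."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens: list[str]) -> float:
    """Lexical richness: distinct words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_word_length(tokens: list[str]) -> float:
    """Average characters per word."""
    return mean(len(t) for t in tokens) if tokens else 0.0


def gunning_fog(text: str) -> float:
    """Crude readability estimate: 0.4 * (words/sentence + 100 * complex/words).

    'Complex' words are approximated as words with three or more vowel groups,
    a rough stand-in for syllable counting.
    """
    sentences = max(1, len([s for s in re.split(r"[.!?]+", text) if s.strip()]))
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    complex_words = sum(1 for t in tokens if len(re.findall(r"[aeiouy]+", t)) >= 3)
    return 0.4 * (len(tokens) / sentences + 100 * complex_words / len(tokens))


def complexity_profile(tweets: list[str]) -> dict[str, float]:
    """Average per-tweet complexity metrics for one account's content."""
    per_tweet_tokens = [tokenize(t) for t in tweets]
    return {
        "type_token_ratio": mean(type_token_ratio(t) for t in per_tweet_tokens),
        "mean_word_length": mean(mean_word_length(t) for t in per_tweet_tokens),
        "gunning_fog": mean(gunning_fog(t) for t in tweets),
    }


if __name__ == "__main__":
    sample_tweets = [
        "Vaccination campaigns substantially reduced hospitalization rates.",
        "COP26 pledges remain insufficient without enforceable emission targets.",
    ]
    print(complexity_profile(sample_tweets))
```

In the study itself, per-account scores of this kind would then be compared along the four axes named in the abstract: account type, political leaning, content reliability, and sentiment.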
Related papers
- The Shrinking Landscape of Linguistic Diversity in the Age of Large Language Models [7.811355338367627]
We show that the widespread adoption of large language models (LLMs) as writing assistants is linked to notable declines in linguistic diversity. While the core content of texts is retained when LLMs polish and rewrite them, they not only homogenize writing styles but also alter stylistic elements in a way that selectively amplifies certain dominant characteristics or biases while suppressing others.
arXiv Detail & Related papers (2025-02-16T20:51:07Z)
- The Evolution of Language in Social Media Comments [37.69303106863453]
This study investigates the linguistic characteristics of user comments over 34 years, focusing on their complexity and temporal shifts.
We utilize a dataset of approximately 300 million English comments from eight diverse platforms and topics.
Our findings reveal consistent patterns of complexity across social media platforms and topics, characterized by a nearly universal reduction in text length, diminished lexical richness, but decreased repetitiveness.
arXiv Detail & Related papers (2024-06-17T12:03:30Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- GPT-4V(ision) as A Social Media Analysis Engine [77.23394183063238]
This paper explores GPT-4V's capabilities for social multimedia analysis.
We select five representative tasks, including sentiment analysis, hate speech detection, fake news identification, demographic inference, and political ideology detection.
GPT-4V demonstrates remarkable efficacy in these tasks, showcasing strengths such as joint understanding of image-text pairs, contextual and cultural awareness, and extensive commonsense knowledge.
arXiv Detail & Related papers (2023-11-13T18:36:50Z)
- AI Chat Assistants can Improve Conversations about Divisive Topics [3.8583005413310625]
We present results of a large-scale experiment that demonstrates how online conversations can be improved with artificial intelligence tools.
We employ a large language model to make real-time, evidence-based recommendations intended to improve participants' perception of feeling understood in conversations.
We find that these interventions improve the reported quality of the conversation, reduce political divisiveness, and improve the tone, without systematically changing the content of the conversation or moving people's policy attitudes.
arXiv Detail & Related papers (2023-02-14T06:42:09Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Seamlessly Integrating Factual Information and Social Content with Persuasive Dialogue [48.75221685739286]
We present a novel modular dialogue system framework that seamlessly integrates factual information and social content into persuasive dialogue.
Our framework is generalizable to any dialogue tasks that have mixed social and task contents.
arXiv Detail & Related papers (2022-03-15T05:38:34Z)
- Mental Disorders on Online Social Media Through the Lens of Language and Behaviour: Analysis and Visualisation [7.133136338850781]
We study the factors that characterise and differentiate social media users affected by mental disorders.
Our findings reveal significant differences on the use of function words, such as adverbs and verb tense, and topic-specific vocabulary.
We find evidence suggesting that language use on micro-blogging platforms is less distinguishable for users who have a mental disorder.
arXiv Detail & Related papers (2022-02-07T15:29:01Z)
- Characterizing English Variation across Social Media Communities with BERT [9.98785450861229]
We analyze two months of English comments in 474 Reddit communities.
The specificity of different sense clusters to a community, combined with the specificity of a community's unique word types, is used to identify cases where a social group's language deviates from the norm.
We find that communities with highly distinctive language are medium-sized, and their loyal and highly engaged users interact in dense networks.
arXiv Detail & Related papers (2021-02-12T23:50:57Z)
- Experience Grounds Language [185.73483760454454]
Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.
Despite the incredible effectiveness of language processing models at tackling tasks after being trained on text alone, successful linguistic communication relies on a shared experience of the world.
arXiv Detail & Related papers (2020-04-21T16:56:27Z)