American cultural regions mapped through the lexical analysis of social
media
- URL: http://arxiv.org/abs/2208.07649v2
- Date: Tue, 18 Apr 2023 15:37:24 GMT
- Title: American cultural regions mapped through the lexical analysis of social
media
- Authors: Thomas Louf, Bruno Gonçalves, Jose J. Ramasco, David Sanchez, Jack
Grieve
- Abstract summary: This work introduces a method to infer cultural regions based on the automatic analysis of large datasets from microblogging posts.
Specifically, regional variations in written discourse are measured in American social media.
Through a hierarchical clustering of the data in a lower-dimensional space of principal components, this method yields clear cultural areas and the topics of discussion that define them.
- Score: 1.8199326045904993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cultural areas represent a useful concept that cross-fertilizes diverse
fields in social sciences. Knowledge of how humans organize and relate their
ideas and behavior within a society helps to understand their actions and
attitudes towards different issues. However, the selection of common traits
that shape a cultural area is somewhat arbitrary. What is needed is a method
that can leverage the massive amounts of data coming online, especially through
social media, to identify cultural regions without ad-hoc assumptions, biases
or prejudices. This work takes a crucial step in this direction by introducing
a method to infer cultural regions based on the automatic analysis of large
datasets from microblogging posts. The approach presented here is based on the
principle that cultural affiliation can be inferred from the topics that people
discuss among themselves. Specifically, regional variations in written
discourse are measured in American social media. From the frequency
distributions of content words in geotagged Tweets, the regional hotspots of
words' usage are found, and from there, principal components of regional
variation are derived. Through a hierarchical clustering of the data in this
lower-dimensional space, this method yields clear cultural areas and the topics
of discussion that define them. It uncovers a manifest North-South separation,
which is primarily influenced by the African American culture, and further
contiguous (East-West) and non-contiguous divisions that provide a
comprehensive picture of today's cultural areas in the US.
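The pipeline described in the abstract (word frequencies per region, a regional usage score per word, principal components, then hierarchical clustering) can be sketched roughly as follows. This is a minimal illustration on made-up toy counts, not the authors' implementation. The abstract does not name the hotspot statistic, so a plain z-score of relative frequency across regions stands in for it here, and the clustering is a naive average-linkage routine. All function names, the toy data, and the choice of two components and two clusters are assumptions made for illustration.

```python
import numpy as np


def regional_word_scores(counts):
    """Z-score each word's relative frequency across regions.

    Assumption: a simple stand-in for a hotspot statistic, with no
    spatial smoothing. `counts` has shape (n_regions, n_words).
    """
    counts = np.asarray(counts, dtype=float)
    rel = counts / counts.sum(axis=1, keepdims=True)  # relative frequency per region
    std = rel.std(axis=0)
    std[std == 0] = 1.0  # words with no regional variation score 0
    return (rel - rel.mean(axis=0)) / std


def principal_components(scores, n_components):
    """Project regions onto the leading principal components via SVD."""
    centered = scores - scores.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def agglomerative_clusters(points, n_clusters):
    """Naive average-linkage agglomerative clustering; one label per row."""
    clusters = [[i] for i in range(len(points))]
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = np.empty(len(points), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels


# Hypothetical toy data: 6 regions (rows) x 4 content words (columns).
counts = np.array([
    [90, 80, 10, 20],
    [85, 90, 15, 10],
    [95, 85, 12, 8],
    [10, 15, 90, 85],
    [12, 8, 80, 95],
    [20, 10, 85, 90],
])

scores = regional_word_scores(counts)
components = principal_components(scores, n_components=2)
labels = agglomerative_clusters(components, n_clusters=2)
print(labels)
```

On real data the region-by-word matrix would be far larger, and a library routine such as `scipy.cluster.hierarchy.linkage` would be a more practical choice than the quadratic loop above.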
Related papers
- Building Knowledge-Guided Lexica to Model Cultural Variation [10.156458483180842]
Measuring regional cultural variation can illuminate how and why people think and behave differently.
We introduce a new research problem for the NLP community: How do we measure variation in cultural constructs across regions using language?
We provide a scalable solution: building knowledge-guided lexica to model cultural variation.
arXiv Detail & Related papers (2024-06-17T15:05:43Z)
- Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
The "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs).
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- What You Use is What You Get: Unforced Errors in Studying Cultural Aspects in Agile Software Development [2.9418191027447906]
Investigating the influence of cultural characteristics is challenging due to the multi-faceted concept of culture.
Cultural and social aspects are of high importance for their successful use in practice.
arXiv Detail & Related papers (2024-04-25T20:08:37Z)
- The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models [67.38144169029617]
We introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries.
PRISM contributes (i) wide geographic and demographic participation in human feedback data; (ii) two census-representative samples for understanding collective welfare (UK and US); and (iii) individualised feedback where every rating is linked to a detailed participant profile.
arXiv Detail & Related papers (2024-04-24T17:51:36Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting [73.94059188347582]
We uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures.
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Investigating Cultural Alignment of Large Language Models [10.738300803676655]
We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z)
- Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
- Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study [9.919972416590124]
ChatGPT has garnered widespread recognition for its exceptional ability to generate human-like responses in dialogue.
We investigate the underlying cultural background of ChatGPT by analyzing its responses to questions designed to quantify human cultural differences.
arXiv Detail & Related papers (2023-03-30T15:43:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.