Identifying Distributional Perspective Differences from Colingual Groups
- URL: http://arxiv.org/abs/2004.04938v2
- Date: Mon, 12 Apr 2021 19:11:33 GMT
- Title: Identifying Distributional Perspective Differences from Colingual Groups
- Authors: Yufei Tian, Tuhin Chakrabarty, Fred Morstatter and Nanyun Peng
- Abstract summary: A lack of mutual understanding among different groups about their perspectives on specific values or events may lead to uninformed decisions or biased opinions.
We study colingual groups and use language corpora as a proxy to identify their distributional perspectives.
We present a novel computational approach to learn shared understandings, and benchmark our method by building culturally-aware models for the English, Chinese, and Japanese languages.
- Score: 41.58939666949895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perspective differences exist among different cultures or languages. A lack
of mutual understanding among different groups about their perspectives on
specific values or events may lead to uninformed decisions or biased opinions.
Automatically understanding the group perspectives can provide essential
background for many downstream applications of natural language processing
techniques. In this paper, we study colingual groups and use language corpora
as a proxy to identify their distributional perspectives. We present a novel
computational approach to learn shared understandings, and benchmark our method
by building culturally-aware models for the English, Chinese, and Japanese
languages. On a held-out set of diverse topics, including marriage, corruption,
and democracy, our model achieves high correlation with human judgements regarding
intra-group values and inter-group differences.
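The abstract's core idea, using language corpora as a proxy for a group's distributional perspective, can be illustrated with a toy sketch. The snippet below is not the paper's actual method; it is a minimal, hypothetical illustration in which each group's stance toward a target word (here "marriage") is approximated by its co-occurrence profile, and two groups are compared by the cosine similarity of those profiles. The corpora, target word, and window size are all invented for the example.

```python
from collections import Counter
from math import sqrt

def cooccurrence_profile(corpus, target, window=2):
    """Count context words appearing within `window` tokens of `target`."""
    profile = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        profile[tokens[j]] += 1
    return profile

def cosine(p, q):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(p[k] * q.get(k, 0) for k in p)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

# Toy corpora standing in for two colingual groups with differing perspectives.
group_a = ["marriage is a joyful union", "a joyful marriage brings happiness"]
group_b = ["marriage is a formal duty", "a formal marriage brings obligation"]

profile_a = cooccurrence_profile(group_a, "marriage")
profile_b = cooccurrence_profile(group_b, "marriage")
print(round(cosine(profile_a, profile_b), 3))  # → 0.75
```

A low similarity between profiles would suggest the groups associate the target concept with different contexts; the paper's actual models operate at a far larger scale across English, Chinese, and Japanese corpora.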
Related papers
- Speech Analysis of Language Varieties in Italy [18.464078978885812]
We focus on automatically identifying the geographic region of origin of speech samples drawn from Italy's diverse language varieties.
We also seek to uncover new insights into the relationships among these diverse yet closely related varieties.
arXiv Detail & Related papers (2024-06-22T14:19:51Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
- Investigating Cultural Alignment of Large Language Models [10.738300803676655]
We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z)
- Comparing Styles across Languages [12.585216712212437]
We introduce an explanation framework to extract stylistic differences from multilingual LMs and compare styles across languages.
Our framework generates comprehensive style lexica in any language.
We apply this framework to compare politeness, creating the first holistic multilingual politeness dataset.
arXiv Detail & Related papers (2023-10-11T02:16:12Z)
- Comparing Biases and the Impact of Multilingual Training across Multiple Languages [70.84047257764405]
We present a bias analysis across Italian, Chinese, English, Hebrew, and Spanish on the downstream sentiment analysis task.
We adapt existing sentiment bias templates in English to Italian, Chinese, Hebrew, and Spanish for four attributes: race, religion, nationality, and gender.
Our results reveal similarities in bias expression, such as favoritism toward groups that are dominant in each language's culture.
arXiv Detail & Related papers (2023-05-18T18:15:07Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Assessing Multilingual Fairness in Pre-trained Multimodal Representations [8.730027941735804]
We argue that pre-trained vision-and-language representations are individually fair across languages, but group fairness is not guaranteed.
We conduct experiments to explore the prevalent group disparity across languages and protected groups including race, gender and age.
arXiv Detail & Related papers (2021-06-12T03:57:05Z)
- Deception detection in text and its relation to the cultural dimension of individualism/collectivism [6.17866386107486]
We investigate whether differences in the usage of specific linguistic features of deception across cultures can be confirmed and attributed to norms with respect to the individualism/collectivism divide.
We create culture/language-aware classifiers by experimenting with a wide range of n-gram features based on phonology, morphology and syntax.
We conducted our experiments on 11 datasets in five languages (English, Dutch, Russian, Spanish, and Romanian) from six countries (US, Belgium, India, Russia, Mexico, and Romania).
arXiv Detail & Related papers (2021-05-26T13:09:47Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.