Cultural Compass: Predicting Transfer Learning Success in Offensive
Language Detection with Cultural Features
- URL: http://arxiv.org/abs/2310.06458v1
- Date: Tue, 10 Oct 2023 09:29:38 GMT
- Authors: Li Zhou, Antonia Karamolegkou, Wenyu Chen, Daniel Hershcovich
- Abstract summary: Our study delves into the intersection of cultural features and transfer learning effectiveness.
Based on these results, we advocate for the integration of cultural information into datasets.
Our research signifies a step forward in the quest for more inclusive, culturally sensitive language technologies.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing ubiquity of language technology necessitates a shift towards
considering cultural diversity in the machine learning realm, particularly for
subjective tasks that rely heavily on cultural nuances, such as Offensive
Language Detection (OLD). Current understanding underscores that these tasks
are substantially influenced by cultural values; however, a notable gap exists
in determining if cultural features can accurately predict the success of
cross-cultural transfer learning for such subjective tasks. Addressing this,
our study delves into the intersection of cultural features and transfer
learning effectiveness. The findings reveal that cultural value surveys indeed
possess predictive power for cross-cultural transfer learning success in OLD
tasks and that this predictive power can be further improved using offensive word distance. Based
on these results, we advocate for the integration of cultural information into
datasets. Additionally, we recommend leveraging data sources rich in cultural
information, such as surveys, to enhance cultural adaptability. Our research
signifies a step forward in the quest for more inclusive, culturally sensitive
language technologies.
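The abstract's central idea, that cultural proximity between source and target data can indicate how well transfer learning will work, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify their predictor, so the sketch simply assumes that each culture is represented by a vector of survey-style dimension scores and that a smaller Euclidean distance suggests a more promising source dataset. The culture labels, the toy scores, and the helper names `cultural_distance` and `rank_candidates` are all hypothetical.

```python
import math

# Hypothetical survey-style cultural dimension scores per culture.
# Toy values for illustration only; they are NOT real survey data.
CULTURE_VECTORS = {
    "A": [0.2, 0.8, 0.5],
    "B": [0.3, 0.7, 0.4],
    "C": [0.9, 0.1, 0.6],
}


def cultural_distance(src, tgt, vectors=CULTURE_VECTORS):
    """Euclidean distance between two cultures' dimension vectors."""
    return math.dist(vectors[src], vectors[tgt])


def rank_candidates(target, candidates, vectors=CULTURE_VECTORS):
    """Order candidate source cultures from culturally closest to farthest.

    Assumption: a smaller distance is treated as a proxy for better expected
    transfer; whether that holds for a given task is an empirical question.
    """
    return sorted(candidates, key=lambda c: cultural_distance(c, target, vectors))


if __name__ == "__main__":
    # Rank two hypothetical source cultures for target culture "A".
    print(rank_candidates("A", ["C", "B"]))
```

In practice such a ranking would need to be validated against measured transfer performance, and further signals (for example, the offensive word distance mentioned in the abstract) could be added as extra features.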
Related papers
- Translating Across Cultures: LLMs for Intralingual Cultural Adaptation [12.5954253354303]
We define the task of cultural adaptation and create an evaluation framework to benchmark different models for this task.
We analyze possible issues with automatic adaptation including cultural biases and stereotypes.
arXiv Detail & Related papers (2024-06-20T17:06:58Z)
- Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z)
- What You Use is What You Get: Unforced Errors in Studying Cultural Aspects in Agile Software Development [2.9418191027447906]
Investigating the influence of cultural characteristics is challenging due to the multi-faceted concept of culture.
Cultural and social aspects are of high importance for their successful use in practice.
arXiv Detail & Related papers (2024-04-25T20:08:37Z)
- CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies [53.2331634010413]
CultureBank is a knowledge base built upon users' self-narratives.
It contains 12K cultural descriptors sourced from TikTok and 11K from Reddit.
We offer recommendations for future culturally aware language technologies.
arXiv Detail & Related papers (2024-04-23T17:16:08Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting [68.37589899302161]
We uncover culture perceptions of three SOTA models on 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generations contain linguistic "markers" that distinguish marginalized cultures from default cultures.
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z)
- Benchmarking LLM-based Machine Translation on Cultural Awareness [53.83912076814508]
Translating cultural-specific content is crucial for effective cross-cultural communication.
Recent advancements in in-context learning utilize lightweight prompts to guide large language models (LLMs) in machine translation tasks.
We introduce a new data curation pipeline to construct a culturally relevant parallel corpus.
arXiv Detail & Related papers (2023-05-23T17:56:33Z)
- Cross-Cultural Transfer Learning for Chinese Offensive Language Detection [9.341003339029221]
We aim to investigate the impact of transfer learning using offensive language detection data from different cultural backgrounds.
We find that culture-specific biases in what is considered offensive negatively impact the transferability of language models.
In a few-shot learning scenario, however, our study shows promising prospects for non-English offensive language detection with limited resources.
arXiv Detail & Related papers (2023-03-31T09:50:07Z)