Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models
- URL: http://arxiv.org/abs/2407.02067v1
- Date: Tue, 2 Jul 2024 08:55:41 GMT
- Title: Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models
- Authors: Anjishnu Mukherjee, Ziwei Zhu, Antonios Anastasopoulos
- Abstract summary: We first introduce Dalle Street, a large-scale dataset containing 9,935 images of 67 countries and 10 concept classes.
Next, we assess models' deeper cultural understanding through an artifact extraction task and identify over 18,000 artifacts associated with different countries.
Finally, we propose a highly composable pipeline, CultureAdapt, to adapt images from one culture to another.
- Score: 22.92083941222383
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we present a comprehensive three-phase study to examine (1) the effectiveness of large multimodal models (LMMs) in recognizing cultural contexts; (2) the accuracy of their representations of diverse cultures; and (3) their ability to adapt content across cultural boundaries. We first introduce Dalle Street, a large-scale dataset generated by DALL-E 3 and validated by humans, containing 9,935 images of 67 countries and 10 concept classes. We reveal disparities in cultural understanding at the sub-region level with both open-weight (LLaVA) and closed-source (GPT-4V) models on Dalle Street and other existing benchmarks. Next, we assess models' deeper cultural understanding through an artifact extraction task and identify over 18,000 artifacts associated with different countries. Finally, we propose a highly composable pipeline, CultureAdapt, to adapt images from one culture to another. Our findings reveal a nuanced picture of the cultural competence of LMMs, highlighting the need to develop culture-aware systems. Dataset and code are available at https://github.com/iamshnoo/crossroads
Related papers
- Beyond Aesthetics: Cultural Competence in Text-to-Image Models [34.98692829036475]
CUBE is a first-of-its-kind benchmark for evaluating the cultural competence of text-to-image models.
CUBE covers cultural artifacts associated with 8 countries across different geo-cultural regions.
CUBE-CSpace is a larger dataset of cultural artifacts that serves as grounding to evaluate cultural diversity.
arXiv Detail & Related papers (2024-07-09T13:50:43Z)
- Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models [63.452948673344395]
This paper introduces CulturePark, an LLM-powered multi-agent communication framework for cultural data collection.
It generates high-quality cross-cultural dialogues encapsulating human beliefs, norms, and customs.
We evaluate these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
arXiv Detail & Related papers (2024-05-24T01:49:02Z)
- CRAFT: Extracting and Tuning Cultural Instructions from the Wild [38.255242754975654]
This paper introduces a novel pipeline for extracting high-quality, culturally-related instruction tuning datasets from vast unstructured corpora.
We utilize a self-instruction generation pipeline to identify cultural concepts and trigger instruction generation.
We conduct experiments across three regions: Singapore, the Philippines, and the United States, achieving performance improvements of up to 6%.
arXiv Detail & Related papers (2024-05-06T03:21:55Z)
- CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting [68.37589899302161]
We uncover the culture perceptions of three SOTA models across 110 countries and regions on 8 culture-related topics through culture-conditioned generations.
We discover that culture-conditioned generations consist of linguistic "markers" that distinguish marginalized cultures from default cultures.
arXiv Detail & Related papers (2024-04-16T00:50:43Z)
- Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking [48.21982147529661]
This paper introduces a novel approach for massively multicultural knowledge acquisition.
Our method strategically navigates from densely informative Wikipedia documents on cultural topics to an extensive network of linked pages.
Our work marks an important step towards deeper understanding and bridging the gaps of cultural disparities in AI.
arXiv Detail & Related papers (2024-02-14T18:16:54Z)
- Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models [89.94270049334479]
This paper identifies a cultural dominance issue within large language models (LLMs).
LLMs often provide inappropriate English-culture-related answers that are not relevant to the expected culture when users ask in non-English languages.
arXiv Detail & Related papers (2023-10-19T05:38:23Z)
- On the Cultural Gap in Text-to-Image Generation [75.69755281031951]
One challenge in text-to-image (T2I) generation is the inadvertent reflection of culture gaps present in the training data.
There is no benchmark to systematically evaluate a T2I model's ability to generate cross-cultural images.
We propose a Challenging Cross-Cultural (C3) benchmark with comprehensive evaluation criteria, which can assess how well-suited a model is to a target culture.
arXiv Detail & Related papers (2023-07-06T13:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.