No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
- URL: http://arxiv.org/abs/2405.13777v3
- Date: Wed, 23 Oct 2024 21:25:39 GMT
- Title: No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
- Authors: Angéline Pouget, Lucas Beyer, Emanuele Bugliarello, Xiao Wang, Andreas Peter Steiner, Xiaohua Zhai, Ibrahim Alabdulmohsin
- Abstract summary: We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs).
Our work underscores the value of using diverse data to create more inclusive multimodal systems.
- Score: 38.932610459192105
- Abstract: We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs). Using a broad range of benchmark datasets and evaluation metrics, we bring to attention several important findings. First, the common filtering of training data to English image-text pairs disadvantages communities of lower socioeconomic status and negatively impacts cultural understanding. Notably, this performance gap is not captured by - and even at odds with - the currently popular evaluation metrics derived from the Western-centric ImageNet and COCO datasets. Second, pretraining with global, unfiltered data before fine-tuning on English content can improve cultural understanding without sacrificing performance on said popular benchmarks. Third, we introduce the task of geo-localization as a novel evaluation metric to assess cultural diversity in VLMs. Our work underscores the value of using diverse data to create more inclusive multimodal systems and lays the groundwork for developing VLMs that better represent global perspectives.
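To make the proposed geo-localization metric concrete, here is a minimal sketch of how a contrastive VLM could be probed for country prediction via zero-shot classification, assuming the open_clip library. The checkpoint name, image path, and country list are illustrative placeholders, not the paper's exact evaluation setup.

```python
import torch
import open_clip
from PIL import Image

# Load a CLIP-style model; this checkpoint choice is an assumption for illustration.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Toy label set; a real geo-localization benchmark would cover many more regions.
countries = ["Japan", "Nigeria", "Brazil", "India", "Germany", "United States"]
text = tokenizer([f"a photo taken in {c}" for c in countries])

# "street_scene.jpg" is a placeholder path, not an asset from the paper.
image = preprocess(Image.open("street_scene.jpg")).unsqueeze(0)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings and score each country prompt against the image.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("predicted country:", countries[probs.argmax().item()])
```

Accuracy of such predictions over a geographically diverse image set would then give a rough signal of the cultural coverage the abstract describes.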
Related papers
- Self-Alignment: Improving Alignment of Cultural Values in LLMs via In-Context Learning [13.034603322224548]
We present a simple and inexpensive method that uses a combination of in-context learning (ICL) and human survey data.
We show that our method could prove useful in test languages other than English and can improve alignment to the cultural values that correspond to a range of culturally diverse countries.
arXiv Detail & Related papers (2024-08-29T12:18:04Z)
- Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas [4.0937229334408185]
We employ GPT-3.5 to reproduce reactions to persuasive news articles from 7,286 participants from 15 countries.
Our analysis shows that specifying a person's country of residence improves GPT-3.5's alignment with their responses.
In contrast, using native language prompting introduces shifts that significantly reduce overall alignment.
arXiv Detail & Related papers (2024-08-13T14:32:43Z)
- From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models [10.121734731147376]
Vision-language models' performance remains suboptimal on images from non-Western cultures.
Various benchmarks have been proposed to test models' cultural inclusivity, but they have limited coverage of cultures.
We introduce the GlobalRG benchmark, comprising two challenging tasks: retrieval across universals and cultural visual grounding.
arXiv Detail & Related papers (2024-06-28T23:28:28Z)
- Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration [31.684544472009918]
We propose a semi-automated pipeline for constructing cultural VLM benchmarks.
VLMs generate questions based on guidelines, human-annotated examples, and image-wise relevant knowledge.
This pipeline is demonstrated through a specific application: creating a dataset tailored to Korean culture, dubbed K-Viscuit.
arXiv Detail & Related papers (2024-06-24T09:18:15Z)
- See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding [78.88461026069862]
Vision-language models (VLMs) can respond to queries about images in many languages.
We present a novel investigation that demonstrates and localizes Western bias in image understanding.
arXiv Detail & Related papers (2024-06-17T15:49:51Z)
- Extrinsic Evaluation of Cultural Competence in Large Language Models [53.626808086522985]
We focus on extrinsic evaluation of cultural competence in two text generation tasks.
We evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts.
We find weak correlations between text similarity of outputs for different countries and the cultural values of these countries.
arXiv Detail & Related papers (2024-06-17T14:03:27Z)
- Multilingual Diversity Improves Vision-Language Representations [66.41030381363244]
Pre-training on a dataset that includes multilingual image-text pairs outperforms using English-only or English-dominated datasets on ImageNet.
On a geographically diverse task like GeoDE, we also observe improvements across all regions, with the biggest gain coming from Africa.
arXiv Detail & Related papers (2024-05-27T08:08:51Z)
- CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
"CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs)
We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
- D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation [5.9053106775634685]
We introduce D3CODE: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences, annotated by a pool of over 4K annotators.
The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity.
Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values.
arXiv Detail & Related papers (2024-04-16T19:12:03Z)
- CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge [69.82940934994333]
We introduce CulturalTeaming, an interactive red-teaming system that leverages human-AI collaboration to build challenging evaluation datasets.
Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating cultural questions.
CULTURALBENCH-V0.1 is a compact yet high-quality evaluation dataset compiled from users' red-teaming attempts.
arXiv Detail & Related papers (2024-04-10T00:25:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.