Related papers: Cross-Cultural Validation of Partner Models for Voice User Interfaces

URL: http://arxiv.org/abs/2405.09002v1
Date: Wed, 15 May 2024 00:00:36 GMT
Title: Cross-Cultural Validation of Partner Models for Voice User Interfaces
Authors: Katie Seaborn, Iona Gessinger, Suzuka Yoshida, Benjamin R. Cowan, Philip R. Doyle
Abstract summary: We translate, localize, and evaluate the Partner Modelling Questionnaire (PMQ) for non-English speaking Western (German) and East Asian cohorts. We find that the scale produces equivalent levels of goodness of fit for both our German and Japanese translations, confirming its cross-cultural validity. We discuss how our translations can open up critical research on cultural similarities and differences in partner model use and design.
Score: 30.810951137239716
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Recent research has begun to assess people's perceptions of voice user interfaces (VUIs) as dialogue partners, termed partner models. Current self-report measures are only available in English, limiting research to English-speaking users. To improve the diversity of user samples and contexts that inform partner modelling research, we translated, localized, and evaluated the Partner Modelling Questionnaire (PMQ) for non-English speaking Western (German, n=185) and East Asian (Japanese, n=198) cohorts where VUI use is popular. Through confirmatory factor analysis (CFA), we find that the scale produces equivalent levels of goodness of fit for both our German and Japanese translations, confirming its cross-cultural validity. Still, the structure of the communicative flexibility factor did not replicate directly across Western and East Asian cohorts. We discuss how our translations can open up critical research on cultural similarities and differences in partner model use and design, whilst highlighting the challenges for ensuring accurate translation across cultural contexts.

Related papers

Disentangling Language and Culture for Evaluating Multilingual Large Language Models [48.06219053598005]
This paper introduces a Dual Evaluation Framework to comprehensively assess the multilingual capabilities of LLMs. By decomposing the evaluation along the dimensions of linguistic medium and cultural context, this framework enables a nuanced analysis of LLMs' ability to process questions cross-lingually.
arXiv Detail & Related papers (2025-05-30T14:25:45Z)
DaKultur: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers [17.355452637877402]
We conduct the first cultural evaluation study for the mid-resource language of Danish, in which native speakers prompt different models to solve tasks requiring cultural awareness. Our analysis of the resulting 1,038 interactions from 63 demographically diverse participants highlights open challenges to cultural adaptation.
arXiv Detail & Related papers (2025-04-03T08:52:42Z)
Filipino Benchmarks for Measuring Sexist and Homophobic Bias in Multilingual Language Models from Southeast Asia [0.3376269351435396]
We introduce benchmarks that assess both sexist and anti-queer biases in pretrained language models handling texts in Filipino. The benchmarks consist of 7,074 new challenge pairs resulting from our cultural adaptation of English bias evaluation datasets. We find that for multilingual models, the extent of bias learned for a particular language is influenced by how much pretraining data in that language a model was exposed to.
arXiv Detail & Related papers (2024-12-10T08:31:52Z)
KULTURE Bench: A Benchmark for Assessing Language Model in Korean Cultural Context [5.693660906643207]
We introduce KULTURE Bench, an evaluation framework specifically designed for Korean culture. It is designed to assess language models' cultural comprehension and reasoning capabilities at the word, sentence, and paragraph levels. The results show that there is still significant room for improvement in the models' understanding of texts related to the deeper aspects of Korean culture.
arXiv Detail & Related papers (2024-12-10T07:20:51Z)
Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas [4.0937229334408185]
We employ GPT-3.5 to reproduce reactions to persuasive news articles from 7,286 participants from 15 countries. Our analysis shows that specifying a person's country of residence improves GPT-3.5's alignment with their responses. In contrast, using native language prompting introduces shifts that significantly reduce overall alignment.
arXiv Detail & Related papers (2024-08-13T14:32:43Z)
Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance [6.907734681124986]
This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada.
arXiv Detail & Related papers (2024-06-17T01:54:27Z)
The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance. Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes. We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
Zero-shot Cross-lingual Stance Detection via Adversarial Language Adaptation [7.242609314791262]
This paper introduces a novel approach to zero-shot cross-lingual stance detection, Multilingual Translation-Augmented BERT (MTAB). Our technique employs translation augmentation to improve zero-shot performance and pairs it with adversarial learning to further boost model efficacy. We demonstrate the effectiveness of our proposed approach, showcasing improved results in comparison to a strong baseline model as well as ablated versions of our model.
arXiv Detail & Related papers (2024-04-22T16:56:43Z)
Investigating Cultural Alignment of Large Language Models [10.738300803676655]
We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures. We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references. We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z)
Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks. Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena. For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data. We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information. With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
Modeling Bilingual Conversational Characteristics for Neural Chat Translation [24.94474722693084]
We aim to promote the translation quality of conversational text by modeling the above properties. We evaluate our approach on the benchmark dataset BConTrasT (English-German) and a self-collected bilingual dialogue corpus, named BMELD (English-Chinese). Our approach notably boosts the performance over strong baselines by a large margin and significantly surpasses some state-of-the-art context-aware NMT models in terms of BLEU and TER.
arXiv Detail & Related papers (2021-07-23T12:23:34Z)
Unsupervised Cross-lingual Representation Learning for Speech Recognition [63.85924123692923]
XLSR learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages. We build on wav2vec 2.0 which is trained by solving a contrastive task over masked latent speech representations. Experiments show that cross-lingual pretraining significantly outperforms monolingual pretraining.
arXiv Detail & Related papers (2020-06-24T18:25:05Z)
XPersona: Evaluating Multilingual Personalized Chatbot [76.00426517401894]
We propose a multi-lingual extension of Persona-Chat, namely XPersona. Our dataset includes persona conversations in six different languages other than English for building and evaluating multilingual personalized agents.
arXiv Detail & Related papers (2020-03-17T07:52:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.