Multilingual Dyadic Interaction Corpus NoXi+J: Toward Understanding Asian-European Non-verbal Cultural Characteristics and their Influences on Engagement
- URL: http://arxiv.org/abs/2409.13726v1
- Date: Mon, 9 Sep 2024 18:37:34 GMT
- Title: Multilingual Dyadic Interaction Corpus NoXi+J: Toward Understanding Asian-European Non-verbal Cultural Characteristics and their Influences on Engagement
- Authors: Marius Funk, Shogo Okada, Elisabeth André,
- Abstract summary: We conduct a multilingual computational analysis of non-verbal features and investigate their role in engagement prediction.
We extracted multimodal non-verbal features, including speech acoustics, facial expressions, backchanneling and gestures.
We analyzed the influence of cultural differences in the input features of LSTM models trained to predict engagement for five language datasets.
- Score: 6.984291346424792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-verbal behavior is a central challenge in understanding the dynamics of a conversation and the affective states between interlocutors arising from the interaction. Although psychological research has demonstrated that non-verbal behaviors vary across cultures, limited computational analysis has been conducted to clarify these differences and assess their impact on engagement recognition. To gain a greater understanding of engagement and non-verbal behaviors among a wide range of cultures and language spheres, in this study we conduct a multilingual computational analysis of non-verbal features and investigate their role in engagement and engagement prediction. To achieve this goal, we first expanded the NoXi dataset, which contains interaction data from participants living in France, Germany, and the United Kingdom, by collecting session data of dyadic conversations in Japanese and Chinese, resulting in the enhanced dataset NoXi+J. Next, we extracted multimodal non-verbal features, including speech acoustics, facial expressions, backchanneling and gestures, via various pattern recognition techniques and algorithms. Then, we conducted a statistical analysis of listening behaviors and backchannel patterns to identify culturally dependent and independent features in each language and common features among multiple languages. These features were also correlated with the engagement shown by the interlocutors. Finally, we analyzed the influence of cultural differences in the input features of LSTM models trained to predict engagement for five language datasets. A SHAP analysis combined with transfer learning confirmed a considerable correlation between the importance of input features for a language set and the significant cultural characteristics analyzed.
Related papers
- Toward Cultural Interpretability: A Linguistic Anthropological Framework for Describing and Evaluating Large Language Models (LLMs) [13.71024600466761]
This article proposes a new integration of linguistic anthropology and machine learning (ML)
We show the theoretical feasibility of a new, conjoint field of inquiry, cultural interpretability (CI)
CI emphasizes how the dynamic relationship between language and culture makes contextually sensitive, open-ended conversation possible.
arXiv Detail & Related papers (2024-11-07T22:01:50Z) - Investigating Cultural Alignment of Large Language Models [10.738300803676655]
We show that Large Language Models (LLMs) genuinely encapsulate the diverse knowledge adopted by different cultures.
We quantify cultural alignment by simulating sociological surveys, comparing model responses to those of actual survey participants as references.
We introduce Anthropological Prompting, a novel method leveraging anthropological reasoning to enhance cultural alignment.
arXiv Detail & Related papers (2024-02-20T18:47:28Z) - Language-based Valence and Arousal Expressions between the United States and China: a Cross-Cultural Examination [6.122854363918857]
This paper explores cultural differences in affective expressions by comparing Twitter/X (geolocated to the US) and Sina Weibo (in Mainland China)
Using the NRC-VAD lexicon to measure valence and arousal, we identify distinct patterns of emotional expression across both platforms.
We uncover significant cross-cultural differences in arousal, with US users displaying higher emotional intensity than Chinese users.
arXiv Detail & Related papers (2024-01-10T16:32:25Z) - Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z) - Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP.
We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba.
Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region.
All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z) - Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z) - Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A
Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are: the most often used nonverbal cue, computational method, interaction environment, and sensing approach are speaking activity, support vector machines, and meetings composed of 3-4 persons equipped with microphones and cameras, respectively.
arXiv Detail & Related papers (2022-07-20T13:37:57Z) - Text as Causal Mediators: Research Design for Causal Estimates of
Differential Treatment of Social Groups via Language Aspects [7.175621752912443]
We propose a causal research design for observational (non-experimental) data to estimate the natural direct and indirect effects of social group signals on speakers' responses.
We illustrate the promises and challenges of this framework via a theoretical case study of the effect of an advocate's gender on interruptions from justices during U.S. Supreme Court oral arguments.
arXiv Detail & Related papers (2021-09-15T19:15:35Z) - A Massively Multilingual Analysis of Cross-linguality in Shared
Embedding Space [61.18554842370824]
In cross-lingual language models, representations for many different languages live in the same space.
We compute a task-based measure of cross-lingual alignment in the form of bitext retrieval performance.
We examine a range of linguistic, quasi-linguistic, and training-related features as potential predictors of these alignment metrics.
arXiv Detail & Related papers (2021-09-13T21:05:37Z) - Deception detection in text and its relation to the cultural dimension
of individualism/collectivism [6.17866386107486]
We investigate if differences in the usage of specific linguistic features of deception across cultures can be confirmed and attributed to norms in respect to the individualism/collectivism divide.
We create culture/language-aware classifiers by experimenting with a wide range of n-gram features based on phonology, morphology and syntax.
We conducted our experiments over 11 datasets from 5 languages i.e., English, Dutch, Russian, Spanish and Romanian, from six countries (US, Belgium, India, Russia, Mexico and Romania)
arXiv Detail & Related papers (2021-05-26T13:09:47Z) - Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset
for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset- Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, like income, cultural orientation, amongst several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.