Predicting gender and age categories in English conversations using
lexical, non-lexical, and turn-taking features
- URL: http://arxiv.org/abs/2102.13355v1
- Date: Fri, 26 Feb 2021 08:23:08 GMT
- Title: Predicting gender and age categories in English conversations using
lexical, non-lexical, and turn-taking features
- Authors: Andreas Liesenfeld, Gábor Parti, Yu-Yin Hsu, Chu-Ren Huang
- Abstract summary: We examine behavioural differences between speakers labelled for gender and age categories in the SpokenBNC.
We find that female speakers tend to produce more and slightly longer turns, while turns by male speakers feature a higher type-token ratio.
Across age groups, we observe, for instance, that swear words and laughter characterize young speakers' talk, while old speakers tend to produce more truncated words.
- Score: 3.2766169283137385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper examines gender and age salience and (stereo)typicality in British
English talk with the aim to predict gender and age categories based on
lexical, phrasal and turn-taking features. We examine the SpokenBNC, a corpus
of around 11.4 million words of British English conversations and identify
behavioural differences between speakers that are labelled for gender and age
categories. We explore differences in language use and turn-taking dynamics and
identify a range of characteristics that set the categories apart. We find that
female speakers tend to produce more and slightly longer turns, while turns by
male speakers feature a higher type-token ratio and a distinct range of minimal
particles such as "eh", "uh" and "em". Across age groups, we observe, for
instance, that swear words and laughter characterize young speakers' talk,
while old speakers tend to produce more truncated words. We then use the
observed characteristics to predict gender and age labels of speakers per
conversation and per turn as a classification task, showing that non-lexical
utterances such as minimal particles that are usually left out of dialog data
can contribute to setting the categories apart.
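The turn-level measures named in the abstract (number of turns, turn length, type-token ratio) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `turn_features`, the whitespace tokenization, the lowercasing, and the toy `(label, utterance)` input format are all assumptions made for illustration, and the SpokenBNC itself is not loaded here.

```python
from collections import defaultdict


def turn_features(turns):
    """Compute per-label turn statistics.

    `turns` is a list of (speaker_label, utterance) pairs; this input
    format is a hypothetical simplification, not the SpokenBNC schema.
    """
    tokens_by_label = defaultdict(list)
    turn_counts = defaultdict(int)
    for label, utterance in turns:
        # Naive whitespace tokenization (an assumption; real corpora
        # usually ship their own tokenization).
        tokens_by_label[label].extend(utterance.lower().split())
        turn_counts[label] += 1

    features = {}
    for label, tokens in tokens_by_label.items():
        n_turns = turn_counts[label]
        features[label] = {
            "turns": n_turns,
            # Mean number of tokens per turn.
            "mean_turn_length": len(tokens) / n_turns,
            # Type-token ratio: distinct tokens divided by all tokens.
            "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        }
    return features


# Toy example with made-up utterances.
example = [
    ("F", "well I think so"),
    ("M", "uh yeah"),
    ("F", "I think I do"),
]
for label, stats in sorted(turn_features(example).items()):
    print(label, stats)
```

Type-token ratio is sensitive to sample size, so a comparison between groups would normally be made on samples of comparable length. The resulting feature dictionaries could then be passed to any standard classifier for the per-conversation or per-turn prediction task described above.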
Related papers
- What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages [51.0349882045866]
This paper investigates biases of Large Language Models (LLMs) through the lens of grammatical gender.
We prompt a model to describe nouns with adjectives in various languages, focusing specifically on languages with grammatical gender.
We find that a simple classifier can not only predict noun gender above chance but also exhibit cross-language transferability.
arXiv Detail & Related papers (2024-07-12T22:10:16Z)
- Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps [25.95711246919163]
Current automatic speech recognition (ASR) models are designed to be used across many languages and tasks without substantial changes.
Our study systematically evaluates the performance of two widely used multilingual ASR models on three datasets.
Our findings reveal clear gender disparities, with the advantaged group varying across languages and models.
arXiv Detail & Related papers (2024-02-28T00:24:29Z)
- The Causal Influence of Grammatical Gender on Distributional Semantics [87.8027818528463]
How much meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science.
We offer a novel, causal graphical model that jointly represents the interactions between a noun's grammatical gender, its meaning, and adjective choice.
When we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
arXiv Detail & Related papers (2023-11-30T13:58:13Z)
- How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation [21.125217707038356]
When translating from notional gender languages into grammatical gender languages, the generated translation requires explicit gender assignments for various words, including those referring to the speaker.
To avoid such biased and not inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally-provided metadata about the speaker's gender.
This paper aims to achieve the same results by integrating the speaker's gender metadata into a single "multi-gender" neural ST model, easier to maintain.
arXiv Detail & Related papers (2023-10-23T17:21:32Z)
- Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition [48.29355616574199]
We analyze the transferability of emotion recognition across three different languages--English, Mandarin Chinese, and Cantonese.
This study concludes that different language and age groups require specific speech features, thus making cross-lingual inference an unsuitable method.
arXiv Detail & Related papers (2023-06-26T08:48:08Z)
- Analysis of Male and Female Speakers' Word Choices in Public Speeches [0.0]
We compared the word choices of male and female presenters in public addresses such as TED lectures.
Based on our data, we determined that male speakers use specific types of linguistic, psychological, cognitive, and social words in considerably greater frequency than female speakers.
arXiv Detail & Related papers (2022-11-11T17:30:28Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation [20.39599469927542]
Gender bias is largely recognized as a problematic phenomenon affecting language technologies.
Most of current evaluation practices adopt a word-level focus on a narrow set of occupational nouns under synthetic conditions.
Such protocols overlook key features of grammatical gender languages, which are characterized by morphosyntactic chains of gender agreement.
arXiv Detail & Related papers (2022-03-18T11:14:16Z)
- Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks [58.24134321728942]
We compare and identify cognitive aspects on deep neural-based visual lip-reading models.
We observe a strong correlation between these theories in cognitive psychology and our unique modeling.
arXiv Detail & Related papers (2021-10-13T05:30:50Z)
- Pick a Fight or Bite your Tongue: Investigation of Gender Differences in Idiomatic Language Usage [9.892162266128306]
We compile a novel, large and diverse corpus of spontaneous linguistic productions annotated with speakers' gender.
We perform a first large-scale empirical study of distinctions in the usage of figurative language between male and female authors.
arXiv Detail & Related papers (2020-10-31T18:44:07Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)