Predicting Declension Class from Form and Meaning
- URL: http://arxiv.org/abs/2005.00626v2
- Date: Thu, 28 May 2020 21:15:29 GMT
- Title: Predicting Declension Class from Form and Meaning
- Authors: Adina Williams, Tiago Pimentel, Arya D. McCarthy, Hagen Blix, Eleanor Chodroff, Ryan Cotterell
- Abstract summary: Class membership is far from deterministic, but the phonological form of a noun and/or its meaning can often provide imperfect clues.
We operationalize this by measuring how much information, in bits, we can glean about declension class from knowing the form and/or meaning of nouns.
We find for two Indo-European languages (Czech and German) that form and meaning respectively share significant amounts of information with class.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The noun lexica of many natural languages are divided into several declension
classes with characteristic morphological properties. Class membership is far
from deterministic, but the phonological form of a noun and/or its meaning can
often provide imperfect clues. Here, we investigate the strength of those
clues. More specifically, we operationalize this by measuring how much
information, in bits, we can glean about declension class from knowing the form
and/or meaning of nouns. We know that form and meaning are often also
indicative of grammatical gender (which, as we quantitatively verify, can
itself share information with declension class), so we also control for gender.
We find for two Indo-European languages (Czech and German) that form and
meaning respectively share significant amounts of information with class (and
contribute additional information above and beyond gender). The three-way
interaction between class, form, and meaning (given gender) is also
significant. Our study is important for two reasons: First, we introduce a new
method that provides additional quantitative support for a classic linguistic
finding that form and meaning are relevant for the classification of nouns into
declensions. Second, we show not only that individual declension classes
vary in the strength of their clues within a language, but also that these
variations themselves vary across languages.
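
To make the quantities in the abstract concrete, here is a minimal sketch of how "information, in bits" shared between declension class and another variable can be computed, with and without controlling for gender. It uses simple plug-in (count-based) estimates on a tiny hypothetical lexicon. The toy data, the use of a noun's final segment as a stand-in for "form", and all function names are assumptions made for illustration; this is not the estimation procedure used in the paper.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy, in bits, of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def conditional_mutual_information(xs, ys, zs):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z), in bits."""
    return (
        entropy(list(zip(xs, zs)))
        + entropy(list(zip(ys, zs)))
        - entropy(list(zip(xs, ys, zs)))
        - entropy(zs)
    )


# Hypothetical toy lexicon: (declension class, final segment, gender).
# The final segment is only an illustrative proxy for phonological form.
toy_lexicon = [
    ("I", "a", "f"), ("I", "a", "f"),
    ("II", "e", "f"), ("II", "e", "f"),
    ("III", "o", "n"), ("III", "o", "n"),
    ("IV", "k", "m"), ("IV", "k", "m"),
]
classes, endings, genders = map(list, zip(*toy_lexicon))

# Information that gender alone shares with declension class.
mi_class_gender = mutual_information(classes, genders)

# Information that form shares with class beyond what gender already provides.
cmi_class_form_given_gender = conditional_mutual_information(
    classes, endings, genders
)

print(f"I(class; gender)        = {mi_class_gender:.2f} bits")
print(f"I(class; form | gender) = {cmi_class_form_given_gender:.2f} bits")
```

In this toy lexicon, gender separates two of the four classes but leaves the two feminine classes ambiguous, and the final segment resolves that remaining ambiguity. Plug-in estimates like these are biased on small samples, so real lexica call for more careful estimators.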
Related papers
- What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages [51.0349882045866]
This paper investigates biases of Large Language Models (LLMs) through the lens of grammatical gender.
We prompt a model to describe nouns with adjectives in various languages, focusing specifically on languages with grammatical gender.
We find that a simple classifier can not only predict noun gender above chance but also exhibit cross-language transferability.
arXiv Detail & Related papers (2024-07-12T22:10:16Z)
- The Causal Influence of Grammatical Gender on Distributional Semantics [87.8027818528463]
How much meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science.
We offer a novel, causal graphical model that jointly represents the interactions between a noun's grammatical gender, its meaning, and adjective choice.
When we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
arXiv Detail & Related papers (2023-11-30T13:58:13Z)
- The Impact of Familiarity on Naming Variation: A Study on Object Naming in Mandarin Chinese [4.6112416098164255]
We create a Language and Vision dataset for Mandarin Chinese that provides an average of 20 names for 1319 naturalistic images.
We investigate how familiarity with a given kind of object relates to the degree of naming variation it triggers across subjects.
arXiv Detail & Related papers (2023-11-16T20:13:24Z)
- Quantifying the Roles of Visual, Linguistic, and Visual-Linguistic Complexity in Verb Acquisition [8.183763443800348]
We employ visual and linguistic representations of words sourced from pre-trained artificial neural networks.
We find that the representation of verbs is generally more variable and less discriminable within domain than the representation of nouns.
Visual variability is the strongest factor that internally drives verb learning, followed by visual-linguistic alignment and linguistic variability.
arXiv Detail & Related papers (2023-04-05T15:08:21Z)
- Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study.
We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
- Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model [66.84264870118723]
We present the first purely corpus-driven model of multi-lingual adjective ordering in the form of a latent-variable model.
We provide strong converging evidence for the existence of universal, cross-linguistic, hierarchical adjective ordering tendencies.
arXiv Detail & Related papers (2020-10-09T18:27:55Z)
- An exploration of the encoding of grammatical gender in word embeddings [0.6461556265872973]
The study of grammatical gender based on word embeddings can give insight into discussions on how grammatical genders are determined.
It is found that there is an overlap in how grammatical gender is encoded in Swedish, Danish, and Dutch embeddings.
arXiv Detail & Related papers (2020-08-05T06:01:46Z)
- On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs [57.015586483981885]
We use large-scale corpora in six different gendered languages.
We find statistically significant relationships between the grammatical genders of inanimate nouns and the verbs that take those nouns as direct objects, indirect objects, and as subjects.
arXiv Detail & Related papers (2020-05-03T22:49:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.