A bounded rationality account of dependency length minimization in Hindi
- URL: http://arxiv.org/abs/2304.11410v1
- Date: Sat, 22 Apr 2023 13:53:50 GMT
- Title: A bounded rationality account of dependency length minimization in Hindi
- Authors: Sidharth Ranjan and Titus von der Malsburg
- Abstract summary: The principle of DEPENDENCY LENGTH MINIMIZATION is thought to shape the structure of human languages for effective communication.
Placing long constituents before short ones preverbally, and short constituents before long ones postverbally, is known to minimize the overall dependency length of a sentence.
In this study, we test the hypothesis that placing only the shortest preverbal constituent next to the main verb explains word order preferences in Hindi.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The principle of DEPENDENCY LENGTH MINIMIZATION, which seeks to keep
syntactically related words close in a sentence, is thought to universally
shape the structure of human languages for effective communication. However,
the extent to which dependency length minimization is applied in human language
systems is not yet fully understood. Placing long constituents before short
ones preverbally, and short constituents before long ones postverbally, is
known to minimize the overall dependency length of a sentence. In this study,
we test the hypothesis that placing only the shortest preverbal constituent
next to the main verb explains word order preferences in Hindi (an SOV language)
as opposed to the global minimization of dependency length. We characterize
this approach as a least-effort strategy because it is a cost-effective way to
shorten all dependencies between the verb and its preverbal dependents. As
such, this approach is consistent with the bounded-rationality perspective
according to which decision making is governed by "fast but frugal" heuristics
rather than by a search for optimal solutions. Consistent with this idea, our
results indicate that actual corpus sentences in the Hindi-Urdu Treebank corpus
are better explained by the least-effort strategy than by global minimization
of dependency lengths. Additionally, for the task of distinguishing corpus
sentences from counterfactual variants, we find that the dependency length and
constituent length of the constituent closest to the main verb are much better
predictors of whether a sentence appeared in the corpus than total dependency
length. Overall, our findings suggest that cognitive resource constraints play
a crucial role in shaping natural languages.
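To make the contrast concrete, below is a minimal sketch under strong simplifying assumptions (a clause-final verb, constituent-final heads, and every preverbal constituent depending directly on the verb); the function names and example lengths are hypothetical and not taken from the paper.

```python
# Toy model: a clause is a sequence of preverbal constituents followed by the
# verb. Each constituent's head is its last word and attaches to the verb, so
# its dependency length is the number of words separating head and verb, plus
# one for the verb itself.

def total_dependency_length(lengths):
    """Sum of head-to-verb dependency lengths for constituents of the
    given word lengths, placed before the verb in the given order."""
    return sum(sum(lengths[i + 1:]) + 1 for i in range(len(lengths)))

def global_minimization(lengths):
    # Long-before-short preverbal order globally minimizes total length.
    return sorted(lengths, reverse=True)

def least_effort(lengths):
    # Move only the shortest constituent next to the verb; keep the rest
    # in their original (e.g., canonical) order.
    rest = list(lengths)
    rest.remove(min(lengths))
    return rest + [min(lengths)]

canonical = [3, 5, 2, 4]  # hypothetical constituent lengths, in words
for order in (canonical, least_effort(canonical), global_minimization(canonical)):
    print(order, total_dependency_length(order))
# [3, 5, 2, 4] 25   canonical order
# [3, 5, 4, 2] 23   least effort: only one constituent displaced
# [5, 4, 3, 2] 20   global minimum: full long-before-short sort
```

In this toy setting, the least-effort reordering recovers much of the reduction achieved by a full long-before-short sort while displacing only one constituent, which is the intuition behind treating it as a "fast but frugal" heuristic.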
Related papers
- Does Dependency Locality Predict Non-canonical Word Order in Hindi? [5.540151072128081]
While dependency length minimization is a significant predictor of non-canonical (OSV) syntactic choices,
discourse predictability emerges as the primary determinant of constituent-order preferences.
This work sheds light on the role of expectation adaptation in word-ordering decisions.
arXiv Detail & Related papers (2024-05-13T13:24:17Z) - Work Smarter...Not Harder: Efficient Minimization of Dependency Length in SOV Languages [0.34530027457862006]
Moving a short preverbal constituent next to the main verb explains preverbal constituent ordering decisions better than global minimization of dependency length in SOV languages.
This research sheds light on the role of bounded rationality in linguistic decision-making and language evolution.
arXiv Detail & Related papers (2024-04-29T13:30:27Z) - Syntactic Language Change in English and German: Metrics, Parsers, and Convergences [56.47832275431858]
The current paper looks at diachronic trends in syntactic language change in both English and German, using corpora of parliamentary debates from the last c. 160 years.
We base our observations on five dependency parsers, including the widely used Stanford CoreNLP as well as four newer alternatives.
We show that changes in syntactic measures seem to be more frequent at the tails of sentence length distributions.
arXiv Detail & Related papers (2024-02-18T11:46:16Z) - Quantifying the redundancy between prosody and text [67.07817268372743]
We use large language models to estimate how much information is redundant between prosody and the words themselves.
We find a high degree of redundancy between the information carried by the words and prosodic information across several prosodic features.
Still, we observe that prosodic features cannot be fully predicted from text, suggesting that prosody carries information above and beyond the words.
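As a rough illustration of the quantity at stake (not the paper's LLM-based method): redundancy can be framed as the mutual information I(P; W) = H(P) + H(W) - H(P, W) between a discretised prosodic feature P and the words W, computed here from hypothetical joint counts.

```python
from collections import Counter
from math import log2

# Hypothetical (prosodic feature, word) observations.
pairs = [("high", "yes"), ("high", "yes"), ("low", "no"),
         ("low", "yes"), ("high", "no"), ("low", "no")]

def entropy(counts):
    """Shannon entropy in bits of an empirical distribution."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

h_p = entropy(Counter(p for p, _ in pairs))   # H(P)
h_w = entropy(Counter(w for _, w in pairs))   # H(W)
h_pw = entropy(Counter(pairs))                # H(P, W)
print(h_p + h_w - h_pw)  # I(P; W), about 0.08 bits here
```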
arXiv Detail & Related papers (2023-11-28T21:15:24Z) - The distribution of syntactic dependency distances [0.7614628596146599]
We contribute to the characterization of the actual distribution of syntactic dependency distances.
We propose a new double-exponential model in which decay in probability is allowed to change after a break-point.
We find that a two-regime model is the most likely one in all 20 languages we considered.
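One way to write such a two-regime model (the symbols are our notation, not necessarily the paper's): the probability of a dependency distance d decays exponentially at one rate up to a break-point d*, and at a different rate beyond it, with the two branches meeting at d*.

```latex
p(d) \propto
\begin{cases}
  e^{-\alpha d}, & 1 \le d \le d^{*},\\
  e^{-\alpha d^{*}}\, e^{-\beta (d - d^{*})}, & d > d^{*},
\end{cases}
\qquad \alpha, \beta > 0.
```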
arXiv Detail & Related papers (2022-11-26T17:31:25Z) - Dependency distance minimization predicts compression [1.2944868613449219]
Dependency distance minimization (DDm) is a well-established principle of word order.
This is a second-order prediction because it links one principle with another principle, rather than a principle with a manifestation, as in a first-order prediction.
We use a recently introduced score that has many mathematical and statistical advantages over the widely used sum of dependency distances.
arXiv Detail & Related papers (2021-09-18T10:53:39Z) - Linguistic dependencies and statistical dependence [76.89273585568084]
We use pretrained language models to estimate probabilities of words in context.
We find that maximum-CPMI trees correspond to linguistic dependencies more often than trees extracted from non-contextual PMI estimates.
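As a rough sketch of the extraction step: given pairwise scores for all word pairs of a sentence, a maximum-scoring tree can be read off as the maximum spanning tree. The score matrix below is hypothetical, and the paper's own procedure may differ in detail.

```python
import numpy as np

def max_spanning_tree(scores):
    """Prim's algorithm over a symmetric score matrix; returns the
    (undirected) edges of the maximum spanning tree."""
    n = scores.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        i, j = max(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: scores[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

cpmi = np.array([[0.0, 2.0, 1.0],   # hypothetical CPMI-style scores
                 [2.0, 0.0, 3.0],
                 [1.0, 3.0, 0.0]])
print(max_spanning_tree(cpmi))  # [(0, 1), (1, 2)]
```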
arXiv Detail & Related papers (2021-04-18T02:43:37Z) - Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
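The operationalisation in the first line is Shannon entropy over a word's sense distribution; a minimal sketch with hypothetical sense probabilities:

```python
from math import log2

def meaning_entropy(probs):
    """Lexical ambiguity as entropy: H = -sum(p * log2(p)) over senses."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A word with three hypothetical senses; more uniform = more ambiguous.
print(meaning_entropy([0.7, 0.2, 0.1]))  # about 1.16 bits
```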
arXiv Detail & Related papers (2020-10-05T17:19:10Z) - The optimality of syntactic dependency distances [0.802904964931021]
We recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network.
We introduce a new score to quantify the cognitive pressure to reduce the distance between linked words in a sentence.
The analysis of sentences from 93 languages reveals that half of the languages are optimized to 70% or more.
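As we understand it, the score takes the following form, where D is the observed sum of dependency distances of a sentence's tree, D_min is its minimum over all orderings of the words, and E[D_rla] is its expectation under a random linear arrangement; Omega = 1 for an optimal ordering, Omega = 0 at the random baseline, and "optimized to 70% or more" corresponds to Omega >= 0.7.

```latex
\Omega = \frac{\mathbb{E}[D_{\mathrm{rla}}] - D}{\mathbb{E}[D_{\mathrm{rla}}] - D_{\min}}
```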
arXiv Detail & Related papers (2020-07-30T09:40:41Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z) - Multiplex Word Embeddings for Selectional Preference Acquisition [70.33531759861111]
We propose a multiplex word embedding model, which can be easily extended according to various relations among words.
Our model can effectively distinguish words with respect to different relations without introducing unnecessary sparseness.
arXiv Detail & Related papers (2020-01-09T04:47:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.