Dependency distance minimization predicts compression
- URL: http://arxiv.org/abs/2109.08900v1
- Date: Sat, 18 Sep 2021 10:53:39 GMT
- Title: Dependency distance minimization predicts compression
- Authors: Ramon Ferrer-i-Cancho and Carlos Gómez-Rodríguez
- Abstract summary: Dependency distance minimization (DDm) is a well-established principle of word order.
This is a second order prediction because it links a principle with another principle, rather than a principle and a manifestation as in a first order prediction.
We use a recently introduced score that has many mathematical and statistical advantages with respect to the widely used sum of dependency distances.
- Score: 1.2944868613449219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dependency distance minimization (DDm) is a well-established principle of
word order. It has been predicted theoretically that DDm implies compression,
namely the minimization of word lengths. This is a second order prediction
because it links a principle with another principle, rather than a principle
and a manifestation as in a first order prediction. Here we test that second
order prediction with a parallel collection of treebanks controlling for
annotation style with Universal Dependencies and Surface-Syntactic Universal
Dependencies. To test it, we use a recently introduced score that has many
mathematical and statistical advantages with respect to the widely used sum of
dependency distances. We find that the prediction is confirmed by the new score
when word lengths are measured in phonemes, independently of the annotation
style, but not when word lengths are measured in syllables. In contrast, one of
the most widely used scores, i.e. the sum of dependency distances, fails to
confirm that prediction, showing the weakness of raw dependency distances for
research on word order. Finally, our findings expand the theory of natural
communication by linking two distinct levels of organization, namely syntax
(word order) and word internal structure.
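The raw sum of dependency distances contrasted with the new score above is simply the sum, over all syntactic dependencies in a sentence, of the absolute difference between the linear positions of head and dependent. The sketch below is a minimal illustration of that baseline measure; the example sentence, its parse, and the function name are illustrative and not taken from the paper's treebanks.

```python
# Minimal sketch: the raw sum of dependency distances for one sentence.
# heads[i] is the 1-based position of the head of word i+1; 0 marks the root,
# which contributes no dependency edge.

def sum_dependency_distances(heads):
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)

# "She read the short book" with an illustrative parse:
# read is the root; She and book depend on read; the and short depend on book.
heads = [2, 0, 5, 5, 2]
print(sum_dependency_distances(heads))  # 1 + 2 + 1 + 3 = 7
```

The score the paper advocates instead builds on this quantity but corrects for its known statistical shortcomings; its exact definition is given in the paper and is not reproduced here.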
Related papers
- Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse [54.08750245737734]
We propose that speakers modulate information rate based on location within a hierarchically-structured model of discourse.
We find that hierarchical predictors are significant predictors of a discourse's information contour and that deeply nested hierarchical predictors are more predictive than shallow ones.
arXiv Detail & Related papers (2024-10-21T14:42:37Z) - The optimal placement of the head in the noun phrase. The case of demonstrative, numeral, adjective and noun [0.16317061277456998]
We show that, across preferred orders in languages, the noun tends to be placed at one of the ends.
We also show evidence of anti-locality effects: syntactic dependencies in preferred orders are longer than expected by chance.
arXiv Detail & Related papers (2024-02-15T20:24:39Z) - Revisiting the Optimality of Word Lengths [92.70590105707639]
Communicative cost can be operationalized in different ways.
Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative costs.
arXiv Detail & Related papers (2023-12-06T20:41:47Z) - Testing the Predictions of Surprisal Theory in 11 Languages [77.45204595614]
We investigate the relationship between surprisal and reading times in eleven different languages.
By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
arXiv Detail & Related papers (2023-07-07T15:37:50Z) - A bounded rationality account of dependency length minimization in Hindi [0.0]
The principle of dependency length minimization is thought to shape the structure of human languages for effective communication.
Preverbally, long-before-short constituent ordering and, postverbally, short-before-long ordering are known to minimize the overall dependency length of a sentence.
In this study, we test the hypothesis that placing only the shortest preverbal constituent next to the main verb explains word order preferences in Hindi.
arXiv Detail & Related papers (2023-04-22T13:53:50Z) - Direct and indirect evidence of compression of word lengths. Zipf's law of abbreviation revisited [0.4893345190925177]
Zipf's law of abbreviation, the tendency of more frequent words to be shorter, is one of the most solid candidates for a linguistic universal.
We provide evidence that the law also holds in speech (when word length is measured in time), in particular in 46 languages from 14 linguistic families.
Motivated by the need for direct evidence of compression, we derive a simple formula for a random baseline and find that word lengths are systematically below chance.
arXiv Detail & Related papers (2023-03-17T17:12:18Z) - The distribution of syntactic dependency distances [0.7614628596146599]
We contribute to the characterization of the actual distribution of syntactic dependency distances.
We propose a new double-exponential model in which decay in probability is allowed to change after a break-point.
We find that a two-regime model is the most likely one in all 20 languages we considered.
arXiv Detail & Related papers (2022-11-26T17:31:25Z) - The expected sum of edge lengths in planar linearizations of trees. Theory and applications [0.16317061277456998]
We show the relationship between the expected sum in planar arrangements and the expected sum in projective arrangements.
We derive an $O(n)$-time algorithm to calculate the expected value of the sum of edge lengths.
We apply this research to a parallel corpus and find that the gap between actual dependency distances and the random baseline narrows as the strength of the formal constraint on dependency structures increases (a sketch of the simpler unrestricted baseline appears after this list).
arXiv Detail & Related papers (2022-07-12T14:35:07Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions (see the mask-and-predict sketch after this list).
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - Linguistic dependencies and statistical dependence [76.89273585568084]
We use pretrained language models to estimate probabilities of words in context.
We find that maximum-CPMI trees correspond to linguistic dependencies more often than trees extracted from non-contextual PMI estimates.
arXiv Detail & Related papers (2021-04-18T02:43:37Z) - NLP-CIC @ DIACR-Ita: POS and Neighbor Based Distributional Models for Lexical Semantic Change in Diachronic Italian Corpora [62.997667081978825]
We present our systems and findings on unsupervised lexical semantic change for the Italian language.
The task is to determine whether a target word has evolved its meaning with time, relying only on raw text from two time-specific datasets.
We propose two models representing the target words across the periods to predict the changing words using threshold and voting schemes.
arXiv Detail & Related papers (2020-11-07T11:27:18Z)
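As a companion to the item above on the expected sum of edge lengths in planar linearizations, here is a minimal sketch of the simpler, well-known unrestricted baseline (not the paper's $O(n)$ algorithm, and not covering the planar or projective cases): in a uniformly random linear arrangement of $n$ vertices, every edge has expected length $(n+1)/3$, so a tree with $n-1$ edges has expected sum of edge lengths $(n-1)(n+1)/3$. The brute-force check and the example tree are illustrative only.

```python
# Unrestricted random baseline for the expected sum of edge lengths.
# In a uniformly random linear arrangement of n vertices, each edge has
# expected length (n + 1) / 3, so a tree with n - 1 edges has expected
# sum (n - 1) * (n + 1) / 3.  Verified by brute force on a small tree.
from itertools import permutations

def expected_sum_unrestricted(n):
    return (n - 1) * (n + 1) / 3

def brute_force_expected_sum(edges, n):
    """Average sum of edge lengths over all n! linear arrangements."""
    total = 0
    arrangements = list(permutations(range(n)))
    for order in arrangements:
        pos = {v: i for i, v in enumerate(order)}
        total += sum(abs(pos[u] - pos[v]) for u, v in edges)
    return total / len(arrangements)

# A star tree on 5 vertices (illustrative only).
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(expected_sum_unrestricted(5))         # 8.0
print(brute_force_expected_sum(edges, 5))   # 8.0
```

By linearity of expectation this unrestricted baseline does not depend on the shape of the tree.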
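The mask-and-predict step described in the item on contextualized semantic distance can also be illustrated with a short sketch. The model name (bert-base-uncased), the example sentence, and the choice of masked position are assumptions for illustration, not the paper's setup; the sketch stops at obtaining the distribution at a masked position, while the divergence measure itself is defined in the paper.

```python
# Illustrative mask-and-predict step: mask one position in a sentence and
# read off the masked language model's probability distribution there.
# Assumes the Hugging Face transformers library and an arbitrary BERT model.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-uncased"  # illustrative choice, not the paper's
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

text = "the new score has many statistical advantages"
inputs = tokenizer(text, return_tensors="pt")
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Mask the position of "score" and predict the vocabulary distribution there.
pos = tokens.index("score")
input_ids = inputs["input_ids"].clone()
input_ids[0, pos] = tokenizer.mask_token_id

with torch.no_grad():
    logits = model(input_ids=input_ids).logits
dist = torch.softmax(logits[0, pos], dim=-1)  # distribution at the masked position

top = torch.topk(dist, 5)
print([tokenizer.decode([i]) for i in top.indices])
```

Comparing such distributions at shared (neighboring) positions across two overlapping texts is the basis of the divergence score summarized above.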
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.