Linear-time calculation of the expected sum of edge lengths in random
projective linearizations of trees
- URL: http://arxiv.org/abs/2107.03277v1
- Date: Wed, 7 Jul 2021 15:11:53 GMT
- Title: Linear-time calculation of the expected sum of edge lengths in random
projective linearizations of trees
- Authors: Lluís Alemany-Puig and Ramon Ferrer-i-Cancho
- Abstract summary: The sum of distances between syntactically related words has been in the limelight for the past decades.
Various random baselines have been defined to carry out related quantitative studies on languages.
Here we focus on a popular baseline: random projective permutations of the words of the sentence.
- Score: 1.2944868613449219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The syntactic structure of a sentence is often represented using syntactic
dependency trees. The sum of the distances between syntactically related words
has been in the limelight for the past decades. Research on dependency
distances led to the formulation of the principle of dependency distance
minimization whereby words in sentences are ordered so as to minimize that sum.
Numerous random baselines have been defined to carry out related quantitative
studies on languages. The simplest random baseline is the expected value of the
sum in unconstrained random permutations of the words in the sentence, namely
when all the shufflings of the words of a sentence are allowed and equally
likely. Here we focus on a popular baseline: random projective permutations of
the words of the sentence, that is, permutations where the syntactic dependency
structure is projective, a formal constraint that sentences satisfy often in
languages. Thus far, the expectation of the sum of dependency distances in
random projective shufflings of a sentence has been estimated approximately
with a Monte Carlo procedure whose cost is of the order of $Zn$, where $n$ is
the number of words of the sentence and $Z$ is the number of samples; the
larger $Z$, the lower the error of the estimation but the larger the time cost.
Here we present formulae to compute that expectation without error in time of
the order of $n$. Furthermore, we show that star trees maximize it, and devise
a dynamic programming algorithm to retrieve the trees that minimize it.
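As a reading aid, here is a minimal Python sketch of the Monte Carlo baseline the abstract describes (cost of the order of $Zn$), which the paper's exact $O(n)$ formulae replace. It assumes, as in prior work on this baseline, that a uniformly random projective linearization is obtained by shuffling each head together with the contiguous blocks of its children's subtrees, independently at every vertex; the function names and the example tree are illustrative, and the paper's formulae are not reproduced here.

```python
import random
from collections import defaultdict

def random_projective_linearization(n, root, children):
    """Sample one projective linearization of a rooted tree on vertices 0..n-1.

    Assumption: each vertex is shuffled uniformly at random together with the
    contiguous blocks occupied by its children's subtrees, independently at
    every vertex; the resulting word order is projective by construction.
    """
    position = [0] * n   # position[v] = index of vertex v in the linear order
    next_free = 0

    def place(v):
        nonlocal next_free
        blocks = [("head", v)] + [("subtree", c) for c in children[v]]
        random.shuffle(blocks)
        for kind, u in blocks:
            if kind == "head":
                position[u] = next_free
                next_free += 1
            else:
                place(u)   # a child's subtree occupies a contiguous block

    place(root)
    return position

def monte_carlo_expected_D(n, root, children, edges, samples=10_000):
    """Estimate E[D], the expected sum of edge lengths over random projective
    linearizations, at cost O(Z n) for Z = samples -- the procedure the paper
    replaces with an exact O(n) computation."""
    total = 0
    for _ in range(samples):
        pos = random_projective_linearization(n, root, children)
        total += sum(abs(pos[u] - pos[v]) for u, v in edges)
    return total / samples

if __name__ == "__main__":
    # Illustrative star tree with hub 0 and four leaves (n = 5).
    edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
    children = defaultdict(list)
    for head, dep in edges:
        children[head].append(dep)
    print(monte_carlo_expected_D(5, 0, children, edges))
    # For reference, the unconstrained (non-projective) baseline for any tree
    # with n vertices is (n**2 - 1) / 3, i.e. 8.0 for n = 5.
```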
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Revisiting the Optimality of Word Lengths [92.70590105707639]
Communicative cost can be operationalized in different ways.
Zipf (1935) posited that wordforms are optimized to minimize utterances' communicative costs.
arXiv Detail & Related papers (2023-12-06T20:41:47Z) - Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits [53.281230333364505]
This paper studies the problem of contextual dueling bandits, where the binary comparison of dueling arms is generated from a generalized linear model (GLM).
We propose a new SupLinUCB-type algorithm that enjoys computational efficiency and a variance-aware regret bound $\tilde{O}\big(d\sqrt{\sum_{t=1}^{T}\sigma_t^2} + d\big)$.
Our regret bound naturally aligns with the intuitive expectation: in scenarios where the comparison is deterministic, the algorithm only suffers an $\tilde{O}(d)$ regret.
arXiv Detail & Related papers (2023-10-02T08:15:52Z) - The distribution of syntactic dependency distances [0.7614628596146599]
We contribute to the characterization of the actual distribution of syntactic dependency distances.
We propose a new double-exponential model in which decay in probability is allowed to change after a break-point.
We find that a two-regime model is the most likely one in all 20 languages we considered.
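A sketch of the two-regime idea (an assumption about the functional form, not the paper's exact parameterization): the probability of a dependency at distance $d$ decays as $p(d) \propto e^{-q_1 d}$ for $1 \le d \le d^{*}$ and as $p(d) \propto e^{-q_2 d}$ for $d > d^{*}$, with the break-point $d^{*}$ and the two decay rates $q_1, q_2$ estimated per language.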
arXiv Detail & Related papers (2022-11-26T17:31:25Z) - Truncation Sampling as Language Model Desmoothing [115.28983143361681]
Long samples of text from neural language models can be of poor quality.
Truncation sampling algorithms set some words' probabilities to zero at each step.
We introduce $\eta$-sampling, which truncates words below an entropy-dependent probability threshold.
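A minimal Python sketch of a generic truncation-sampling step, assuming only what the summary states (probabilities below a threshold are zeroed out and the rest renormalized); the entropy-dependent rule $\eta$-sampling uses to pick the threshold is not reproduced here, and the function name is illustrative.

```python
import numpy as np

def truncation_sampling_step(probs, threshold, rng=np.random.default_rng()):
    """One truncation-sampling step: zero out words whose probability is below
    `threshold`, renormalize, and sample an index. In eta-sampling the
    threshold depends on the distribution's entropy (not shown here)."""
    probs = np.asarray(probs, dtype=float)
    keep = probs >= threshold
    if not keep.any():
        keep[np.argmax(probs)] = True   # always keep the most probable word
    truncated = np.where(keep, probs, 0.0)
    truncated /= truncated.sum()
    return rng.choice(len(probs), p=truncated)

# Example: with threshold 0.1, the 0.05-probability word can never be sampled.
print(truncation_sampling_step([0.5, 0.3, 0.15, 0.05], threshold=0.1))
```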
arXiv Detail & Related papers (2022-10-27T05:52:35Z) - The expected sum of edge lengths in planar linearizations of trees.
Theory and applications [0.16317061277456998]
We show the relationship between the expected sum in planar arrangements and the expected sum in projective arrangements.
We derive an $O(n)$-time algorithm to calculate the expected value of the sum of edge lengths.
We apply this research to a parallel corpus and find that the gap between actual dependency distance and the random baseline reduces as the strength of the formal constraint on dependency structures increases.
arXiv Detail & Related papers (2022-07-12T14:35:07Z) - Shallow decision trees for explainable $k$-means clustering [1.2891210250935146]
We propose an efficient algorithm that takes into account metrics related to the depths of the leaves in a decision tree.
In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms.
arXiv Detail & Related papers (2021-12-29T18:22:28Z) - Dependency distance minimization predicts compression [1.2944868613449219]
Dependency distance minimization (DDm) is a well-established principle of word order.
This is a second order prediction because it links a principle with another principle, rather than a principle and a manifestation as in a first order prediction.
We use a recently introduced score that has many mathematical and statistical advantages with respect to the widely used sum of dependency distances.
arXiv Detail & Related papers (2021-09-18T10:53:39Z) - List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z) - Estimating decision tree learnability with polylogarithmic sample
complexity [16.832966312395126]
We show that top-down decision tree learning algorithms are amenable to highly efficient learnability estimation.
We show how the label $T(x^\star)$ of the decision tree hypothesis $T$ can be computed with polylogarithmically many labeled examples.
arXiv Detail & Related papers (2020-11-03T09:26:27Z) - Stochastic Linear Bandits Robust to Adversarial Attacks [117.665995707568]
We provide two variants of a Robust Phased Elimination algorithm, one that knows the corruption level $C$ and one that does not.
We show that both variants attain near-optimal regret in the non-corrupted case $C = 0$, while incurring additional additive regret terms, respectively.
In a contextual setting, we show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing $C$.
arXiv Detail & Related papers (2020-07-07T09:00:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.