On Finding the $K$-best Non-projective Dependency Trees
- URL: http://arxiv.org/abs/2106.00780v1
- Date: Tue, 1 Jun 2021 20:23:41 GMT
- Title: On Finding the $K$-best Non-projective Dependency Trees
- Authors: Ran Zmigrod, Tim Vieira, Ryan Cotterell
- Abstract summary: We provide a simplification of the $K$-best spanning tree algorithm of Camerini et al. (1980).
We present a novel extension of the algorithm for decoding the $K$-best dependency trees of a graph which are subject to a root constraint.
- Score: 39.5549669100436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The connection between the maximum spanning tree in a directed graph and the
best dependency tree of a sentence has been exploited by the NLP community.
However, for many dependency parsing schemes, an important detail of this
approach is that the spanning tree must have exactly one edge emanating from
the root. While work has been done to efficiently solve this problem for
finding the one-best dependency tree, no research has attempted to extend this
solution to finding the $K$-best dependency trees. This is arguably a more
important extension as a larger proportion of decoded trees will not be subject
to the root constraint of dependency trees. Indeed, we show that the rate of
root constraint violations increases by an average of $13$ times when decoding
with $K\!=\!50$ as opposed to $K\!=\!1$. In this paper, we provide a
simplification of the $K$-best spanning tree algorithm of Camerini et al.
(1980). Our simplification allows us to obtain a constant time speed-up over
the original algorithm. Furthermore, we present a novel extension of the
algorithm for decoding the $K$-best dependency trees of a graph which are
subject to a root constraint.
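The root constraint described in the abstract is easy to see on a toy example. Below is a hedged, brute-force sketch (not the paper's algorithm, which works in polynomial time): it enumerates all spanning arborescences of a small weighted digraph, optionally filters to those with exactly one edge emanating from ROOT, and returns the K highest-weight trees. The graph and weights are invented for illustration.

```python
# Toy illustration (NOT the paper's algorithm): brute-force K-best
# dependency trees, with and without the single-root-edge constraint.
from itertools import product

ROOT = 0
# Edge weights w[(head, dependent)] for a 4-node graph; node 0 is ROOT.
w = {
    (0, 1): 5, (0, 2): 4, (0, 3): 4,
    (1, 2): 3, (2, 1): 2,
    (1, 3): 1, (3, 2): 2, (2, 3): 3, (3, 1): 1,
}
nodes = [1, 2, 3]

def arborescences():
    """Enumerate all spanning arborescences rooted at ROOT."""
    head_choices = [[h for h in [ROOT] + nodes if h != d and (h, d) in w]
                    for d in nodes]
    for heads in product(*head_choices):
        parent = dict(zip(nodes, heads))
        # Valid iff every node's head chain reaches ROOT (no cycles).
        ok = True
        for d in nodes:
            seen, cur = set(), d
            while cur != ROOT:
                if cur in seen:
                    ok = False
                    break
                seen.add(cur)
                cur = parent[cur]
            if not ok:
                break
        if ok:
            yield parent

def k_best(k, root_constrained):
    trees = []
    for parent in arborescences():
        if root_constrained and sum(h == ROOT for h in parent.values()) != 1:
            continue  # violates the single-root-edge constraint
        weight = sum(w[(h, d)] for d, h in parent.items())
        trees.append((weight, sorted(parent.items())))
    trees.sort(reverse=True)
    return trees[:k]

unconstrained = k_best(3, root_constrained=False)
constrained = k_best(3, root_constrained=True)
```

On this toy graph the best unconstrained arborescence attaches all three words to ROOT, so naive K-best decoding would never produce a valid single-root dependency tree; constrained decoding does.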
Related papers
- Unbiased and Efficient Sampling of Dependency Trees [0.0]
Most treebanks require that every valid dependency tree has a single edge coming out of the ROOT node.
Zmigrod et al. have recently proposed algorithms for sampling with and without replacement from the single-root dependency tree distribution.
We show that their fastest algorithm for sampling with replacement, Wilson-RC, is in fact biased.
arXiv Detail & Related papers (2022-05-25T09:57:28Z) - Shallow decision trees for explainable $k$-means clustering [1.2891210250935146]
We propose an efficient algorithm that takes into account metrics related to the depths of the leaves in a decision tree.
In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms.
arXiv Detail & Related papers (2021-12-29T18:22:28Z) - Efficient Sampling of Dependency Structures [39.5549669100436]
We adapt two spanning tree sampling algorithms to faithfully sample dependency trees from a graph subject to a root constraint.
We build upon Colbourn's algorithm and present a novel extension that can sample $K$ trees without replacement in $\mathcal{O}(KN^3 + K^2N)$ time.
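Colbourn-style sampling rests on the directed Matrix-Tree (Tutte) theorem: the determinant of a Laplacian minor gives the total weight of spanning arborescences rooted at the deleted vertex. A minimal sketch of that identity, on an invented toy graph and not taken from the paper:

```python
# Hedged sketch of the directed Matrix-Tree (Tutte) theorem, the identity
# underlying Colbourn-style sampling. Toy example, not the paper's code.

def det(m):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def arborescence_weight_sum(w, n, root):
    """Total weight of spanning arborescences rooted at `root`,
    where w[(i, j)] is the weight of edge i -> j."""
    L = [[0] * n for _ in range(n)]
    for (i, j), wt in w.items():
        L[j][j] += wt   # weighted in-degree on the diagonal
        L[i][j] -= wt   # minus the adjacency entry
    # Delete the root's row and column, take the determinant.
    minor = [[L[i][j] for j in range(n) if j != root]
             for i in range(n) if i != root]
    return det(minor)

# Complete unweighted digraph on 3 nodes: each fixed root admits
# 3^{3-2} = 3 spanning arborescences.
w = {(i, j): 1 for i in range(3) for j in range(3) if i != j}
count = arborescence_weight_sum(w, 3, root=0)
```

With unit weights the determinant counts arborescences; with edge scores it computes the partition function that sampling algorithms draw from.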
arXiv Detail & Related papers (2021-09-14T08:33:12Z) - Properly learning decision trees in almost polynomial time [25.763690981846125]
We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees.
Our algorithm shares similarities with practical heuristics for learning decision trees.
We show how every decision tree can be "pruned" so that every variable in the resulting tree is influential.
arXiv Detail & Related papers (2021-09-01T22:12:47Z) - SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z) - An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
We study efficient adversarial attacks on tree-based ensembles such as gradient boosted decision trees (GBDTs) and random forests (RFs).
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://github.com/chong-z/tree-ensemble-attack.
arXiv Detail & Related papers (2020-10-22T10:59:49Z) - Please Mind the Root: Decoding Arborescences for Dependency Parsing [67.71280539312536]
We analyze the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank.
The worst constraint-violation rate we observe is 24%.
arXiv Detail & Related papers (2020-10-06T08:31:14Z) - On $\ell_p$-norm Robustness of Ensemble Stumps and Trees [83.81523991945018]
We develop an efficient dynamic-programming-based algorithm for sound verification of ensemble stumps.
We demonstrate the first certified defense method for training ensemble stumps and trees with respect to $\ell_p$ norm perturbations.
arXiv Detail & Related papers (2020-08-20T03:42:40Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.