Bracketing Encodings for 2-Planar Dependency Parsing
- URL: http://arxiv.org/abs/2011.00596v2
- Date: Mon, 22 Mar 2021 20:53:30 GMT
- Title: Bracketing Encodings for 2-Planar Dependency Parsing
- Authors: Michalina Strzyz, David Vilares and Carlos Gómez-Rodríguez
- Abstract summary: We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels.
We take into account the well-known property of 2-planarity, which is present in the vast majority of dependency syntactic structures in treebanks.
- Score: 14.653008985229617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a bracketing-based encoding that can be used to represent any
2-planar dependency tree over a sentence of length n as a sequence of n labels,
hence providing almost total coverage of crossing arcs in sequence labeling
parsing. First, we show that existing bracketing encodings for parsing as
labeling can only handle a very mild extension of projective trees. Second, we
overcome this limitation by taking into account the well-known property of
2-planarity, which is present in the vast majority of dependency syntactic
structures in treebanks, i.e., the arcs of a dependency tree can be split into
two planes such that arcs in a given plane do not cross. We take advantage of
this property to design a method that balances the brackets and that encodes
the arcs belonging to each of those planes, allowing for almost unrestricted
non-projectivity (around 99.9% coverage) in sequence labeling parsing. The
experiments show that our linearizations improve over the accuracy of the
original bracketing encoding in highly non-projective treebanks (on average by
0.4 LAS), while achieving a similar speed. Also, they are especially suitable
when PoS tags are not used as input parameters to the models.
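To make the core idea concrete, the sketch below splits the arcs of a dependency tree into two planes so that arcs within a plane do not cross, and then emits one bracket-based label per word. The greedy plane assignment, the helper names, and the simplified bracket inventory ('<'/'>' for the first plane, '{'/'}' for the second) are illustrative assumptions; they are not the paper's exact label scheme, which additionally balances the brackets.

```python
# Toy sketch of a 2-planar bracketing-style encoding (illustrative only;
# not the paper's exact label inventory).

def crosses(a, b):
    """Two arcs cross iff their endpoints strictly interleave."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j

def split_into_planes(arcs):
    """Greedily assign each arc to one of two planes so that arcs within
    a plane are pairwise non-crossing; an arc that fits neither plane is
    marked None (not 2-planar under this greedy strategy)."""
    planes = ([], [])
    assignment = []
    for arc in sorted(arcs, key=lambda a: (min(a), -max(a))):
        placed = None
        for p in (0, 1):
            if all(not crosses(arc, other) for other in planes[p]):
                planes[p].append(arc)
                placed = p
                break
        assignment.append((arc, placed))
    return assignment

def encode(heads):
    """heads[i] is the head of word i+1 (words are 1-based, 0 = root).
    Returns one label per word: '<' / '>' mark endpoints of first-plane
    arcs, '{' / '}' endpoints of second-plane arcs."""
    n = len(heads)
    arcs = [(heads[i], i + 1) for i in range(n) if heads[i] != 0]
    labels = [""] * (n + 1)          # index 0 unused (artificial root)
    for (h, d), plane in split_into_planes(arcs):
        if plane is None:            # not representable with two planes
            continue
        left, right = min(h, d), max(h, d)
        opener, closer = ("<", ">") if plane == 0 else ("{", "}")
        labels[left] += opener
        labels[right] += closer
    return labels[1:]

# Example: the arcs linking words 2-4 and 3-5 cross, so they land in
# different planes.
print(encode([2, 0, 5, 2, 2]))
```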
Related papers
- Dependency Graph Parsing as Sequence Labeling [18.079016557290338]
We define a range of unbounded and bounded linearizations that can be used to cast graph parsing as a tagging task.
Experimental results on semantic dependency and enhanced UD parsing show that with a good choice of encoding, sequence-labeling dependency graphs combine high efficiency with accuracies close to the state of the art.
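For a flavor of what casting graph parsing as tagging looks like, the sketch below uses one simple unbounded linearization: each word's tag lists the relative offsets (and relations) of all its heads in the graph, so words may have zero, one, or several heads. This particular encoding and the helper name are assumptions for illustration; the paper studies a range of bounded and unbounded linearizations.

```python
def encode_relative_heads(words, edges):
    """One simple unbounded linearization for dependency graphs:
    the tag of word i concatenates (head_offset, relation) pairs for
    every head of i; words with no head get the tag '_'.
    edges: iterable of (head_index, dependent_index, relation),
    1-based indices with 0 standing for the artificial root."""
    tags = [[] for _ in words]
    for head, dep, rel in edges:
        offset = head - dep          # relative position of the head
        tags[dep - 1].append(f"{offset:+d}:{rel}")
    return ["|".join(sorted(t)) if t else "_" for t in tags]

# Enhanced-UD-style example with a shared subject and a shared object.
words = ["Mary", "likes", "and", "eats", "apples"]
edges = [(0, 2, "root"), (2, 1, "nsubj"), (4, 1, "nsubj"),
         (2, 4, "conj"), (4, 3, "cc"), (2, 5, "obj"), (4, 5, "obj")]
print(encode_relative_heads(words, edges))
```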
arXiv Detail & Related papers (2024-10-23T15:37:02Z) - PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer [51.260384040953326]
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios.
We propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition.
PosFormer consistently outperforms state-of-the-art methods, with gains of 2.03%/1.22%/2, 1.83%, and 4.62% on the evaluated datasets.
arXiv Detail & Related papers (2024-07-10T15:42:58Z) - 4 and 7-bit Labeling for Projective and Non-Projective Dependency Trees [7.466159270333272]
We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word, and extend it to 7-bit labels that also cover non-projective trees.
Results on a set of diverse treebanks show that our 7-bit encoding obtains substantial accuracy gains over the previously best-performing sequence labeling encodings.
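A minimal sketch of what per-word bit labels for a projective tree might look like follows; the specific four bits used here (attachment direction, outermost-dependent flag, and presence of left/right dependents) are an assumption for illustration, not necessarily the paper's exact bit semantics.

```python
def four_bit_labels(heads):
    """heads[i] is the head of word i+1 (1-based, 0 = root).
    For each word, emit four booleans as a bit string:
      b0: the head lies to the right of the word
      b1: the word is the outermost dependent on its side of the head
      b2: the word has at least one left dependent
      b3: the word has at least one right dependent
    One plausible 4-bit scheme; the paper's definition may differ."""
    n = len(heads)
    deps = {h: [] for h in range(n + 1)}
    for i, h in enumerate(heads, start=1):
        deps[h].append(i)
    labels = []
    for i, h in enumerate(heads, start=1):
        b0 = h > i
        same_side = [d for d in deps.get(h, []) if (d < h) == (i < h)]
        b1 = bool(same_side) and i == (min(same_side) if i < h else max(same_side))
        b2 = any(d < i for d in deps.get(i, []))
        b3 = any(d > i for d in deps.get(i, []))
        labels.append("".join("1" if b else "0" for b in (b0, b1, b2, b3)))
    return labels

# Example: "the dog barks" with heads [2, 3, 0]
print(four_bit_labels([2, 3, 0]))
```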
arXiv Detail & Related papers (2023-10-22T14:43:53Z) - Structured Dialogue Discourse Parsing [79.37200787463917]
Discourse parsing aims to uncover the internal structure of a multi-participant conversation.
We propose a principled method that improves upon previous work from two perspectives: encoding and decoding.
Experiments show that our method achieves new state-of-the-art, surpassing the previous model by 2.3 on STAC and 1.5 on Molweni.
arXiv Detail & Related papers (2023-06-26T22:51:01Z) - Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z) - Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective [97.57162770792182]
Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language.
We show that these exhaustive comparisons can be avoided, and, moreover, the complexity can be reduced to linear by casting the relation between tokens as a partial order over the string.
Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string.
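As a small illustration of the order-theoretic view, the sketch below intersects several total orders, each obtained by sorting tokens on real-valued scores predicted in parallel, to obtain a partial order over tokens. The use of K score "views" and the helper name are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def partial_order_from_scores(scores):
    """scores: array of shape (K, n) holding K real-valued views per token,
    e.g. K regression heads run in parallel over the string.
    Token i precedes token j in the induced relation iff it precedes j in
    every one of the K total orders obtained by sorting each view; the
    intersection of strict total orders is a strict partial order."""
    K, n = scores.shape
    precedes = np.ones((n, n), dtype=bool)
    np.fill_diagonal(precedes, False)
    for k in range(K):
        # element [i, j] is True iff token i scores lower than j in view k
        precedes &= scores[k][:, None] < scores[k][None, :]
    return precedes

# Two views over four tokens: only pairs ordered the same way in both
# views remain related in the resulting partial order.
scores = np.array([[0.1, 0.4, 0.2, 0.9],
                   [0.3, 0.2, 0.5, 0.8]])
print(partial_order_from_scores(scores))
```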
arXiv Detail & Related papers (2023-05-24T11:47:35Z) - Integrating Dependency Tree Into Self-attention for Sentence Representation [9.676884071651205]
We propose Dependency-Transformer, which applies a relation-attention mechanism that works in concert with the self-attention mechanism.
By a score-based method, we successfully inject the syntax information without affecting Transformer's parallelizability.
Our model outperforms or is comparable to the state-of-the-art methods on four tasks for sentence representation.
arXiv Detail & Related papers (2022-03-11T13:44:41Z) - Please Mind the Root: Decoding Arborescences for Dependency Parsing [67.71280539312536]
We analyze the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank.
The worst constraint-violation rate we observe is 24%.
arXiv Detail & Related papers (2020-10-06T08:31:14Z) - Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.