Bracketing Encodings for 2-Planar Dependency Parsing
- URL: http://arxiv.org/abs/2011.00596v2
- Date: Mon, 22 Mar 2021 20:53:30 GMT
- Title: Bracketing Encodings for 2-Planar Dependency Parsing
- Authors: Michalina Strzyz, David Vilares and Carlos Gómez-Rodríguez
- Abstract summary: We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels.
We take into account the well-known property of 2-planarity, which is present in the vast majority of dependency syntactic structures in treebanks.
- Score: 14.653008985229617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a bracketing-based encoding that can be used to represent any
2-planar dependency tree over a sentence of length n as a sequence of n labels,
hence providing almost total coverage of crossing arcs in sequence labeling
parsing. First, we show that existing bracketing encodings for parsing as
labeling can only handle a very mild extension of projective trees. Second, we
overcome this limitation by taking into account the well-known property of
2-planarity, which is present in the vast majority of dependency syntactic
structures in treebanks, i.e., the arcs of a dependency tree can be split into
two planes such that arcs in a given plane do not cross. We take advantage of
this property to design a method that balances the brackets and that encodes
the arcs belonging to each of those planes, allowing for almost unrestricted
non-projectivity (around 99.9% coverage) in sequence labeling parsing. The
experiments show that our linearizations improve over the accuracy of the
original bracketing encoding in highly non-projective treebanks (on average by
0.4 LAS), while achieving a similar speed. Also, they are especially suitable
when PoS tags are not used as input parameters to the models.
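To make the core idea concrete, the sketch below splits the arcs of a dependency tree into two planes so that arcs within a plane do not cross, and then emits one bracket-based label per word. The greedy plane assignment, the helper names, and the simplified bracket inventory ('<'/'>' for the first plane, '{'/'}' for the second) are illustrative assumptions; they are not the paper's exact label scheme, which additionally balances the brackets.

```python
# Toy sketch of a 2-planar bracketing-style encoding (illustrative only;
# not the paper's exact label inventory).

def crosses(a, b):
    """Two arcs cross iff their endpoints strictly interleave."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j

def split_into_planes(arcs):
    """Greedily assign each arc to one of two planes so that arcs within
    a plane are pairwise non-crossing; an arc that fits neither plane is
    marked None (not 2-planar under this greedy strategy)."""
    planes = ([], [])
    assignment = []
    for arc in sorted(arcs, key=lambda a: (min(a), -max(a))):
        placed = None
        for p in (0, 1):
            if all(not crosses(arc, other) for other in planes[p]):
                planes[p].append(arc)
                placed = p
                break
        assignment.append((arc, placed))
    return assignment

def encode(heads):
    """heads[i] is the head of word i+1 (words are 1-based, 0 = root).
    Returns one label per word: '<' / '>' mark endpoints of first-plane
    arcs, '{' / '}' endpoints of second-plane arcs."""
    n = len(heads)
    arcs = [(heads[i], i + 1) for i in range(n) if heads[i] != 0]
    labels = [""] * (n + 1)          # index 0 unused (artificial root)
    for (h, d), plane in split_into_planes(arcs):
        if plane is None:            # not representable with two planes
            continue
        left, right = min(h, d), max(h, d)
        opener, closer = ("<", ">") if plane == 0 else ("{", "}")
        labels[left] += opener
        labels[right] += closer
    return labels[1:]

# Example: the arcs linking words 2-4 and 3-5 cross, so they land in
# different planes.
print(encode([2, 0, 5, 2, 2]))
```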
Related papers
- Dependency Graph Parsing as Sequence Labeling [18.079016557290338]
We define a range of unbounded and bounded linearizations that can be used to cast graph parsing as a tagging task.
Experimental results on semantic dependency and enhanced UD parsing show that with a good choice of encoding, sequence-labeling dependency graphs combine high efficiency with accuracies close to the state of the art.
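For a flavor of what casting graph parsing as tagging looks like, the sketch below uses one simple unbounded linearization: each word's tag lists the relative offsets (and relations) of all its heads in the graph, so words may have zero, one, or several heads. This particular encoding and the helper name are assumptions for illustration; the paper studies a range of bounded and unbounded linearizations.

```python
def encode_relative_heads(words, edges):
    """One simple unbounded linearization for dependency graphs:
    the tag of word i concatenates (head_offset, relation) pairs for
    every head of i; words with no head get the tag '_'.
    edges: iterable of (head_index, dependent_index, relation),
    1-based indices with 0 standing for the artificial root."""
    tags = [[] for _ in words]
    for head, dep, rel in edges:
        offset = head - dep          # relative position of the head
        tags[dep - 1].append(f"{offset:+d}:{rel}")
    return ["|".join(sorted(t)) if t else "_" for t in tags]

# Enhanced-UD-style example with a shared subject and a shared object.
words = ["Mary", "likes", "and", "eats", "apples"]
edges = [(0, 2, "root"), (2, 1, "nsubj"), (4, 1, "nsubj"),
         (2, 4, "conj"), (4, 3, "cc"), (2, 5, "obj"), (4, 5, "obj")]
print(encode_relative_heads(words, edges))
```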
arXiv Detail & Related papers (2024-10-23T15:37:02Z) - PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer [51.260384040953326]
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios.
We propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition.
PosFormer consistently outperforms state-of-the-art methods, with gains of 2.03%/1.22%/2, 1.83%, and 4.62% on the evaluated datasets.
arXiv Detail & Related papers (2024-07-10T15:42:58Z) - 4 and 7-bit Labeling for Projective and Non-Projective Dependency Trees [7.466159270333272]
We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word, and extend it to 7-bit labels that also cover non-projective trees.
Results on a set of diverse treebanks show that our 7-bit encoding obtains substantial accuracy gains over the previously best-performing sequence labeling encodings.
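A minimal sketch of what per-word bit labels for a projective tree might look like follows; the specific four bits used here (attachment direction, outermost-dependent flag, and presence of left/right dependents) are an assumption for illustration, not necessarily the paper's exact bit semantics.

```python
def four_bit_labels(heads):
    """heads[i] is the head of word i+1 (1-based, 0 = root).
    For each word, emit four booleans as a bit string:
      b0: the head lies to the right of the word
      b1: the word is the outermost dependent on its side of the head
      b2: the word has at least one left dependent
      b3: the word has at least one right dependent
    One plausible 4-bit scheme; the paper's definition may differ."""
    n = len(heads)
    deps = {h: [] for h in range(n + 1)}
    for i, h in enumerate(heads, start=1):
        deps[h].append(i)
    labels = []
    for i, h in enumerate(heads, start=1):
        b0 = h > i
        same_side = [d for d in deps.get(h, []) if (d < h) == (i < h)]
        b1 = bool(same_side) and i == (min(same_side) if i < h else max(same_side))
        b2 = any(d < i for d in deps.get(i, []))
        b3 = any(d > i for d in deps.get(i, []))
        labels.append("".join("1" if b else "0" for b in (b0, b1, b2, b3)))
    return labels

# Example: "the dog barks" with heads [2, 3, 0]
print(four_bit_labels([2, 3, 0]))
```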
arXiv Detail & Related papers (2023-10-22T14:43:53Z) - Structured Dialogue Discourse Parsing [79.37200787463917]
Discourse parsing aims to uncover the internal structure of a multi-participant conversation.
We propose a principled method that improves upon previous work from two perspectives: encoding and decoding.
Experiments show that our method achieves new state-of-the-art, surpassing the previous model by 2.3 on STAC and 1.5 on Molweni.
arXiv Detail & Related papers (2023-06-26T22:51:01Z) - Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z) - Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective [97.57162770792182]
Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language.
We show that these exhaustive comparisons can be avoided, and, moreover, the complexity can be reduced to linear by casting the relation between tokens as a partial order over the string.
Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string.
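As a small illustration of the order-theoretic view, the sketch below intersects several total orders, each obtained by sorting tokens on real-valued scores predicted in parallel, to obtain a partial order over tokens. The use of K score "views" and the helper name are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def partial_order_from_scores(scores):
    """scores: array of shape (K, n) holding K real-valued views per token,
    e.g. K regression heads run in parallel over the string.
    Token i precedes token j in the induced relation iff it precedes j in
    every one of the K total orders obtained by sorting each view; the
    intersection of strict total orders is a strict partial order."""
    K, n = scores.shape
    precedes = np.ones((n, n), dtype=bool)
    np.fill_diagonal(precedes, False)
    for k in range(K):
        # element [i, j] is True iff token i scores lower than j in view k
        precedes &= scores[k][:, None] < scores[k][None, :]
    return precedes

# Two views over four tokens: only pairs ordered the same way in both
# views remain related in the resulting partial order.
scores = np.array([[0.1, 0.4, 0.2, 0.9],
                   [0.3, 0.2, 0.5, 0.8]])
print(partial_order_from_scores(scores))
```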
arXiv Detail & Related papers (2023-05-24T11:47:35Z) - Integrating Dependency Tree Into Self-attention for Sentence Representation [9.676884071651205]
We propose Dependency-Transformer, which applies a relation-attention mechanism that works in concert with the self-attention mechanism.
By a score-based method, we successfully inject the syntax information without affecting Transformer's parallelizability.
Our model outperforms or is comparable to the state-of-the-art methods on four tasks for sentence representation.
arXiv Detail & Related papers (2022-03-11T13:44:41Z) - Please Mind the Root: Decoding Arborescences for Dependency Parsing [67.71280539312536]
We analyze the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank.
The worst constraint-violation rate we observe is 24%.
arXiv Detail & Related papers (2020-10-06T08:31:14Z) - Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.