Parsing as Pretraining
- URL: http://arxiv.org/abs/2002.01685v1
- Date: Wed, 5 Feb 2020 08:43:02 GMT
- Title: Parsing as Pretraining
- Authors: David Vilares and Michalina Strzyz and Anders Søgaard and Carlos Gómez-Rodríguez
- Abstract summary: We first cast constituent and dependency parsing as sequence tagging.
We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree.
- Score: 13.03764728768944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent analyses suggest that encoders pretrained for language modeling
capture certain morpho-syntactic structure. However, probing frameworks for
word vectors still do not report results on standard setups such as constituent
and dependency parsing. This paper addresses this problem and does full parsing
(on English) relying only on pretraining architectures -- and no decoding. We
first cast constituent and dependency parsing as sequence tagging. We then use
a single feed-forward layer to directly map word vectors to labels that encode
a linearized tree. This is used to: (i) see how far we can reach on syntax
modelling with just pretrained encoders, and (ii) shed some light about the
syntax-sensitivity of different word vectors (by freezing the weights of the
pretraining network during training). For evaluation, we use bracketing
F1-score and LAS, and analyze in-depth differences across representations for
span lengths and dependency displacements. The overall results surpass existing
sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%).
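As a rough illustration of this setup, the sketch below wires a frozen pretrained encoder to a single linear layer that assigns one tree-encoding label per token. The model name (bert-base-cased), the label-inventory size, and the toy sentence are placeholders, not the paper's exact configuration.

```python
# A minimal sketch of parsing as sequence tagging with a frozen encoder.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-cased"   # any pretrained encoder (assumption)
NUM_LABELS = 500                 # size of the linearized-tree label set (placeholder)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

# Freeze the encoder to probe how much syntax the word vectors already carry.
for p in encoder.parameters():
    p.requires_grad = False

# The only trained component: one linear map from word vectors to labels.
classifier = torch.nn.Linear(encoder.config.hidden_size, NUM_LABELS)

sentence = ["The", "cat", "sat"]
enc = tokenizer(sentence, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**enc).last_hidden_state   # (1, seq_len, hidden)
logits = classifier(hidden)                     # (1, seq_len, NUM_LABELS)
predicted = logits.argmax(-1)                   # one label per subword; aligned
                                                # to words and decoded downstream
```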
Related papers
- Integrating Supertag Features into Neural Discontinuous Constituent Parsing [0.0]
Traditional views of constituency demand that constituents consist of adjacent words, but discontinuous constituents are common in languages like German.
Transition-based parsing produces trees given raw text input using supervised learning on large annotated corpora.
arXiv Detail & Related papers (2024-10-11T12:28:26Z)
- MRL Parsing Without Tears: The Case of Hebrew [14.104766026682384]
In morphologically rich languages (MRLs), where parsers need to identify multiple lexical units in each token, existing systems suffer from high latency and setup complexity.
We present a new "flipped pipeline": decisions are made directly on the whole-token units by expert classifiers, each one dedicated to one specific task.
This blazingly fast approach sets a new SOTA in Hebrew POS tagging and dependency parsing, while also reaching near-SOTA performance on other Hebrew tasks.
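A minimal sketch of the flipped-pipeline idea: independent expert heads make per-task decisions directly on whole-token vectors, with no shared incremental pipeline between tasks. The task names, label counts, and random inputs are illustrative assumptions, not the paper's architecture.

```python
# Independent per-task expert classifiers over whole-token representations.
import torch

HIDDEN = 768
heads = torch.nn.ModuleDict({
    "pos": torch.nn.Linear(HIDDEN, 17),   # e.g., a UPOS tagset (assumption)
    "dep": torch.nn.Linear(HIDDEN, 40),   # e.g., dependency relations (assumption)
})

tokens = torch.randn(1, 6, HIDDEN)        # stand-in whole-token vectors
decisions = {task: head(tokens).argmax(-1) for task, head in heads.items()}
```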
arXiv Detail & Related papers (2024-03-11T17:54:33Z)
- Assessment of Pre-Trained Models Across Languages and Grammars [7.466159270333272]
We aim to recover constituent and dependency structures by casting parsing as sequence labeling.
Our results show that pre-trained word vectors do not favor constituency representations of syntax over dependencies.
The occurrence of a language in the pretraining data is more important than the amount of task data when recovering syntax from the word vectors.
arXiv Detail & Related papers (2023-09-20T09:23:36Z)
- Hexatagging: Projective Dependency Parsing as Tagging [63.5392760743851]
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
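A minimal sketch of the tagging view of parsing described here: one tag per word from a small finite inventory, scored for all positions at once, so the training loss has no left-to-right dependency. The inventory size and random tensors are placeholders; the actual hexatag set and the procedure that turns a tag sequence into a tree are defined in the paper.

```python
# Parallel per-token tag prediction for tagging-style dependency parsing.
import torch

NUM_TAGS, HIDDEN, SEQ_LEN = 8, 768, 6    # tag-set size is an assumption here
tagger = torch.nn.Linear(HIDDEN, NUM_TAGS)

word_vectors = torch.randn(2, SEQ_LEN, HIDDEN)      # stand-in encoder output
gold_tags = torch.randint(NUM_TAGS, (2, SEQ_LEN))   # stand-in gold tags

logits = tagger(word_vectors)            # (2, SEQ_LEN, NUM_TAGS): every
                                         # structure-building action scored at once
loss = torch.nn.functional.cross_entropy(
    logits.view(-1, NUM_TAGS), gold_tags.view(-1)
)
```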
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- Backpack Language Models [108.65930795825416]
We present Backpacks, a new neural architecture that marries strong modeling performance with an interface for interpretability and control.
We find that, after training, sense vectors specialize, each encoding a different aspect of a word.
We present simple algorithms that intervene on sense vectors to perform controllable text generation and debiasing.
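A heavily simplified sketch of the sense-vector idea as summarized above: each word owns several sense vectors, a contextual representation is a non-negative weighted sum of them, and editing one sense vector is a targeted intervention. All dimensions are stand-ins, and the weighting (random here) comes from a learned network in the actual architecture.

```python
# Word representations as weighted sums of per-word sense vectors.
import torch

VOCAB, NUM_SENSES, DIM = 1000, 4, 64
sense_vectors = torch.nn.Parameter(torch.randn(VOCAB, NUM_SENSES, DIM))

token_ids = torch.tensor([5, 17, 42])                # a toy sequence
senses = sense_vectors[token_ids]                    # (3, NUM_SENSES, DIM)

# Context-dependent non-negative weights over each word's senses (placeholder).
weights = torch.softmax(torch.randn(3, NUM_SENSES), dim=-1)
reps = (weights.unsqueeze(-1) * senses).sum(dim=1)   # (3, DIM)

# "Intervention": zero out one sense of one word to steer its contribution.
with torch.no_grad():
    sense_vectors[17, 2] = 0.0
```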
arXiv Detail & Related papers (2023-05-26T09:26:23Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
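A minimal sketch of parsing from attention alone: score each candidate split of a span by the attention mass crossing the boundary, and recursively split where cross-boundary association is weakest. The paper's actual Out Association score is more involved; the boundary score below is a simplified stand-in.

```python
# Unsupervised binary bracketing from a self-attention weight matrix.
import numpy as np

def split_score(attn, left, right, boundary):
    # Average attention flowing across the candidate boundary (both directions).
    a = attn[left:boundary, boundary:right]
    b = attn[boundary:right, left:boundary]
    return (a.mean() + b.mean()) / 2

def parse(attn, left, right):
    if right - left <= 1:
        return left                      # a single-word leaf
    boundary = min(range(left + 1, right),
                   key=lambda b: split_score(attn, left, right, b))
    return (parse(attn, left, boundary), parse(attn, boundary, right))

attn = np.random.rand(5, 5)              # stand-in for one head's attention
tree = parse(attn, 0, 5)                 # nested tuples over word indices
```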
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Span Pointer Networks for Non-Autoregressive Task-Oriented Semantic Parsing [55.97957664897004]
An effective recipe for building seq2seq, non-autoregressive, task-oriented parsers to map utterances to semantic frames proceeds in three steps.
These models are typically bottlenecked by length prediction.
In our work, we propose non-autoregressive parsers which shift the decoding task from text generation to span prediction.
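A minimal sketch of span prediction as the decoding task: rather than generating a leaf's text token by token, the model scores every input position as the start or end of a span. The shapes and the dot-product scoring heads are illustrative assumptions, not the paper's exact parser.

```python
# Pointing at an input span instead of generating its text.
import torch

HIDDEN, SEQ_LEN = 256, 12
utterance = torch.randn(1, SEQ_LEN, HIDDEN)   # encoded input tokens
frame_slot = torch.randn(1, HIDDEN)           # decoder state for one frame slot

start_head = torch.nn.Linear(HIDDEN, HIDDEN)
end_head = torch.nn.Linear(HIDDEN, HIDDEN)

# Score every input position as a start/end of the slot's span.
start_logits = (start_head(frame_slot).unsqueeze(1) * utterance).sum(-1)
end_logits = (end_head(frame_slot).unsqueeze(1) * utterance).sum(-1)
span = (start_logits.argmax(-1).item(), end_logits.argmax(-1).item())
```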
arXiv Detail & Related papers (2021-04-15T07:02:35Z)
- Strongly Incremental Constituency Parsing with Graph Neural Networks [70.16880251349093]
Parsing sentences into syntax trees can benefit downstream applications in NLP.
Transition-based parsers build trees by executing actions in a state transition system.
Existing transition-based parsers are predominantly based on the shift-reduce transition system.
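For readers unfamiliar with shift-reduce, here is a minimal sketch of the transition system itself: a buffer of input words, a stack of partial constituents, and two action types. The action sequence is hard-coded for illustration; in a real parser each action is chosen by a learned classifier.

```python
# A toy shift-reduce transition system for constituency parsing.
def shift(stack, buffer):
    stack.append(buffer.pop(0))          # move the next word onto the stack

def reduce_(stack, label):
    right, left = stack.pop(), stack.pop()
    stack.append((label, left, right))   # combine the top two items

stack, buffer = [], ["The", "cat", "sat"]
shift(stack, buffer)
shift(stack, buffer)
reduce_(stack, "NP")
shift(stack, buffer)
reduce_(stack, "S")
print(stack)   # [('S', ('NP', 'The', 'cat'), 'sat')]
```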
arXiv Detail & Related papers (2020-10-27T19:19:38Z)
- Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)