Latent Tree Learning with Ordered Neurons: What Parses Does It Produce?
- URL: http://arxiv.org/abs/2010.04926v1
- Date: Sat, 10 Oct 2020 07:12:48 GMT
- Title: Latent Tree Learning with Ordered Neurons: What Parses Does It Produce?
- Authors: Yian Zhang
- Abstract summary: Latent tree learning models can learn constituency parsing without exposure to human-annotated tree structures.
ON-LSTM is trained on language modelling and has near-state-of-the-art performance on unsupervised parsing.
We replicate the model with different restarts and examine their parses.
- Score: 2.025491206574996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent latent tree learning models can learn constituency parsing without any
exposure to human-annotated tree structures. One such model is ON-LSTM (Shen et
al., 2019), which is trained on language modelling and has
near-state-of-the-art performance on unsupervised parsing. In order to better
understand the performance and consistency of the model as well as how the
parses it generates are different from gold-standard PTB parses, we replicate
the model with different restarts and examine their parses. We find that (1)
the model has reasonably consistent parsing behaviors across different
restarts, (2) the model struggles with the internal structures of complex noun
phrases, (3) the model has a tendency to overestimate the height of the split
points right before verbs. We speculate that both problems could potentially be
solved by adopting a different training task other than unidirectional language
modelling.
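Since the first finding concerns consistency across restarts, a natural check is to compare the trees that two restarts induce for the same sentences. Below is a minimal sketch of unlabeled bracket F1 between two parses, which is the standard way such comparisons are scored; it is not the authors' evaluation code, the function names are illustrative, and real evaluations typically exclude trivial single-word and whole-sentence spans.

```python
# Sketch: unlabeled bracket F1 between two binary parses of the same sentence.
# Trees are nested Python lists; leaves are token strings. Illustrative only.

def spans(tree, start=0):
    """Collect (start, end) spans of all constituents in a nested-list tree."""
    if not isinstance(tree, list):              # leaf token
        return set(), start + 1
    collected, pos = set(), start
    for child in tree:
        child_spans, pos = spans(child, pos)
        collected |= child_spans
    collected.add((start, pos))                 # span covered by this node
    return collected, pos

def bracket_f1(tree_a, tree_b):
    """Unlabeled F1 over constituent spans (trivial spans not excluded here)."""
    a, _ = spans(tree_a)
    b, _ = spans(tree_b)
    overlap = len(a & b)
    p = overlap / len(a) if a else 0.0
    r = overlap / len(b) if b else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Two hypothetical restarts parsing the same sentence differently.
run_1 = [[["the", "cat"], "sat"], ["on", ["the", "mat"]]]
run_2 = [["the", "cat"], ["sat", ["on", ["the", "mat"]]]]
print(bracket_f1(run_1, run_2))                 # 0.8
```

Averaging this score over a corpus for pairs of restarts gives a self-F1-style consistency number, while scoring each restart against PTB gold trees gives the usual unsupervised parsing F1.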
Related papers
- Split and Rephrase with Large Language Models [2.499907423888049]
The Split and Rephrase (SPRP) task consists of splitting complex sentences into a sequence of shorter grammatical sentences.
We evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics.
arXiv Detail & Related papers (2023-12-18T10:16:37Z)
- Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an unsupervised constituent parsing model that calculates an Out Association score based solely on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Second-Order Unsupervised Neural Dependency Parsing [52.331561380948564]
Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information.
Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information.
Our joint model achieves a 10% improvement over the previous state-of-the-art on the full WSJ test set.
arXiv Detail & Related papers (2020-10-28T03:01:33Z)
- Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle [88.65264818967489]
We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
arXiv Detail & Related papers (2020-10-21T17:39:15Z)
- Recursive Top-Down Production for Sentence Generation with Latent Trees [77.56794870399288]
We model the production property of context-free grammars for natural and synthetic languages.
We present a dynamic programming algorithm that marginalises over latent binary tree structures with $N$ leaves.
We also present experimental results on German-English translation on the Multi30k dataset.
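The dynamic program mentioned above is, in essence, an inside-style recursion over spans: the total score of a span is a sum over all split points of the product of the totals of its two halves. Below is a generic sketch of that idea, not the paper's exact algorithm; `span_score` is a hypothetical stand-in for whatever score a model assigns to a constituent.

```python
from functools import lru_cache

# Generic inside-style DP: sum, over every binary bracketing of n leaves,
# the product of per-span scores. Illustrative sketch only.

def marginalize(n, span_score):
    @lru_cache(maxsize=None)
    def inside(i, j):
        """Total score of all binary trees over leaves i..j-1."""
        if j - i == 1:                          # single leaf
            return span_score(i, j)
        total = 0.0
        for k in range(i + 1, j):               # every possible split point
            total += inside(i, k) * inside(k, j)
        return span_score(i, j) * total
    return inside(0, n)

# With uniform span scores of 1.0 this just counts binary trees: Catalan(n-1).
print(marginalize(4, lambda i, j: 1.0))         # 5
```

In practice such recursions are usually run in log space for numerical stability, with the split loop vectorised.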
arXiv Detail & Related papers (2020-10-09T17:47:16Z)
- Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach [78.77265671634454]
We make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called "syntactic distances".
Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.
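Here a "syntactic distance" is a scalar attached to each gap between adjacent words, and a binary parse is recovered by recursively splitting the sentence at the largest remaining distance; this greedy top-down conversion is also how ON-LSTM-style models turn their internal signals into trees, which is where the "height of the split points" in the main abstract comes from. A minimal sketch of the conversion, with made-up distance values (not the paper's implementation):

```python
# Sketch: greedy top-down conversion from syntactic distances to a binary tree.
# words: n tokens; dists: n-1 scores, one per gap between adjacent words.

def distances_to_tree(words, dists):
    if len(words) == 1:
        return words[0]
    split = max(range(len(dists)), key=lambda i: dists[i])   # largest gap first
    left = distances_to_tree(words[:split + 1], dists[:split])
    right = distances_to_tree(words[split + 1:], dists[split + 1:])
    return [left, right]

# A large distance right before the verb puts the top split there.
print(distances_to_tree(
    ["the", "cat", "sat", "on", "the", "mat"],
    [0.2, 0.9, 0.5, 0.3, 0.1],
))   # [['the', 'cat'], ['sat', ['on', ['the', 'mat']]]]
```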
arXiv Detail & Related papers (2020-05-12T15:35:00Z)
- An enhanced Tree-LSTM architecture for sentence semantic modeling using typed dependencies [0.0]
Tree-based Long Short-Term Memory (LSTM) networks have become state-of-the-art for modeling the meaning of language texts.
This paper proposes an enhanced LSTM architecture, called relation gated LSTM, which can model the relationship between two inputs of a sequence.
We also introduce a Tree-LSTM model called Typed Dependency Tree-LSTM that uses the sentence dependency parse structure and the dependency type to embed sentence meaning into a dense vector.
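For context, the standard child-sum Tree-LSTM cell (Tai et al., 2015) composes a node from its children's hidden and memory states with a separate forget gate per child; the relation-gated and typed-dependency variants described above extend this kind of cell. A minimal numpy sketch of the standard update, with random placeholder weights (not the paper's architecture):

```python
import numpy as np

# Sketch of a child-sum Tree-LSTM node update (Tai et al., 2015).
# Weights are random placeholders; D is an arbitrary hidden size; biases omitted.

rng = np.random.default_rng(0)
D = 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One matrix per gate, applied to [node input ; summed child hidden states].
W = {g: rng.normal(scale=0.1, size=(D, 2 * D)) for g in "iou"}
W_f = rng.normal(scale=0.1, size=(D, 2 * D))     # forget gate: per child

def tree_lstm_node(x, children):
    """x: input vector for this node; children: list of (h, c) pairs."""
    h_sum = sum((h for h, _ in children), np.zeros(D))
    z = np.concatenate([x, h_sum])
    i = sigmoid(W["i"] @ z)                      # input gate
    o = sigmoid(W["o"] @ z)                      # output gate
    u = np.tanh(W["u"] @ z)                      # candidate update
    c = i * u
    for h_k, c_k in children:                    # one forget gate per child
        f_k = sigmoid(W_f @ np.concatenate([x, h_k]))
        c = c + f_k * c_k
    h = o * np.tanh(c)
    return h, c

# Compose a parent node from two leaves.
leaf_a = tree_lstm_node(rng.normal(size=D), [])
leaf_b = tree_lstm_node(rng.normal(size=D), [])
parent_h, parent_c = tree_lstm_node(rng.normal(size=D), [leaf_a, leaf_b])
print(parent_h.shape)                            # (8,)
```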
arXiv Detail & Related papers (2020-02-18T18:10:03Z)