Second-Order Unsupervised Neural Dependency Parsing
- URL: http://arxiv.org/abs/2010.14720v1
- Date: Wed, 28 Oct 2020 03:01:33 GMT
- Title: Second-Order Unsupervised Neural Dependency Parsing
- Authors: Songlin Yang, Yong Jiang, Wenjuan Han, Kewei Tu
- Abstract summary: Most unsupervised dependency parsers are based on first-order probabilistic generative models that only consider local parent-child information.
Inspired by second-order supervised dependency parsing, we propose a second-order extension of unsupervised neural dependency models that incorporates grandparent-child or sibling information.
Our joint model achieves a 10% improvement over the previous state-of-the-art on the full WSJ test set.
- Score: 52.331561380948564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most unsupervised dependency parsers are based on first-order
probabilistic generative models that only consider local parent-child
information. Inspired by second-order supervised dependency parsing, we
propose a second-order extension of unsupervised neural dependency models that
incorporates grandparent-child or sibling information. We also propose a novel
design of the neural parameterization and optimization methods of the
dependency models. In second-order models, the number of grammar rules grows
cubically with the vocabulary size, making it difficult to train
lexicalized models that may contain thousands of words. To circumvent this
problem while still benefiting from both second-order parsing and
lexicalization, we use the agreement-based learning framework to jointly train
a second-order unlexicalized model and a first-order lexicalized model.
Experiments on multiple datasets show the effectiveness of our second-order
models compared with recent state-of-the-art methods. Our joint model achieves
a 10% improvement over the previous state-of-the-art parser on the full WSJ
test set.
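As a rough illustration (not the paper's exact parameterization), a first-order
model factors a parse into head-child rules, while the second-order variants also
condition each attachment on the grandparent or the nearest sibling; lexicalizing
such a rule means conditioning on and generating word forms, which is why the rule
space grows cubically with the vocabulary. In this sketch, s is a sentence, d a
dependency parse, h a head, c a child, g a grandparent, and c' a sibling:

\[
P_{\mathrm{1st}}(d \mid s) \propto \prod_{(h \to c) \in d} P(c \mid h),
\qquad
P_{\mathrm{2nd}}(d \mid s) \propto \prod_{(g,\, h \to c) \in d} P(c \mid h, g)
\;\text{ or }\;
\prod_{(h \to c,\, c') \in d} P(c \mid h, c').
\]

The joint training can then be read as the usual agreement-based learning
objective, coupling the second-order unlexicalized model \(p_{\theta}\) with the
first-order lexicalized model \(q_{\phi}\) over the shared latent parse:

\[
\max_{\theta,\, \phi} \; \sum_{s} \log \sum_{d \in \mathcal{T}(s)} p_{\theta}(s, d)\, q_{\phi}(s, d),
\]

where \(\mathcal{T}(s)\) is the set of dependency parses of \(s\). Under this
objective the two models are rewarded for agreeing on parses they can jointly
explain, so the lexicalized model never needs second-order, word-triple rules.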
Related papers
- Learning to Diversify Neural Text Generation via Degenerative Model [39.961572541752005]
We propose a new approach to prevent degeneration problems by training two models.
We first train a model that is designed to amplify undesirable patterns.
We then enhance the diversity of the second model by focusing on patterns that the first model fails to learn.
arXiv Detail & Related papers (2023-09-22T04:57:10Z)
- Artificial Interrogation for Attributing Language Models [0.0]
The challenge provides twelve open-sourced base versions of popular language models and twelve fine-tuned language models for text generation.
The goal of the contest is to identify which fine-tuned models originated from which base model.
We have employed four distinct approaches for measuring the resemblance between the responses generated from the models of both sets.
arXiv Detail & Related papers (2022-11-20T05:46:29Z)
- DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Improving Contrastive Learning with Model Augmentation [123.05700988581806]
Sequential recommendation aims to predict the next items in user behavior sequences, which can be addressed by characterizing item relationships within those sequences.
Due to the data sparsity and noise issues in sequences, a new self-supervised learning (SSL) paradigm is proposed to improve the performance.
arXiv Detail & Related papers (2022-03-25T06:12:58Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Neural Unsupervised Semantic Role Labeling [48.69930912510414]
We present the first neural unsupervised model for semantic role labeling.
We decompose the task as two argument related subtasks, identification and clustering.
Experiments on the CoNLL-2009 English dataset demonstrate that our model outperforms the previous state-of-the-art baseline.
arXiv Detail & Related papers (2021-04-19T04:50:16Z)
- StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling [45.96663013609177]
We introduce a novel model, StructFormer, that can induce dependency and constituency structure at the same time.
We integrate the induced dependency relations into the transformer, in a differentiable manner, through a novel dependency-constrained self-attention mechanism.
Experimental results show that our model can achieve strong results on unsupervised constituency parsing, unsupervised dependency parsing, and masked language modeling.
arXiv Detail & Related papers (2020-12-01T21:54:51Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Latent Tree Learning with Ordered Neurons: What Parses Does It Produce? [2.025491206574996]
Latent tree learning models can learn constituency parsing without exposure to human-annotated tree structures.
ON-LSTM is trained on language modelling and has near-state-of-the-art performance on unsupervised parsing.
We replicate the model with different restarts and examine their parses.
arXiv Detail & Related papers (2020-10-10T07:12:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.