Self-Training for Unsupervised Parsing with PRPN
- URL: http://arxiv.org/abs/2005.13455v1
- Date: Wed, 27 May 2020 16:11:09 GMT
- Title: Self-Training for Unsupervised Parsing with PRPN
- Authors: Anhad Mohananey, Katharina Kann, Samuel R. Bowman
- Abstract summary: We propose self-training for neural UP models.
We leverage aggregated annotations predicted by copies of our model as supervision for future copies.
Our model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1.
- Score: 43.92334181340415
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural unsupervised parsing (UP) models learn to parse without access to
syntactic annotations, while being optimized for another task like language
modeling. In this work, we propose self-training for neural UP models: we
leverage aggregated annotations predicted by copies of our model as supervision
for future copies. To be able to use our model's predictions during training,
we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a) such
that it can be trained in a semi-supervised fashion. We then add examples with
parses predicted by our model to our unlabeled UP training data. Our
self-trained model outperforms the PRPN by 8.1% F1 and the previous state of
the art by 1.6% F1. In addition, we show that our architecture can also be
helpful for semi-supervised parsing in ultra-low-resource settings.
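As a rough sketch of the self-training procedure the abstract describes: train several PRPN copies, aggregate the parses they predict, and retrain on the resulting silver trees. The helper callables, the span-level majority vote, and all parameter values below are illustrative assumptions, not the authors' implementation.
```python
from collections import Counter
from typing import Callable, FrozenSet, List, Tuple

Span = Tuple[int, int]   # (start, end) indices of a constituent
Tree = FrozenSet[Span]   # a parse represented as its set of spans

def aggregate_parses(predictions: List[List[Tree]], min_votes: int) -> List[Tree]:
    """For each sentence, keep the spans that at least `min_votes` model copies agree on."""
    aggregated = []
    for per_copy in zip(*predictions):            # per_copy: one tree per model copy
        votes = Counter(span for tree in per_copy for span in tree)
        aggregated.append(frozenset(s for s, c in votes.items() if c >= min_votes))
    return aggregated

def self_train(sentences: List[List[str]],
               train_model: Callable,             # hypothetical: trains one PRPN copy
               predict: Callable,                 # hypothetical: model -> one Tree per sentence
               num_copies: int = 4,
               num_rounds: int = 2):
    # Round 0: purely unsupervised copies (language-modeling objective only).
    copies = [train_model(sentences, silver=None, seed=s) for s in range(num_copies)]
    for _ in range(num_rounds):
        # Aggregate the copies' predicted parses into silver supervision ...
        silver = aggregate_parses([predict(m, sentences) for m in copies],
                                  min_votes=num_copies // 2 + 1)
        # ... and retrain fresh copies semi-supervised on text plus silver trees.
        copies = [train_model(sentences, silver=silver, seed=s)
                  for s in range(num_copies)]
    return copies

# aggregate_parses alone can be exercised directly:
t1, t2 = frozenset({(0, 3), (0, 2)}), frozenset({(0, 3), (1, 3)})
print(aggregate_parses([[t1], [t2]], min_votes=2))   # -> [frozenset({(0, 3)})]
```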
Related papers
- Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models [29.367678364485794]
We show how to design efficacious data distributions and learning rate schedules for continued pretraining of language models.
We show an improvement of 9% in average model accuracy compared to the baseline of continued training on the pretraining set.
arXiv Detail & Related papers (2024-07-09T22:37:59Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
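As a rough illustration of attention-based unsupervised parsing, the sketch below greedily splits a sentence where self-attention mass across a candidate boundary is weakest. This is a generic cross-attention heuristic, not the paper's Out Association score or its learned projection matrices.
```python
import numpy as np

def split_score(attn: np.ndarray, left: int, right: int, k: int) -> float:
    """Attention mass crossing a candidate split point k inside span [left, right).
    Lower cross-attention suggests a better constituent boundary (illustrative only)."""
    a, b = attn[left:k, k:right], attn[k:right, left:k]
    return float(a.mean() + b.mean())

def parse_span(attn: np.ndarray, left: int, right: int):
    """Greedy top-down parsing: recursively split where cross-attention is weakest."""
    if right - left <= 1:
        return left                                   # leaf: a single token index
    k = min(range(left + 1, right), key=lambda k: split_score(attn, left, right, k))
    return (parse_span(attn, left, k), parse_span(attn, k, right))

# `attn` would normally be a (T x T) self-attention matrix taken from a pretrained
# language model; a random matrix stands in here just to show the interface.
attn = np.random.rand(6, 6)
print(parse_span(attn, 0, 6))
```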
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both the final performance and the sample efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations [0.0]
This work describes a self-supervised data augmentation approach used to improve learning models' performance when only a moderate amount of labeled data is available.
Neural language models are fine-tuned using this procedure in the context of the AcCompl-it shared task at EVALITA 2020.
arXiv Detail & Related papers (2020-11-10T15:50:37Z)
- Is Transfer Learning Necessary for Protein Landscape Prediction? [14.098875826640883]
We show that CNN models trained solely using supervised learning both compete with and sometimes outperform the best models from TAPE.
The benchmarking tasks proposed by TAPE are excellent measures of a model's ability to predict protein function and should be used going forward.
arXiv Detail & Related papers (2020-10-31T20:41:36Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
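The decoding constraint named above can be illustrated with a simplified sketch: whenever the decoder has just emitted a token that also occurs in the source, the source token that immediately follows it is (probabilistically) blocked at the next step, nudging the output away from copying the source verbatim. This is only an approximation of the idea; the paper's actual Dynamic Blocking algorithm differs in its details.
```python
import random
from typing import List, Set

def dynamic_block(source_ids: List[int], prev_generated_id: int,
                  block_prob: float = 0.5) -> Set[int]:
    """Return token ids to suppress at the next decoding step (simplified sketch)."""
    blocked = set()
    for i, tok in enumerate(source_ids[:-1]):
        # If the last generated token matches a source token, consider blocking
        # the token that follows it in the source.
        if tok == prev_generated_id and random.random() < block_prob:
            blocked.add(source_ids[i + 1])
    return blocked

# During beam search or sampling, logits for the blocked ids would be set to -inf
# before choosing the next token.
print(dynamic_block(source_ids=[5, 8, 13, 8, 2], prev_generated_id=8))
```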
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- On the Role of Supervision in Unsupervised Constituency Parsing [59.55128879760495]
A few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin.
This suggests that, in order to arrive at fair conclusions, we should carefully consider the amount of labeled data used for model development.
arXiv Detail & Related papers (2020-10-06T01:34:58Z)