Self-Training for Unsupervised Parsing with PRPN
- URL: http://arxiv.org/abs/2005.13455v1
- Date: Wed, 27 May 2020 16:11:09 GMT
- Title: Self-Training for Unsupervised Parsing with PRPN
- Authors: Anhad Mohananey, Katharina Kann, Samuel R. Bowman
- Abstract summary: We propose self-training for neural UP models.
We leverage aggregated annotations predicted by copies of our model as supervision for future copies.
Our model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1.
- Score: 43.92334181340415
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural unsupervised parsing (UP) models learn to parse without access to
syntactic annotations, while being optimized for another task like language
modeling. In this work, we propose self-training for neural UP models: we
leverage aggregated annotations predicted by copies of our model as supervision
for future copies. To be able to use our model's predictions during training,
we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a) such
that it can be trained in a semi-supervised fashion. We then add examples with
parses predicted by our model to our unlabeled UP training data. Our
self-trained model outperforms the PRPN by 8.1% F1 and the previous state of
the art by 1.6% F1. In addition, we show that our architecture can also be
helpful for semi-supervised parsing in ultra-low-resource settings.
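As a rough sketch of the self-training procedure the abstract describes: train several PRPN copies, aggregate the parses they predict, and retrain on the resulting silver trees. The helper callables, the span-level majority vote, and all parameter values below are illustrative assumptions, not the authors' implementation.
```python
from collections import Counter
from typing import Callable, FrozenSet, List, Tuple

Span = Tuple[int, int]   # (start, end) indices of a constituent
Tree = FrozenSet[Span]   # a parse represented as its set of spans

def aggregate_parses(predictions: List[List[Tree]], min_votes: int) -> List[Tree]:
    """For each sentence, keep the spans that at least `min_votes` model copies agree on."""
    aggregated = []
    for per_copy in zip(*predictions):            # per_copy: one tree per model copy
        votes = Counter(span for tree in per_copy for span in tree)
        aggregated.append(frozenset(s for s, c in votes.items() if c >= min_votes))
    return aggregated

def self_train(sentences: List[List[str]],
               train_model: Callable,             # hypothetical: trains one PRPN copy
               predict: Callable,                 # hypothetical: model -> one Tree per sentence
               num_copies: int = 4,
               num_rounds: int = 2):
    # Round 0: purely unsupervised copies (language-modeling objective only).
    copies = [train_model(sentences, silver=None, seed=s) for s in range(num_copies)]
    for _ in range(num_rounds):
        # Aggregate the copies' predicted parses into silver supervision ...
        silver = aggregate_parses([predict(m, sentences) for m in copies],
                                  min_votes=num_copies // 2 + 1)
        # ... and retrain fresh copies semi-supervised on text plus silver trees.
        copies = [train_model(sentences, silver=silver, seed=s)
                  for s in range(num_copies)]
    return copies

# aggregate_parses alone can be exercised directly:
t1, t2 = frozenset({(0, 3), (0, 2)}), frozenset({(0, 3), (1, 3)})
print(aggregate_parses([[t1], [t2]], min_votes=2))   # -> [frozenset({(0, 3)})]
```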
Related papers
- Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models [29.367678364485794]
We show how to design efficacious data distributions and learning rate schedules for continued pretraining of language models.
We show an improvement of 9% in average model accuracy compared to the baseline of continued training on the pretraining set.
arXiv Detail & Related papers (2024-07-09T22:37:59Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
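As a rough illustration of attention-based unsupervised parsing, the sketch below greedily splits a sentence where self-attention mass across a candidate boundary is weakest. This is a generic cross-attention heuristic, not the paper's Out Association score or its learned projection matrices.
```python
import numpy as np

def split_score(attn: np.ndarray, left: int, right: int, k: int) -> float:
    """Attention mass crossing a candidate split point k inside span [left, right).
    Lower cross-attention suggests a better constituent boundary (illustrative only)."""
    a, b = attn[left:k, k:right], attn[k:right, left:k]
    return float(a.mean() + b.mean())

def parse_span(attn: np.ndarray, left: int, right: int):
    """Greedy top-down parsing: recursively split where cross-attention is weakest."""
    if right - left <= 1:
        return left                                   # leaf: a single token index
    k = min(range(left + 1, right), key=lambda k: split_score(attn, left, right, k))
    return (parse_span(attn, left, k), parse_span(attn, k, right))

# `attn` would normally be a (T x T) self-attention matrix taken from a pretrained
# language model; a random matrix stands in here just to show the interface.
attn = np.random.rand(6, 6)
print(parse_span(attn, 0, 6))
```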
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both the final performance and the sample efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations [0.0]
This work describes a self-supervised data augmentation approach used to improve learning models' performance when only a moderate amount of labeled data is available.
Neural language models are fine-tuned using this procedure in the context of the AcCompl-it shared task at EVALITA 2020.
arXiv Detail & Related papers (2020-11-10T15:50:37Z)
- Is Transfer Learning Necessary for Protein Landscape Prediction? [14.098875826640883]
We show that CNN models trained solely using supervised learning both compete with and sometimes outperform the best models from TAPE.
The benchmarking tasks proposed by TAPE are excellent measures of a model's ability to predict protein function and should be used going forward.
arXiv Detail & Related papers (2020-10-31T20:41:36Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
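The decoding constraint named above can be illustrated with a simplified sketch: whenever the decoder has just emitted a token that also occurs in the source, the source token that immediately follows it is (probabilistically) blocked at the next step, nudging the output away from copying the source verbatim. This is only an approximation of the idea; the paper's actual Dynamic Blocking algorithm differs in its details.
```python
import random
from typing import List, Set

def dynamic_block(source_ids: List[int], prev_generated_id: int,
                  block_prob: float = 0.5) -> Set[int]:
    """Return token ids to suppress at the next decoding step (simplified sketch)."""
    blocked = set()
    for i, tok in enumerate(source_ids[:-1]):
        # If the last generated token matches a source token, consider blocking
        # the token that follows it in the source.
        if tok == prev_generated_id and random.random() < block_prob:
            blocked.add(source_ids[i + 1])
    return blocked

# During beam search or sampling, logits for the blocked ids would be set to -inf
# before choosing the next token.
print(dynamic_block(source_ids=[5, 8, 13, 8, 2], prev_generated_id=8))
```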
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- On the Role of Supervision in Unsupervised Constituency Parsing [59.55128879760495]
A few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin.
This suggests that, in order to arrive at fair conclusions, we should carefully consider the amount of labeled data used for model development.
arXiv Detail & Related papers (2020-10-06T01:34:58Z)