On the Role of Supervision in Unsupervised Constituency Parsing
- URL: http://arxiv.org/abs/2010.02423v2
- Date: Wed, 7 Oct 2020 01:38:38 GMT
- Title: On the Role of Supervision in Unsupervised Constituency Parsing
- Authors: Haoyue Shi, Karen Livescu, Kevin Gimpel
- Abstract summary: A few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin.
This suggests that, in order to arrive at fair conclusions, we should carefully consider the amount of labeled data used for model development.
- Score: 59.55128879760495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze several recent unsupervised constituency parsing models, which are
tuned with respect to the parsing $F_1$ score on the Wall Street Journal (WSJ)
development set (1,700 sentences). We introduce strong baselines for them, by
training an existing supervised parsing model (Kitaev and Klein, 2018) on the
same labeled examples they access. When training on the 1,700 examples, or even
when using only 50 examples for training and 5 for development, such a few-shot
parsing approach can outperform all the unsupervised parsing methods by a
significant margin. Few-shot parsing can be further improved by a simple data
augmentation method and self-training. This suggests that, in order to arrive
at fair conclusions, we should carefully consider the amount of labeled data
used for model development. We propose two protocols for future work on
unsupervised parsing: (i) use fully unsupervised criteria for hyperparameter
tuning and model selection; (ii) use as few labeled examples as possible for
model development, and compare to few-shot parsing trained on the same labeled
examples.
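The self-training idea mentioned in the abstract (train on a few labeled examples, pseudo-label unlabeled data, and retrain on the confident pseudo-labels) can be sketched generically. The sketch below is a minimal, hypothetical illustration using a toy one-dimensional nearest-centroid classifier in place of a real parser; the function names and the confidence threshold are illustrative assumptions, not the paper's actual recipe.

```python
# Minimal self-training loop: fit a model on labeled data, pseudo-label
# the unlabeled pool, keep only confident predictions, and retrain.

def train(labeled):
    """Fit one centroid per class from (feature, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return (label, confidence); confidence is negative distance."""
    label = min(model, key=lambda y: abs(x - model[y]))
    return label, -abs(x - model[label])

def self_train(labeled, unlabeled, rounds=3, threshold=-1.0):
    """Repeatedly add confident pseudo-labels to the training set."""
    data = list(labeled)
    for _ in range(rounds):
        model = train(data)
        pseudo, remaining = [], []
        for x in unlabeled:
            y, conf = predict(model, x)
            (pseudo if conf >= threshold else remaining).append((x, y))
        data += pseudo
        unlabeled = [x for x, _ in remaining]
    return train(data)

# Two labeled seeds per "class", four unlabeled points.
model = self_train([(0.0, "a"), (10.0, "b")], [0.5, 9.5, 1.2, 8.8])
```

Points that start outside the confidence threshold (1.2 and 8.8 here) are absorbed in later rounds, once earlier pseudo-labels have moved the centroids toward them.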
Related papers
- Few-shot Prompting for Pairwise Ranking: An Effective Non-Parametric Retrieval Model [18.111868378615206]
We propose a pairwise few-shot ranker that achieves performance close to that of a supervised model without requiring any complex training pipeline.
arXiv Detail & Related papers (2024-09-26T11:19:09Z) - Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, and conduct experiments on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an unsupervised constituency parsing model that computes an outside association score solely based on the self-attention weight matrix learned in a pretrained language model.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
arXiv Detail & Related papers (2022-06-10T10:29:15Z) - Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z) - Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Hierarchical Few-Shot Generative Models [18.216729811514718]
We study a latent variable approach that extends the Neural Statistician to a fully hierarchical model with an attention-based point-to-set-level aggregation.
Our results show that the hierarchical formulation better captures the intrinsic variability within the sets in the small data regime.
arXiv Detail & Related papers (2021-10-23T19:19:39Z) - Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model reports better accuracy scores than a reference system trained on an average of 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z) - Self-Training for Unsupervised Parsing with PRPN [43.92334181340415]
We propose self-training for neural unsupervised parsing (UP) models.
We leverage aggregated annotations predicted by copies of our model as supervision for future copies.
Our model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1.
arXiv Detail & Related papers (2020-05-27T16:11:09Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
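The prototype-update technique described in the last item (refining each class prototype with the mean of the most confident query examples) can be illustrated concretely. The sketch below is a hypothetical, unweighted version using 1-D features for readability; it shows the plain transductive update, not the paper's meta-learned confidence weighting, and all names are illustrative.

```python
# Transductive prototype refinement: start from support-set prototypes,
# score each query's confidence per class, and fold the top-k most
# confident queries into each class's prototype.

import math

def prototype(points):
    return sum(points) / len(points)

def confidence(x, protos):
    """Softmax over negative distances to each class prototype."""
    scores = {c: math.exp(-abs(x - p)) for c, p in protos.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def refine(support, queries, top_k=2):
    """Update each prototype with its top-k most confident queries."""
    protos = {c: prototype(xs) for c, xs in support.items()}
    refined = {}
    for c in protos:
        ranked = sorted(queries,
                        key=lambda x: confidence(x, protos)[c],
                        reverse=True)
        refined[c] = prototype(support[c] + ranked[:top_k])
    return refined

# Two support points per class, four unlabeled queries.
protos = refine({"a": [0.0, 1.0], "b": [9.0, 10.0]}, [0.4, 0.6, 9.2, 9.6])
```

The paper's contribution replaces the fixed top-k rule with per-query confidence weights that are themselves meta-learned, so that unreliable queries contribute less to the updated prototype.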
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.