Unsupervised and Few-shot Parsing from Pretrained Language Models
- URL: http://arxiv.org/abs/2206.04980v1
- Date: Fri, 10 Jun 2022 10:29:15 GMT
- Title: Unsupervised and Few-shot Parsing from Pretrained Language Models
- Authors: Zhiyuan Zeng and Deyi Xiong
- Abstract summary: We propose UPOA, an Unsupervised constituent Parsing model that calculates an Out Association score, derived solely from the self-attention weight matrix learned in a pretrained language model, as the syntactic distance for span segmentation.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
- Score: 56.33247845224995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained language models are generally acknowledged to be able to encode
syntax [Tenney et al., 2019, Jawahar et al., 2019, Hewitt and Manning, 2019].
In this article, we propose UPOA, an Unsupervised constituent Parsing model
that calculates an Out Association score solely based on the self-attention
weight matrix learned in a pretrained language model as the syntactic distance
for span segmentation. We further propose an enhanced version, UPIO, which
exploits both inside association and outside association scores for estimating
the likelihood of a span. Experiments with UPOA and UPIO disclose that the
linear projection matrices for the query and key in the self-attention
mechanism play an important role in parsing. We therefore extend the
unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few
annotated trees to learn better linear projection matrices for parsing.
Experiments on the Penn Treebank demonstrate that our unsupervised parsing
model UPIO achieves results comparable to the state of the art on short
sentences (length <= 10). Our few-shot parsing model FPIO trained with only 20
annotated trees outperforms a previous few-shot parsing method trained with 50
annotated trees. Experiments on cross-lingual parsing show that both
unsupervised and few-shot parsing methods are better than previous methods on
most languages of SPMRL [Seddah et al., 2013].
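As a rough illustration of the abstract's core idea, the sketch below (Python, Hugging Face Transformers) averages a pretrained LM's self-attention over heads and greedily splits each span where its two halves attend to each other least. The model name, layer choice, word-level aggregation, and this simplified split criterion are assumptions for illustration; the actual UPOA/UPIO models score spans with the learned query/key projections and the inside/outside association scores described above.
```python
# Minimal sketch, not the released UPOA/UPIO code: average a pretrained LM's
# self-attention over heads, aggregate it to word level, and greedily split each
# span at the point where its two halves attend to each other least.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-cased"  # assumption: any Transformer LM exposing attentions works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_attentions=True)
model.eval()

def attention_matrix(words, layer=-1):
    """Head-averaged attention at one layer, reduced to word level
    (first sub-token per word; [CLS]/[SEP] dropped)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        attn = model(**enc).attentions[layer][0].mean(dim=0)  # (seq, seq)
    first = {}
    for pos, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in first:
            first[wid] = pos
    idx = [first[w] for w in range(len(words))]
    return attn[idx][:, idx]  # (n_words, n_words)

def cross_association(attn, k):
    """Mean attention flowing between positions [0, k) and [k, n) of a sub-matrix."""
    left, right = list(range(k)), list(range(k, attn.size(0)))
    return (attn[left][:, right].mean() + attn[right][:, left].mean()).item() / 2

def parse(words, attn, i=0, j=None):
    """Greedy top-down segmentation: split where the two halves associate least."""
    if j is None:
        j = len(words)
    if j - i <= 1:
        return words[i]
    sub = attn[i:j, i:j]
    k = min(range(1, j - i), key=lambda s: cross_association(sub, s))
    return [parse(words, attn, i, i + k), parse(words, attn, i + k, j)]

words = "The cat sat on the mat".split()
print(parse(words, attention_matrix(words)))
```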
Related papers
- Contextual Distortion Reveals Constituency: Masked Language Models are
Implicit Parsers [7.558415495951758]
We propose a novel method for extracting parse trees from masked language models (LMs).
Our method computes a score for each span based on the distortion of contextual representations resulting from linguistic perturbations.
Our method consistently outperforms previous state-of-the-art methods on English with masked LMs, and also demonstrates superior performance in a multilingual setting.
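As a rough, hedged illustration of the perturb-and-measure idea summarized above (not the paper's exact scoring), one can mask a candidate span and take the average shift in the remaining words' contextual vectors as that span's score; the model name, first-sub-token aggregation, and plain L2 distance below are illustrative assumptions.
```python
# Rough sketch: score a span by how much masking it distorts the contextual
# representations of the surrounding words. Model choice and the L2 distance
# are illustrative assumptions, not the paper's exact recipe.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-cased")
mlm = AutoModel.from_pretrained("bert-base-cased")
mlm.eval()

def word_vectors(words):
    """Last-layer hidden state of each word's first sub-token."""
    enc = tok(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = mlm(**enc).last_hidden_state[0]
    first = {}
    for pos, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in first:
            first[wid] = pos
    return torch.stack([hidden[first[w]] for w in range(len(words))])

def distortion_score(words, i, j):
    """Average representation shift of the words outside [i, j) when the span is masked."""
    base = word_vectors(words)
    masked = words[:i] + [tok.mask_token] * (j - i) + words[j:]
    pert = word_vectors(masked)
    outside = [k for k in range(len(words)) if k < i or k >= j]
    return torch.norm(base[outside] - pert[outside], dim=-1).mean().item()

words = "The quick brown fox jumps over the lazy dog".split()
print(distortion_score(words, 1, 4))  # score for the span "quick brown fox"
```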
arXiv Detail & Related papers (2023-06-01T13:10:48Z)
- A Simple and Strong Baseline for End-to-End Neural RST-style Discourse
Parsing [44.72809363746258]
This paper explores a strong baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models.
The experimental results obtained from two benchmark datasets demonstrate that the parsing performance relies on the pretrained language models rather than the parsing strategies.
arXiv Detail & Related papers (2022-10-15T18:38:08Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding sentiment polarities.
We propose to reformulate the extraction and prediction tasks into a single sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state of the art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Probing Structured Pruning on Multilingual Pre-trained Models: Settings,
Algorithms, and Efficiency [62.0887259003594]
This work investigates three aspects of structured pruning on multilingual pre-trained language models: settings, algorithms, and efficiency.
Experiments on nine downstream tasks show several counter-intuitive phenomena.
We present Dynamic Sparsification, a simple approach that allows training the model once and adapting to different model sizes at inference.
arXiv Detail & Related papers (2022-04-06T06:29:52Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Latent Tree Learning with Ordered Neurons: What Parses Does It Produce? [2.025491206574996]
Latent tree learning models can learn constituency parsing without exposure to human-annotated tree structures.
ON-LSTM is trained on language modelling and has near-state-of-the-art performance on unsupervised parsing.
We replicate the model with different restarts and examine their parses.
arXiv Detail & Related papers (2020-10-10T07:12:48Z)
- Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
- Efficient Constituency Parsing by Pointing [21.395573911155495]
We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks.
Our model supports efficient top-down decoding and our learning objective is able to enforce structural consistency without resorting to the expensive CKY inference.
arXiv Detail & Related papers (2020-06-24T08:29:09Z)
- Towards Instance-Level Parser Selection for Cross-Lingual Transfer of
Dependency Parsers [59.345145623931636]
We argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS).
We present a proof-of-concept study focused on instance-level selection in the framework of delexicalized transfer.
arXiv Detail & Related papers (2020-04-16T13:18:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.