Unsupervised and Few-shot Parsing from Pretrained Language Models
- URL: http://arxiv.org/abs/2206.04980v1
- Date: Fri, 10 Jun 2022 10:29:15 GMT
- Title: Unsupervised and Few-shot Parsing from Pretrained Language Models
- Authors: Zhiyuan Zeng and Deyi Xiong
- Abstract summary: We propose UPOA, an Unsupervised constituent Parsing model that calculates an Out Association score, derived solely from the self-attention weight matrix learned in a pretrained language model, as the syntactic distance for span segmentation.
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
- Score: 56.33247845224995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained language models are generally acknowledged to be able to encode
syntax [Tenney et al., 2019, Jawahar et al., 2019, Hewitt and Manning, 2019].
In this article, we propose UPOA, an Unsupervised constituent Parsing model
that calculates an Out Association score solely based on the self-attention
weight matrix learned in a pretrained language model as the syntactic distance
for span segmentation. We further propose an enhanced version, UPIO, which
exploits both inside association and outside association scores for estimating
the likelihood of a span. Experiments with UPOA and UPIO disclose that the
linear projection matrices for the query and key in the self-attention
mechanism play an important role in parsing. We therefore extend the
unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few
annotated trees to learn better linear projection matrices for parsing.
Experiments on the Penn Treebank demonstrate that our unsupervised parsing
model UPIO achieves results comparable to the state of the art on short
sentences (length <= 10). Our few-shot parsing model FPIO trained with only 20
annotated trees outperforms a previous few-shot parsing method trained with 50
annotated trees. Experiments on cross-lingual parsing show that both
unsupervised and few-shot parsing methods are better than previous methods on
most languages of SPMRL [Seddah et al., 2013].
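As a rough illustration of the abstract's core idea, the sketch below (Python, Hugging Face Transformers) averages a pretrained LM's self-attention over heads and greedily splits each span where its two halves attend to each other least. The model name, layer choice, word-level aggregation, and this simplified split criterion are assumptions for illustration; the actual UPOA/UPIO models score spans with the learned query/key projections and the inside/outside association scores described above.
```python
# Minimal sketch, not the released UPOA/UPIO code: average a pretrained LM's
# self-attention over heads, aggregate it to word level, and greedily split each
# span at the point where its two halves attend to each other least.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-cased"  # assumption: any Transformer LM exposing attentions works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_attentions=True)
model.eval()

def attention_matrix(words, layer=-1):
    """Head-averaged attention at one layer, reduced to word level
    (first sub-token per word; [CLS]/[SEP] dropped)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        attn = model(**enc).attentions[layer][0].mean(dim=0)  # (seq, seq)
    first = {}
    for pos, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in first:
            first[wid] = pos
    idx = [first[w] for w in range(len(words))]
    return attn[idx][:, idx]  # (n_words, n_words)

def cross_association(attn, k):
    """Mean attention flowing between positions [0, k) and [k, n) of a sub-matrix."""
    left, right = list(range(k)), list(range(k, attn.size(0)))
    return (attn[left][:, right].mean() + attn[right][:, left].mean()).item() / 2

def parse(words, attn, i=0, j=None):
    """Greedy top-down segmentation: split where the two halves associate least."""
    if j is None:
        j = len(words)
    if j - i <= 1:
        return words[i]
    sub = attn[i:j, i:j]
    k = min(range(1, j - i), key=lambda s: cross_association(sub, s))
    return [parse(words, attn, i, i + k), parse(words, attn, i + k, j)]

words = "The cat sat on the mat".split()
print(parse(words, attention_matrix(words)))
```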
Related papers
- Contextual Distortion Reveals Constituency: Masked Language Models are
Implicit Parsers [7.558415495951758]
We propose a novel method for extracting parse trees from masked language models (LMs).
Our method computes a score for each span based on the distortion of contextual representations resulting from linguistic perturbations.
Our method consistently outperforms previous state-of-the-art methods on English with masked LMs, and also demonstrates superior performance in a multilingual setting.
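As a rough, hedged illustration of the perturb-and-measure idea summarized above (not the paper's exact scoring), one can mask a candidate span and take the average shift in the remaining words' contextual vectors as that span's score; the model name, first-sub-token aggregation, and plain L2 distance below are illustrative assumptions.
```python
# Rough sketch: score a span by how much masking it distorts the contextual
# representations of the surrounding words. Model choice and the L2 distance
# are illustrative assumptions, not the paper's exact recipe.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-cased")
mlm = AutoModel.from_pretrained("bert-base-cased")
mlm.eval()

def word_vectors(words):
    """Last-layer hidden state of each word's first sub-token."""
    enc = tok(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = mlm(**enc).last_hidden_state[0]
    first = {}
    for pos, wid in enumerate(enc.word_ids()):
        if wid is not None and wid not in first:
            first[wid] = pos
    return torch.stack([hidden[first[w]] for w in range(len(words))])

def distortion_score(words, i, j):
    """Average representation shift of the words outside [i, j) when the span is masked."""
    base = word_vectors(words)
    masked = words[:i] + [tok.mask_token] * (j - i) + words[j:]
    pert = word_vectors(masked)
    outside = [k for k in range(len(words)) if k < i or k >= j]
    return torch.norm(base[outside] - pert[outside], dim=-1).mean().item()

words = "The quick brown fox jumps over the lazy dog".split()
print(distortion_score(words, 1, 4))  # score for the span "quick brown fox"
```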
arXiv Detail & Related papers (2023-06-01T13:10:48Z)
- A Simple and Strong Baseline for End-to-End Neural RST-style Discourse
Parsing [44.72809363746258]
This paper explores a strong baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models.
The experimental results obtained from two benchmark datasets demonstrate that the parsing performance relies on the pretrained language models rather than the parsing strategies.
arXiv Detail & Related papers (2022-10-15T18:38:08Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding sentiment polarities.
We propose to reformulate the extraction and prediction tasks into a single sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state of the art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Probing Structured Pruning on Multilingual Pre-trained Models: Settings,
Algorithms, and Efficiency [62.0887259003594]
This work investigates three aspects of structured pruning on multilingual pre-trained language models: settings, algorithms, and efficiency.
Experiments on nine downstream tasks show several counter-intuitive phenomena.
We present Dynamic Sparsification, a simple approach that allows training the model once and adapting to different model sizes at inference.
arXiv Detail & Related papers (2022-04-06T06:29:52Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Latent Tree Learning with Ordered Neurons: What Parses Does It Produce? [2.025491206574996]
Latent tree learning models can learn constituency parsing without exposure to human-annotated tree structures.
ON-LSTM is trained on language modelling and has near-state-of-the-art performance on unsupervised parsing.
We replicate the model with different restarts and examine their parses.
arXiv Detail & Related papers (2020-10-10T07:12:48Z)
- Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
- Efficient Constituency Parsing by Pointing [21.395573911155495]
We propose a novel constituency parsing model that casts the parsing problem into a series of pointing tasks.
Our model supports efficient top-down decoding and our learning objective is able to enforce structural consistency without resorting to the expensive CKY inference.
arXiv Detail & Related papers (2020-06-24T08:29:09Z)
- Towards Instance-Level Parser Selection for Cross-Lingual Transfer of
Dependency Parsers [59.345145623931636]
We argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS).
We present a proof-of-concept study focused on instance-level selection in the framework of delexicalized transfer.
arXiv Detail & Related papers (2020-04-16T13:18:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.