Revisiting the Practical Effectiveness of Constituency Parse Extraction
from Pre-trained Language Models
- URL: http://arxiv.org/abs/2211.00479v1
- Date: Thu, 15 Sep 2022 09:41:19 GMT
- Title: Revisiting the Practical Effectiveness of Constituency Parse Extraction
from Pre-trained Language Models
- Authors: Taeuk Kim
- Abstract summary: Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models.
We show that CPE-PLM is more effective than typical supervised parsing in few-shot settings.
- Score: 6.850627526999892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a
recent paradigm that attempts to induce constituency parse trees relying only
on the internal knowledge of pre-trained language models. While attractive in
that, like in-context learning, it requires no task-specific fine-tuning, the
practical effectiveness of such an approach remains unclear beyond its use as
a probe for investigating
language models' inner workings. In this work, we mathematically reformulate
CPE-PLM and propose two advanced ensemble methods tailored for it,
demonstrating that the new parsing paradigm can be competitive with common
unsupervised parsers when a set of heterogeneous PLMs is combined using our
techniques. Furthermore, we explore scenarios where the trees
generated by CPE-PLM are practically useful. Specifically, we show that CPE-PLM
is more effective than typical supervised parsers in few-shot settings.
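As a concrete illustration of the paradigm, below is a minimal sketch of distance-based parse extraction from a PLM. It is not the paper's exact reformulation or its ensemble methods: the boundary score (cosine dissimilarity of adjacent word representations from the last hidden layer), the greedy top-down splitting, and the choice of `bert-base-cased` are all simplifying assumptions made for illustration.

```python
# Minimal sketch of distance-based constituency parse extraction from a PLM.
# NOT the paper's exact algorithm: the distance measure and greedy top-down
# split are simplifying assumptions; the model name is an arbitrary choice.

import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-cased"  # any encoder PLM works in principle
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()


def syntactic_distances(words):
    """Return a score d[i] for the boundary between words[i] and words[i+1].

    Higher scores mark boundaries that are more likely to separate
    constituents (here: cosine dissimilarity of adjacent word vectors).
    """
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (num_subwords, dim)
    # Pool sub-word vectors back into one vector per word.
    word_ids = enc.word_ids()
    vecs = []
    for i in range(len(words)):
        idx = [j for j, w in enumerate(word_ids) if w == i]
        vecs.append(hidden[idx].mean(dim=0))
    vecs = torch.stack(vecs)
    sims = torch.nn.functional.cosine_similarity(vecs[:-1], vecs[1:], dim=-1)
    return (1.0 - sims).tolist()  # dissimilarity as a crude syntactic distance


def build_tree(words, dists):
    """Greedy top-down binary tree induction: split at the largest distance."""
    if len(words) == 1:
        return words[0]
    k = max(range(len(dists)), key=lambda i: dists[i])
    left = build_tree(words[: k + 1], dists[:k])
    right = build_tree(words[k + 1 :], dists[k + 1 :])
    return (left, right)


sentence = "the quick brown fox jumps over the lazy dog".split()
print(build_tree(sentence, syntactic_distances(sentence)))
```

A naive analogue of the ensemble idea would be to average the boundary scores produced by several heterogeneous PLMs before building the tree; the ensemble methods proposed in the paper are more elaborate than this.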
Related papers
- Distilling Monolingual and Crosslingual Word-in-Context Representations [18.87665111304974]
We propose a method that distils representations of word meaning in context from a pre-trained language model in both monolingual and crosslingual settings.
Our method does not require human-annotated corpora nor updates of the parameters of the pre-trained model.
Our method learns to combine the outputs of different hidden layers of the pre-trained model using self-attention.
arXiv Detail & Related papers (2024-09-13T11:10:16Z)
- Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models [22.977852629450346]
We propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models.
In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture.
Our experimental results show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters.
arXiv Detail & Related papers (2023-10-24T23:29:06Z)
- PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation [61.05254852400895]
Parse-Instructed Prefix (PIP) is a novel adaptation of prefix-tuning to tune large pre-trained language models.
In contrast to traditional fine-tuning methods for this task, PIP is a compute-efficient alternative with 10 times fewer learnable parameters.
arXiv Detail & Related papers (2023-05-26T07:42:38Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning [97.28779163988833]
Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling.
We propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern on arbitrary pre-training objectives.
arXiv Detail & Related papers (2022-10-19T04:38:26Z)
- A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing [44.72809363746258]
This paper explores a strong baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models.
The experimental results obtained from two benchmark datasets demonstrate that the parsing performance relies on the pretrained language models rather than the parsing strategies.
arXiv Detail & Related papers (2022-10-15T18:38:08Z)
- Unsupervised and Few-shot Parsing from Pretrained Language Models [56.33247845224995]
We propose an Unsupervised constituent Parsing model that calculates an Out Association score solely based on the self-attention weight matrix learned in a pretrained language model (a toy sketch of this attention-based splitting idea appears after this list).
We extend the unsupervised models to few-shot parsing models that use a few annotated trees to learn better linear projection matrices for parsing.
Our few-shot parsing model FPIO trained with only 20 annotated trees outperforms a previous few-shot parsing method trained with 50 annotated trees.
arXiv Detail & Related papers (2022-06-10T10:29:15Z)
- Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency [62.0887259003594]
This work investigates three aspects of structured pruning on multilingual pre-trained language models: settings, algorithms, and efficiency.
Experiments on nine downstream tasks show several counter-intuitive phenomena.
We present Dynamic Sparsification, a simple approach that allows training the model once and adapting to different model sizes at inference.
arXiv Detail & Related papers (2022-04-06T06:29:52Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
- Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models [21.2879567125422]
We propose a novel method for extracting complete (binary) parses from pre-trained language models.
By applying our method to multilingual PLMs, we can induce non-trivial parses for sentences in nine languages.
arXiv Detail & Related papers (2020-04-08T05:42:26Z)
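The "Unsupervised and Few-shot Parsing from Pretrained Language Models" entry above relies on association scores computed from self-attention weights. The toy sketch below illustrates that general idea only: the cross-association measure, the greedy splitting, and the random matrix standing in for real attention weights are all assumptions for illustration, not the cited paper's actual scoring or few-shot training.

```python
# Toy illustration of splitting a sentence span with an association score
# derived from a self-attention matrix (NOT the cited paper's formulation).
# `attn` is assumed to be an (n_words, n_words) matrix of attention weights,
# e.g. one head of one layer, already pooled from sub-words to words.

import numpy as np


def cross_association(attn, i, k, j):
    """Mean attention flowing between words[i:k+1] and words[k+1:j+1]."""
    left, right = slice(i, k + 1), slice(k + 1, j + 1)
    return (attn[left, right].mean() + attn[right, left].mean()) / 2.0


def split_span(words, attn, i, j):
    """Recursively split [i, j] at the boundary with the weakest association."""
    if i == j:
        return words[i]
    k = min(range(i, j), key=lambda b: cross_association(attn, i, b, j))
    return (split_span(words, attn, i, k), split_span(words, attn, k + 1, j))


words = "the cat sat on the mat".split()
rng = np.random.default_rng(0)
attn = rng.random((len(words), len(words)))  # stand-in for real attention weights
print(split_span(words, attn, 0, len(words) - 1))
```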