Contextual Distortion Reveals Constituency: Masked Language Models are
Implicit Parsers
- URL: http://arxiv.org/abs/2306.00645v1
- Date: Thu, 1 Jun 2023 13:10:48 GMT
- Title: Contextual Distortion Reveals Constituency: Masked Language Models are
Implicit Parsers
- Authors: Jiaxi Li and Wei Lu
- Abstract summary: We propose a novel method for extracting parse trees from masked language models (LMs).
Our method computes a score for each span based on the distortion of contextual representations resulting from linguistic perturbations.
Our method consistently outperforms previous state-of-the-art methods on English with masked LMs, and also demonstrates superior performance in a multilingual setting.
- Score: 7.558415495951758
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in pre-trained language models (PLMs) have demonstrated
that these models possess some degree of syntactic awareness. To leverage this
knowledge, we propose a novel chart-based method for extracting parse trees
from masked language models (LMs) without the need to train separate parsers.
Our method computes a score for each span based on the distortion of contextual
representations resulting from linguistic perturbations. We design a set of
perturbations motivated by the linguistic concept of constituency tests, and
use these to score each span by aggregating the distortion scores. To produce a
parse tree, we use chart parsing to find the tree with the minimum score. Our
method consistently outperforms previous state-of-the-art methods on English
with masked LMs, and also demonstrates superior performance in a multilingual
setting, outperforming the state of the art in 6 out of 8 languages. Notably,
although our method does not involve parameter updates or extensive
hyperparameter search, its performance can even surpass some unsupervised
parsing methods that require fine-tuning. Our analysis highlights that the
distortion of contextual representation resulting from syntactic perturbation
can serve as an effective indicator of constituency across languages.
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - Transparency at the Source: Evaluating and Interpreting Language Models
With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating and interpreting neural language models, that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar, that is itself derived from a large natural language corpus.
With access to the underlying true source, our results show striking differences in learning dynamics between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z) - Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z) - BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and
Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing.
We benchmark eight language models, including two GPT-3 variants available only through an API.
Our experiments show that encoder-decoder pretrained language models can match or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z) - Multilingual Syntax-aware Language Modeling through Dependency Tree
Conversion [12.758523394180695]
We study the effect on neural language models (LMs) performance across nine conversion methods and five languages.
On average, our best model yields a 19% increase in accuracy over the worst choice across all languages.
Our experiments highlight the importance of choosing the right tree formalism, and provide insights into making an informed decision.
arXiv Detail & Related papers (2022-04-19T03:56:28Z) - On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this approach improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Multilingual Chart-based Constituency Parse Extraction from Pre-trained
Language Models [21.2879567125422]
We propose a novel method for extracting complete (binary) parses from pre-trained language models.
Applying our method to multilingual PLMs makes it possible to induce non-trivial parses for sentences in nine languages.
arXiv Detail & Related papers (2020-04-08T05:42:26Z)