Probing for Understanding of English Verb Classes and Alternations in
Large Pre-trained Language Models
- URL: http://arxiv.org/abs/2209.04811v1
- Date: Sun, 11 Sep 2022 08:04:40 GMT
- Title: Probing for Understanding of English Verb Classes and Alternations in
Large Pre-trained Language Models
- Authors: David K. Yi, James V. Bruno, Jiayu Han, Peter Zukerman, Shane
Steinert-Threlkeld
- Abstract summary: We investigate the extent to which verb alternation classes are encoded in the embeddings of Large Pre-trained Language Models.
We find that contextual embeddings from PLMs achieve astonishingly high accuracies on tasks across most classes.
- Score: 4.243426191555036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the extent to which verb alternation classes, as described by
Levin (1993), are encoded in the embeddings of Large Pre-trained Language
Models (PLMs) such as BERT, RoBERTa, ELECTRA, and DeBERTa using selectively
constructed diagnostic classifiers for word and sentence-level prediction
tasks. We follow and expand upon the experiments of Kann et al. (2019), which
aim to probe whether static embeddings encode frame-selectional properties of
verbs. At both the word and sentence level, we find that contextual embeddings
from PLMs not only outperform non-contextual embeddings, but achieve
astonishingly high accuracies on tasks across most alternation classes.
Additionally, we find evidence that the middle-to-upper layers of PLMs achieve
better performance on average than the lower layers across all probing tasks.
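To make the probing setup concrete, the following is a minimal sketch, not the authors' released code, of a diagnostic classifier: a frozen PLM supplies contextual embeddings from one chosen hidden layer, and a simple linear model is trained on top of them to predict a label. The sentences, labels, and the layer index below are illustrative placeholders, not the Levin (1993) data or the Kann et al. (2019) task setup.

```python
# Minimal diagnostic-probe sketch (assumed setup, not the paper's code):
# freeze a PLM, extract embeddings from one hidden layer, and fit a linear
# classifier on top. Sentences and labels are toy placeholders.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def sentence_embedding(sentence: str, layer: int = 8) -> torch.Tensor:
    """Mean-pool the token vectors of a single hidden layer (frozen encoder)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer].mean(dim=1).squeeze(0)

# Hypothetical sentence-level task: is this frame acceptable for the verb?
sentences = ["The vase broke.", "John broke the vase.",
             "The children laughed.", "The teacher laughed the children."]
labels = [1, 1, 1, 0]  # placeholder judgments for illustration only

X = torch.stack([sentence_embedding(s) for s in sentences]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.score(X, labels))
```

To mirror the layer-wise comparison reported in the abstract, the same probe would simply be retrained for each value of `layer` and the resulting accuracies compared.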
Related papers
- Data-Augmentation-Based Dialectal Adaptation for LLMs [26.72394783468532]
This report presents GMUNLP's participation in the Dialect-Copa shared task at VarDial 2024.
The task focuses on evaluating the commonsense reasoning capabilities of large language models (LLMs) on South Slavic micro-dialects.
We propose an approach that combines the strengths of different types of language models and leverages data augmentation techniques to improve task performance.
arXiv Detail & Related papers (2024-04-11T19:15:32Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source-language PLM and the static word embeddings of a target language.
We show that, while the proposed framework is competitive with weak baselines on MoSECroT, it falls short of some strong baselines.
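As a rough sketch of the relative-representation idea mentioned above (the general technique, not the MoSECroT implementation), each embedding can be re-expressed by its cosine similarities to a shared set of anchor points, so that a contextual source-language space and a static target-language space end up in a comparable frame; anchor selection and cross-lingual anchor pairing are omitted here.

```python
# Hedged illustration of relative representations (general idea only):
# re-describe every vector by its cosine similarity to a set of anchors,
# so two embedding spaces with paired anchors become directly comparable.
import numpy as np

def relative_representation(vectors: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """Map (n, d) vectors to (n, n_anchors) cosine similarities with anchors."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return v @ a.T

# Toy spaces of different dimensionality land in the same anchor-indexed space.
rng = np.random.default_rng(0)
source = relative_representation(rng.normal(size=(5, 768)), rng.normal(size=(10, 768)))
target = relative_representation(rng.normal(size=(5, 300)), rng.normal(size=(10, 300)))
assert source.shape == target.shape == (5, 10)
```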
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to handle longer context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z)
- Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks [0.0]
We test a wider array of methods for probing CLMs to predict typicality scores.
Our experiments, using BERT, show the importance of using the right type of CLM probes.
Results highlight the importance of polysemy in this task.
arXiv Detail & Related papers (2023-02-14T09:57:23Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Don't Judge a Language Model by Its Last Layer: Contrastive Learning with Layer-Wise Attention Pooling [6.501126898523172]
Recent pre-trained language models (PLMs) have achieved great success on many natural language processing tasks by learning linguistic features and contextualized sentence representations.
This paper introduces an attention-based pooling strategy that enables the model to preserve the layer-wise signals captured in each layer and to learn digested linguistic features for downstream tasks.
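A hedged sketch of the layer-wise attention-pooling idea (an assumed minimal variant, not the paper's released model): rather than using only the last layer, learn softmax weights over the [CLS] vector of every encoder layer and pool them into one sentence representation.

```python
# Sketch of attention pooling over layers (assumed variant, not the paper's code):
# score each layer's [CLS] vector, softmax the scores, and take the weighted sum.
import torch
import torch.nn as nn

class LayerAttentionPooler(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)  # one attention score per layer

    def forward(self, hidden_states):
        # hidden_states: tuple of (batch, seq_len, hidden_dim), one per layer,
        # e.g. the output_hidden_states tuple of a Hugging Face encoder.
        cls_per_layer = torch.stack([h[:, 0, :] for h in hidden_states], dim=1)
        weights = torch.softmax(self.scorer(cls_per_layer), dim=1)  # (batch, layers, 1)
        return (weights * cls_per_layer).sum(dim=1)                 # (batch, hidden_dim)
```

The pooled vector would then feed a downstream head or a contrastive objective, which is where the contrastive learning in the paper's title would come in.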
arXiv Detail & Related papers (2022-09-13T13:09:49Z)
- Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been devised to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- Toward Better Storylines with Sentence-Level Language Models [54.91921545103256]
We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
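As a toy illustration of the selection setup described above (not the paper's trained sentence-level model), one can embed the story context and each candidate continuation with any frozen encoder and pick the candidate whose embedding is most similar; the encoder, sentences, and similarity score below are placeholder choices.

```python
# Toy candidate ranking (placeholder setup, not the paper's model): embed the
# context and each candidate ending, then choose the most similar candidate.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden layer into a single sentence vector."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

context = "Anna packed her bags and drove to the airport."
candidates = ["She boarded her flight to Rome.",
              "The recipe called for two cups of flour."]

scores = [F.cosine_similarity(embed(context), embed(c), dim=0).item() for c in candidates]
print(candidates[scores.index(max(scores))])
```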
arXiv Detail & Related papers (2020-05-11T16:54:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.