Picking BERT's Brain: Probing for Linguistic Dependencies in
Contextualized Embeddings Using Representational Similarity Analysis
- URL: http://arxiv.org/abs/2011.12073v1
- Date: Tue, 24 Nov 2020 13:19:06 GMT
- Title: Picking BERT's Brain: Probing for Linguistic Dependencies in
Contextualized Embeddings Using Representational Similarity Analysis
- Authors: Michael A. Lepori, R. Thomas McCoy
- Abstract summary: We investigate the degree to which a verb embedding encodes the verb's subject, a pronoun embedding encodes the pronoun's antecedent, and a full-sentence representation encodes the sentence's head word.
In all cases, we show that BERT's contextualized embeddings reflect the linguistic dependency being studied, and that BERT encodes these dependencies to a greater degree than it encodes less linguistically-salient controls.
- Score: 13.016284599828232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the name implies, contextualized representations of language are typically
motivated by their ability to encode context. Which aspects of context are
captured by such representations? We introduce an approach to address this
question using Representational Similarity Analysis (RSA). As case studies, we
investigate the degree to which a verb embedding encodes the verb's subject, a
pronoun embedding encodes the pronoun's antecedent, and a full-sentence
representation encodes the sentence's head word (as determined by a dependency
parse). In all cases, we show that BERT's contextualized embeddings reflect the
linguistic dependency being studied, and that BERT encodes these dependencies
to a greater degree than it encodes less linguistically-salient controls. These
results demonstrate the ability of our approach to adjudicate between
hypotheses about which aspects of context are encoded in representations of
language.
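To make the approach concrete, here is a minimal sketch of the core RSA computation in Python (using numpy and scipy); the random arrays are stand-ins for real BERT embeddings and for a hypothesis-driven reference model, and are assumptions for illustration rather than the authors' materials.

```python
# Minimal RSA sketch (illustrative, not the authors' exact pipeline):
# build a representational dissimilarity matrix (RDM) for each set of
# representations of the same stimuli, then correlate their upper triangles.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

def rdm(representations: np.ndarray) -> np.ndarray:
    """Pairwise cosine dissimilarities between stimulus representations."""
    return squareform(pdist(representations, metric="cosine"))

def rsa_score(reps_a: np.ndarray, reps_b: np.ndarray) -> float:
    """Spearman correlation between the two RDMs' upper triangles."""
    rdm_a, rdm_b = rdm(reps_a), rdm(reps_b)
    iu = np.triu_indices_from(rdm_a, k=1)
    return spearmanr(rdm_a[iu], rdm_b[iu]).correlation

# Stand-in data: 20 stimuli, BERT verb embeddings vs. a hypothetical
# reference model that encodes each verb's subject.
bert_reps = np.random.randn(20, 768)
hypothesis_reps = np.random.randn(20, 8)
print(f"RSA (Spearman rho) = {rsa_score(bert_reps, hypothesis_reps):.3f}")
```

Because RSA compares similarity structure directly, no probe classifier is trained, which sidesteps the probe-capacity concerns that arise in supervised probing.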
Related papers
- Semantics or spelling? Probing contextual word embeddings with orthographic noise [4.622165486890317]
It remains unclear exactly what information is encoded in the hidden states of pretrained language models (PLMs).
Surprisingly, we find that contextual word embeddings (CWEs) generated by popular PLMs are highly sensitive to noise in input data.
This suggests that CWEs capture information unrelated to word-level meaning and can be manipulated through trivial modifications of the input, as the sketch below illustrates.
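The following is a minimal sketch of this kind of noise probe, not the paper's exact protocol; the model choice, mean-pooling, and example sentences are illustrative assumptions.

```python
# Hedged sketch: measure how far a contextual embedding moves when the input
# is orthographically perturbed (mean-pooling over tokens is a crude proxy
# used here for brevity).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool the final-layer hidden states into a single vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

clean = embed("The doctor examined the patient.")
noised = embed("The docter examined the patient.")  # one-character misspelling
similarity = torch.nn.functional.cosine_similarity(clean, noised, dim=0)
print(f"cosine(clean, noised) = {similarity.item():.3f}")
```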
arXiv Detail & Related papers (2024-08-08T02:07:25Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
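As a rough illustration of this pipeline (the prompt wording and the model are my assumptions; the paper uses a more capable LLM and its own prompts), one could elicit inferred propositions like this:

```python
# Hypothetical sketch: prompt a language model for propositions a reader
# could infer from a text. gpt2 is only a lightweight stand-in here and
# will not follow instructions as well as the large models the paper uses.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = (
    "Text: 'We can't keep letting them in.'\n"
    "Propositions a reader could reasonably infer from this text:\n- "
)
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```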
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings [0.0]
We study a way of combining two of the most successful routes to the meaning of language, statistical language models and symbolic semantics formalisms, in the task of semantic parsing.
We explore the utility of incorporating pretrained context-aware word embeddings, such as BERT and RoBERTa, in the problem of parsing.
arXiv Detail & Related papers (2022-06-13T15:05:24Z)
- Latent Topology Induction for Understanding Contextualized Representations [84.7918739062235]
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
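As a toy illustration of discrete latent states (this k-means discretization is a stand-in of mine, not the paper's induction method), one can cluster contextualized token states and inspect their occupancy:

```python
# Toy stand-in: discretize contextualized token embeddings into a small
# inventory of latent states with k-means and count tokens per state.
import numpy as np
from sklearn.cluster import KMeans

token_states = np.random.randn(500, 768)  # placeholder for real BERT token states
labels = KMeans(n_clusters=16, n_init=10, random_state=0).fit_predict(token_states)
print(np.bincount(labels))  # tokens assigned to each latent state
```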
arXiv Detail & Related papers (2022-06-03T11:22:48Z)
- Do Context-Aware Translation Models Pay the Right Attention? [61.25804242929533]
Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask several questions, among them: what contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
arXiv Detail & Related papers (2021-05-14T17:32:24Z)
- Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction [61.4913233397155]
We show that BERT uses information about relative clause (RC) spans during agreement prediction, following the linguistically correct strategy.
We also find that counterfactual representations generated for a specific RC subtype influence number prediction in sentences with other RC subtypes, suggesting that information about RC boundaries is encoded abstractly in BERT's representations.
arXiv Detail & Related papers (2021-05-14T17:11:55Z)
- Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT [7.057643880514415]
We investigate how Multilingual BERT (mBERT) encodes grammar by examining how the higher-order grammatical feature of morphosyntactic alignment is manifested across the embedding spaces of different languages.
arXiv Detail & Related papers (2021-01-26T19:21:59Z)
- Pareto Probing: Trading Off Accuracy for Complexity [87.09294772742737]
We argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance.
Our experiments with dependency parsing reveal a wide gap in syntactic knowledge between contextual and non-contextual representations.
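A small sketch of the underlying Pareto idea in plain Python (the probe numbers are hypothetical, and this is not the paper's exact metric): keep only probes that no other probe beats on both complexity and accuracy.

```python
# Pareto-front sketch: a probe is kept iff no other probe is at least as
# simple and at least as accurate (and differs from it).
from typing import List, Tuple

def pareto_front(probes: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """probes are (complexity, accuracy); lower complexity and higher accuracy win."""
    return sorted(
        (c, a)
        for c, a in probes
        if not any(c2 <= c and a2 >= a and (c2, a2) != (c, a) for c2, a2 in probes)
    )

# Hypothetical probe results: (parameter count, dev accuracy)
probes = [(1e3, 0.71), (1e4, 0.78), (1e5, 0.79), (1e4, 0.74), (1e6, 0.79)]
print(pareto_front(probes))  # [(1000.0, 0.71), (10000.0, 0.78), (100000.0, 0.79)]
```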
arXiv Detail & Related papers (2020-10-05T17:27:31Z)
- Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve [49.878051587667244]
We examine the performance of several variants of LSTM-CRF architectures for named entity recognition.
We find that context representations do contribute to system performance, but that the main factor driving high performance is learning the name tokens themselves.
We enlist human annotators to evaluate whether entity types can be inferred from the context alone and find that, although people also fail to infer the entity type for the majority of the errors made by the context-only system, there is some room for improvement.
arXiv Detail & Related papers (2020-04-09T14:37:12Z)