Spying on your neighbors: Fine-grained probing of contextual embeddings
for information about surrounding words
- URL: http://arxiv.org/abs/2005.01810v1
- Date: Mon, 4 May 2020 19:34:46 GMT
- Title: Spying on your neighbors: Fine-grained probing of contextual embeddings
for information about surrounding words
- Authors: Josef Klafka and Allyson Ettinger
- Abstract summary: We introduce a suite of probing tasks that enable fine-grained testing of contextual embeddings for encoding of information about surrounding words.
We examine the popular BERT, ELMo and GPT contextual encoders and find that each of our tested information types is indeed encoded as contextual information across tokens.
We discuss implications of these results for how different types of models break down and prioritize word-level context information when constructing token embeddings.
- Score: 12.394077144994617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although models using contextual word embeddings have achieved
state-of-the-art results on a host of NLP tasks, little is known about exactly
what information these embeddings encode about the context words that they are
understood to reflect. To address this question, we introduce a suite of
probing tasks that enable fine-grained testing of contextual embeddings for
encoding of information about surrounding words. We apply these tasks to
examine the popular BERT, ELMo and GPT contextual encoders, and find that each
of our tested information types is indeed encoded as contextual information
across tokens, often with near-perfect recoverability, but the encoders vary in
which features they distribute to which tokens, how nuanced their distributions
are, and how robust the encoding of each feature is to distance. We discuss
implications of these results for how different types of models break down and
prioritize word-level context information when constructing token embeddings.
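The probing setup described above can be illustrated with a small, self-contained sketch. The code below is a minimal illustration rather than the authors' implementation: it assumes Hugging Face transformers and scikit-learn are available, uses BERT as the encoder, a hypothetical four-sentence toy dataset, and a simple linear probe that tries to recover a feature of the following word (grammatical number) from the contextual embedding of a different token.
```python
# Minimal sketch of a neighbor-probing task, assuming the `transformers`
# and scikit-learn packages; dataset and probe choice are illustrative.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# Toy probe data: sentence, whitespace index of the probed token, and a
# binary feature of its right-hand neighbor (1 = plural, 0 = singular).
examples = [
    ("the chef sees the dogs", 3, 1),   # embed "the", predict number of "dogs"
    ("the chef sees the dog", 3, 0),
    ("a girl likes those cars", 3, 1),
    ("a girl likes that car", 3, 0),
]

def token_embedding(sentence, word_idx):
    """Return the contextual embedding of the word at `word_idx`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, dim)
    # Map the whitespace word index to its first wordpiece position.
    piece_pos = enc.word_ids(0).index(word_idx)
    return hidden[piece_pos].numpy()

X = [token_embedding(s, i) for s, i, _ in examples]
y = [label for _, _, label in examples]

# Linear probe: if the neighbor's feature is recoverable from this token's
# embedding, it is encoded there as contextual information.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```
In the paper's terms, high probe accuracy on held-out data would indicate that the neighboring word's feature is encoded as contextual information in the probed token's embedding; the toy data here is only meant to show the shape of the pipeline.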
Related papers
- Learning Robust Named Entity Recognizers From Noisy Data With Retrieval Augmentation [67.89838237013078]
Named entity recognition (NER) models often struggle with noisy inputs.
We propose a more realistic setting in which only noisy text and its NER labels are available.
We employ a multi-view training framework that improves NER robustness without retrieving text during inference.
arXiv Detail & Related papers (2024-07-26T07:30:41Z) - Dissecting Paraphrases: The Impact of Prompt Syntax and Supplementary Information on Knowledge Retrieval from Pretrained Language Models [8.588056811772693]
ConPARE-LAMA is a probe consisting of 34 million distinct prompts that facilitate comparison across minimal paraphrases.
ConPARE-LAMA enables insights into the independent impact of either syntactical form or semantic information of paraphrases on the knowledge retrieval performance of PLMs.
arXiv Detail & Related papers (2024-04-02T14:35:08Z) - Text-To-KG Alignment: Comparing Current Methods on Classification Tasks [2.191505742658975]
Knowledge graphs (KGs) provide dense and structured representations of factual information.
Recent work has focused on creating pipeline models that retrieve information from KGs as additional context.
It is not known how current methods compare to a scenario where the aligned subgraph is completely relevant to the query.
arXiv Detail & Related papers (2023-06-05T13:45:45Z) - What Are You Token About? Dense Retrieval as Distributions Over the
Vocabulary [68.77983831618685]
We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space.
We show that the resulting projections contain rich semantic information, and draw a connection between them and sparse retrieval (a minimal sketch of this projection appears after the related-papers list below).
arXiv Detail & Related papers (2022-12-20T16:03:25Z) - Python Code Generation by Asking Clarification Questions [57.63906360576212]
In this work, we introduce a novel and more realistic setup for this task.
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
We collect and introduce a new dataset named CodeClarQA containing pairs of natural language descriptions and code, along with synthetic clarification questions and answers.
arXiv Detail & Related papers (2022-12-19T22:08:36Z) - Span Classification with Structured Information for Disfluency Detection
in Spoken Utterances [47.05113261111054]
We propose a novel architecture for detecting disfluencies in transcripts from spoken utterances.
Our proposed model achieves state-of-the-art results on the widely used English Switchboard corpus for disfluency detection.
arXiv Detail & Related papers (2022-03-30T03:22:29Z) - KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization
for Relation Extraction [111.74812895391672]
We propose a Knowledge-aware Prompt-tuning approach with synergistic optimization (KnowPrompt).
We inject latent knowledge contained in relation labels into prompt construction with learnable virtual type words and answer words.
arXiv Detail & Related papers (2021-04-15T17:57:43Z) - On the Evolution of Syntactic Information Encoded by BERT's
Contextualized Representations [11.558645364193486]
In this paper, we analyze the evolution of the embedded syntax trees over the course of fine-tuning BERT for six different tasks.
Experimental results show that the encoded information is forgotten (PoS tagging), reinforced (dependency and constituency parsing), or preserved (semantics-related tasks) in different ways during fine-tuning, depending on the task.
arXiv Detail & Related papers (2021-01-27T15:41:09Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely used large-scale dataset for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - A Survey on Contextual Embeddings [48.04732268018772]
Contextual embeddings assign each word a representation based on its context, capturing uses of words across varied contexts and encoding knowledge that transfers across languages.
We review existing contextual embedding models, cross-lingual polyglot pre-training, the application of contextual embeddings in downstream tasks, model compression, and model analyses.
arXiv Detail & Related papers (2020-03-16T15:22:22Z)
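As a companion to the "What Are You Token About?" entry above, the sketch below shows one way to project a dense representation into a model's vocabulary space. It assumes Hugging Face transformers; using plain BERT with its masked-language-modeling head as a stand-in for the paper's dual encoders, and the example query, are illustrative assumptions rather than the paper's actual setup.
```python
# Minimal sketch: project a dense query vector into BERT's vocabulary space
# via the MLM head; BERT here is an assumed stand-in for a dual encoder.
import torch
from transformers import AutoTokenizer, BertForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

query = "what is the capital of france"
enc = tokenizer(query, return_tensors="pt")
with torch.no_grad():
    hidden = model.bert(**enc).last_hidden_state   # (1, seq_len, dim)
    query_vec = hidden[:, 0]                       # [CLS] vector as the dense query
    logits = model.cls(query_vec)                  # project into vocabulary space
probs = torch.softmax(logits, dim=-1)

# The highest-probability vocabulary items give a readable view of the vector.
top = torch.topk(probs[0], k=10)
print(tokenizer.convert_ids_to_tokens(top.indices.tolist()))
```
Inspecting the top-ranked vocabulary items gives a human-readable view of what the dense vector emphasizes, which is the kind of interpretation, and connection to sparse retrieval, that the paper draws.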
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.