Unsupervised Distillation of Syntactic Information from Contextualized Word Representations
- URL: http://arxiv.org/abs/2010.05265v2
- Date: Thu, 11 Mar 2021 20:41:09 GMT
- Title: Unsupervised Distillation of Syntactic Information from Contextualized Word Representations
- Authors: Shauli Ravfogel, Yanai Elazar, Jacob Goldberger, Yoav Goldberg
- Abstract summary: We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
- Score: 62.230491683411536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextualized word representations, such as ELMo and BERT, were shown to
perform well on various semantic and syntactic tasks. In this work, we tackle
the task of unsupervised disentanglement between semantics and structure in
neural language representations: we aim to learn a transformation of the
contextualized vectors that discards the lexical semantics but keeps the
structural information. To this end, we automatically generate groups of
sentences which are structurally similar but semantically different, and use a
metric-learning approach to learn a transformation that emphasizes the
structural component that is encoded in the vectors. We demonstrate that our
transformation clusters vectors in space by structural properties, rather than
by lexical semantics. Finally, we demonstrate the utility of our distilled
representations by showing that they outperform the original contextualized
representations in a few-shot parsing setting.
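To make the metric-learning step concrete, below is a minimal sketch in PyTorch: a linear projection of contextualized vectors trained with a triplet objective, so that vectors drawn from structurally equivalent (but semantically different) sentences are pulled together while vectors from structurally different sentences are pushed apart. The class name, dimensions, and random stand-in batch are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of the metric-learning idea: learn a linear map of
# contextualized vectors that keeps structural information and discards
# lexical semantics. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn

EMB_DIM = 768    # e.g. BERT-base hidden size (assumption)
PROJ_DIM = 128   # size of the distilled "syntactic" space (assumption)

class StructuralProjector(nn.Module):
    """Linear transformation applied to contextualized word vectors."""
    def __init__(self, emb_dim: int = EMB_DIM, proj_dim: int = PROJ_DIM):
        super().__init__()
        self.proj = nn.Linear(emb_dim, proj_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

model = StructuralProjector()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Anchor and positive come from structurally equivalent sentences,
# the negative from a structurally different one.
loss_fn = nn.TripletMarginLoss(margin=1.0)

def training_step(anchor_vec, positive_vec, negative_vec):
    """One update on a batch of (anchor, positive, negative) vectors."""
    optimizer.zero_grad()
    loss = loss_fn(model(anchor_vec), model(positive_vec), model(negative_vec))
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in batch: in practice these would be ELMo/BERT vectors of corresponding
# word positions in automatically generated, structurally equivalent sentences.
batch = 32
anchor = torch.randn(batch, EMB_DIM)
positive = torch.randn(batch, EMB_DIM)
negative = torch.randn(batch, EMB_DIM)
print(training_step(anchor, positive, negative))
```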
Related papers
- How well do distributed representations convey contextual lexical semantics: a Thesis Proposal [3.3585951129432323]
In this thesis, we examine the efficacy of distributed representations from modern neural networks in encoding lexical meaning.
We identify four sources of ambiguity based on the relatedness and similarity of meanings influenced by context.
We then aim to evaluate these sources by collecting or constructing multilingual datasets, leveraging various language models, and employing linguistic analysis tools.
arXiv Detail & Related papers (2024-06-02T14:08:51Z)
- Contextualized word senses: from attention to compositionality [0.10878040851637999]
We propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words.
Particular attention is given to dependency relations and semantic notions such as selection preferences and paradigmatic classes.
arXiv Detail & Related papers (2023-12-01T16:04:00Z)
- Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations [80.45474362071236]
It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space.
We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings.
arXiv Detail & Related papers (2023-05-24T00:44:49Z)
- Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
- Latent Topology Induction for Understanding Contextualized Representations [84.7918739062235]
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
arXiv Detail & Related papers (2022-06-03T11:22:48Z)
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers [86.64972552583941]
We put forward a sequence-tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights into differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z)
- Transferring Semantic Knowledge Into Language Encoders [6.85316573653194]
We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
arXiv Detail & Related papers (2021-10-14T14:11:12Z)
- Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts.
It hypothesizes that objects with similar appearance share similar representations.
Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z)
- Disentangling semantics in language through VAEs and a certain architectural choice [1.8907108368038217]
We train a Variational Autoencoder to translate the sentence to a fixed number of hierarchically structured latent variables.
We show that varying the corresponding latent variables results in varying these elements in sentences, and that swapping them between pairs of sentences leads to the expected partial semantic swap.
arXiv Detail & Related papers (2020-12-24T00:01:40Z)
- A Self-supervised Representation Learning of Sentence Structure for Authorship Attribution [3.5991811164452923]
We propose a self-supervised framework for learning structural representations of sentences.
We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task.
arXiv Detail & Related papers (2020-10-14T02:57:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.