Contextual Lensing of Universal Sentence Representations
- URL: http://arxiv.org/abs/2002.08866v1
- Date: Thu, 20 Feb 2020 17:06:27 GMT
- Title: Contextual Lensing of Universal Sentence Representations
- Authors: Jamie Kiros
- Abstract summary: We propose Contextual Lensing, a methodology for inducing context-oriented universal sentence vectors.
We show that it is possible to focus notions of language similarity into a small number of lens parameters given a core universal matrix representation.
- Score: 4.847980206213336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: What makes a universal sentence encoder universal? The notion of a generic
encoder of text appears to be at odds with the inherent contextualization and
non-permanence of language use in a dynamic world. However, mapping sentences
into generic fixed-length vectors for downstream similarity and retrieval tasks
has been fruitful, particularly for multilingual applications. How do we manage
this dilemma? In this work we propose Contextual Lensing, a methodology for
inducing context-oriented universal sentence vectors. We break the construction
of universal sentence vectors into a core, variable length, sentence matrix
representation equipped with an adaptable 'lens' from which fixed-length
vectors can be induced as a function of the lens context. We show that it is
possible to focus notions of language similarity into a small number of lens
parameters given a core universal matrix representation. For example, we
demonstrate the ability to encode translation similarity of sentences across
several languages into a single weight matrix, even when the core encoder has
not seen parallel data.
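A minimal sketch of the lens idea, assuming the lens is a single trainable weight matrix used for context-dependent attention pooling over a frozen token matrix; the exact lens form here is an illustrative assumption, not the paper's specification:

```python
# Sketch: a frozen "core" encoder yields a variable-length sentence matrix H;
# a small trainable "lens" (here: one weight matrix driving attention pooling)
# induces a fixed-length vector. Only the lens would be trained, e.g. on
# translation pairs, while the core stays frozen.
import torch
import torch.nn as nn

class Lens(nn.Module):
    """Collapses a (T, d) sentence matrix into a fixed d-dimensional vector."""
    def __init__(self, dim: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(dim, dim) * 0.02)  # the lens parameters

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # H: (T, d) token representations from the frozen core encoder.
        scores = H @ self.W @ H.mean(dim=0)   # (T,) context-dependent weights
        alpha = torch.softmax(scores, dim=0)  # attention over tokens
        return alpha @ H                      # (d,) fixed-length sentence vector

H = torch.randn(12, 768)     # stand-in for a frozen encoder's token matrix
sentence_vec = Lens(768)(H)  # torch.Size([768])
```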
Related papers
- On Affine Homotopy between Language Encoders [127.55969928213248]
We study the properties of affine alignment of language encoders.
We find that while affine alignment is fundamentally an asymmetric notion of similarity, it is still informative of extrinsic similarity.
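A hedged sketch of affine alignment between two encoders' embedding spaces, assuming an ordinary least-squares estimator (the fitting procedure is an illustrative choice); note that fitting X to Y generally differs from fitting Y to X, matching the asymmetry noted above:

```python
# Fit min_{W,b} ||X W + b - Y||_F by least squares; the residual measures
# how well encoder A's space maps affinely onto encoder B's.
import numpy as np

def affine_align(X: np.ndarray, Y: np.ndarray):
    """X: (n, d1) embeddings from encoder A; Y: (n, d2) from encoder B."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # absorb the bias into the map
    Wb, *_ = np.linalg.lstsq(X1, Y, rcond=None)    # (d1 + 1, d2)
    residual = np.linalg.norm(X1 @ Wb - Y)         # alignment error
    return Wb, residual

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(100, 64)), rng.normal(size=(100, 32))
_, err_xy = affine_align(X, Y)
_, err_yx = affine_align(Y, X)  # generally != err_xy: the notion is asymmetric
```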
arXiv Detail & Related papers (2024-06-04T13:58:28Z)
- Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations [102.05351905494277]
Sub-sentence encoder is a contrastively-learned contextual embedding model for fine-grained semantic representation of text.
We show that sub-sentence encoders keep the same inference cost and space complexity as sentence encoders.
arXiv Detail & Related papers (2023-11-07T20:38:30Z) - Lexinvariant Language Models [84.2829117441298]
Token embeddings, a mapping from discrete lexical symbols to continuous vectors, are at the heart of any language model (LM).
We study lexinvariant language models that are invariant to lexical symbols and therefore do not need fixed token embeddings in practice.
We show that a lexinvariant LM can attain perplexity comparable to that of a standard language model, given a sufficiently long context.
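A sketch of the lexinvariant setup, assuming random per-sequence embeddings stand in for a fixed token-embedding table (the sampling scheme is an assumption based on the summary above):

```python
# Each distinct symbol in a sequence gets a random embedding drawn fresh per
# sequence, so the model can only exploit co-occurrence structure in context.
import torch

def lexinvariant_embed(token_ids: torch.Tensor, dim: int) -> torch.Tensor:
    """token_ids: (T,) integer ids. Returns (T, dim) per-sequence embeddings."""
    unique, inverse = torch.unique(token_ids, return_inverse=True)
    table = torch.randn(len(unique), dim) / dim ** 0.5  # resampled every call
    return table[inverse]  # same symbol -> same vector *within* this sequence

emb = lexinvariant_embed(torch.tensor([5, 9, 5, 2]), dim=16)
assert torch.equal(emb[0], emb[2])  # repeated symbol shares its random vector
```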
arXiv Detail & Related papers (2023-05-24T19:10:46Z)
- Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence [37.63047048491312]
We propose a generative embedding inversion attack (GEIA) that aims to reconstruct input sequences based only on their sentence embeddings.
Given the black-box access to a language model, we treat sentence embeddings as initial tokens' representations and train or fine-tune a powerful decoder model to decode the whole sequences directly.
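A heavily hedged sketch of the attack pattern: condition a generative decoder on the victim model's sentence embedding and train it to reconstruct the input (the GRU decoder and all dimensions are illustrative assumptions, not the paper's exact setup):

```python
import torch
import torch.nn as nn

class InversionDecoder(nn.Module):
    """Decodes token sequences from a fixed sentence embedding."""
    def __init__(self, emb_dim: int, vocab: int, hidden: int = 512):
        super().__init__()
        self.proj = nn.Linear(emb_dim, hidden)  # embedding -> initial state
        self.tok = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, sent_emb: torch.Tensor, tgt_ids: torch.Tensor):
        # sent_emb: (B, emb_dim) black-box embeddings; tgt_ids: (B, T) targets.
        h0 = torch.tanh(self.proj(sent_emb)).unsqueeze(0)  # (1, B, hidden)
        y, _ = self.rnn(self.tok(tgt_ids), h0)             # teacher forcing
        return self.out(y)  # (B, T, vocab) logits, trained with cross-entropy
```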
arXiv Detail & Related papers (2023-05-04T17:31:41Z)
- Discrete Cosine Transform as Universal Sentence Encoder [10.355894890759377]
We use the Discrete Cosine Transform (DCT) to generate universal sentence representations for different languages.
The experimental results clearly show the superior effectiveness of DCT encoding.
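A minimal sketch of DCT sentence encoding consistent with the summary: apply a type-II DCT along the word axis of the embedding matrix and keep the first K coefficient rows, giving a fixed-length vector for any sentence length (K and the zero-padding of short sentences are assumptions):

```python
import numpy as np
from scipy.fft import dct

def dct_sentence_vector(word_embs: np.ndarray, k: int = 4) -> np.ndarray:
    """word_embs: (T, d) word embeddings; returns a (k*d,) sentence vector."""
    coeffs = dct(word_embs, type=2, axis=0, norm="ortho")  # (T, d) spectrum
    if coeffs.shape[0] < k:  # zero-pad sentences shorter than k words
        pad = np.zeros((k - coeffs.shape[0], coeffs.shape[1]))
        coeffs = np.vstack([coeffs, pad])
    return coeffs[:k].ravel()  # low-frequency summary of the whole sentence

vec = dct_sentence_vector(np.random.randn(12, 300), k=4)  # shape (1200,)
```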
arXiv Detail & Related papers (2021-06-02T04:43:54Z)
- A Simple Geometric Method for Cross-Lingual Linguistic Transformations with Pre-trained Autoencoders [11.506062545971568]
Powerful sentence encoders trained for multiple languages are on the rise.
These systems are capable of embedding a wide range of linguistic properties into vector representations.
We investigate the use of a geometric mapping in embedding space to transform linguistic properties.
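A minimal sketch of one such geometric mapping, assuming a simple mean-offset shift between two property classes (e.g., singular to plural); the offset form is an illustrative instance, not the paper's full method:

```python
import numpy as np

def property_offset(src_class: np.ndarray, tgt_class: np.ndarray) -> np.ndarray:
    """Both args: (n, d) sentence embeddings exhibiting each property."""
    return tgt_class.mean(axis=0) - src_class.mean(axis=0)

def transform(v: np.ndarray, offset: np.ndarray) -> np.ndarray:
    return v + offset  # move the embedding toward the target property
```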
arXiv Detail & Related papers (2021-04-08T09:33:50Z)
- Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval [51.60862829942932]
We present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved.
However, peak performance is not reached with general-purpose multilingual text encoders used off the shelf, but rather with variants that have been further specialized for sentence understanding tasks.
arXiv Detail & Related papers (2021-01-21T00:15:38Z)
- Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embedding different levels of linguistic units in a uniform vector space.
We present our approach to constructing analogy datasets for words, phrases, and sentences.
We empirically verify that well pre-trained Transformer models combined with appropriate training settings can effectively yield universal representations.
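A sketch of the analogy test implied above, applied identically to word-, phrase-, and sentence-level vectors in one shared space (the cosine nearest-neighbor protocol is a standard assumption):

```python
import numpy as np

def solve_analogy(a, b, c, candidates: np.ndarray) -> int:
    """Answer a : b :: c : ? by nearest neighbor of b - a + c under cosine."""
    q = b - a + c
    q = q / np.linalg.norm(q)
    C = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return int(np.argmax(C @ q))  # index of the best-matching candidate
```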
arXiv Detail & Related papers (2020-09-10T03:53:18Z)
- Inducing Language-Agnostic Multilingual Representations [61.97381112847459]
Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.
We examine three approaches for this: (i) re-aligning the vector spaces of target languages to a pivot source language; (ii) removing language-specific means and variances, which yields better discriminativeness of embeddings as a by-product; and (iii) increasing input similarity across languages by removing morphological contractions and sentence reordering.
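Approach (ii) admits a compact sketch: standardize each language's embeddings by removing that language's mean and variance before comparing across languages (the small epsilon is an implementation detail assumed here):

```python
import numpy as np

def standardize_per_language(embs: np.ndarray) -> np.ndarray:
    """embs: (n, d) embeddings from ONE language; returns normalized copies."""
    mu = embs.mean(axis=0, keepdims=True)
    sigma = embs.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
    return (embs - mu) / sigma  # language identity no longer dominates the space
```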
arXiv Detail & Related papers (2020-08-20T17:58:56Z)
- Discovering Useful Sentence Representations from Large Pretrained Language Models [8.212920842986689]
We explore whether pretrained language models can be adapted for use as universal decoders.
For large transformer-based language models trained on vast amounts of English text, we investigate whether such representations can be easily discovered.
We present and compare three representation injection techniques for transformer-based models and three accompanying methods which map sentences to and from this representation space.
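A heavily hedged sketch of one hypothetical injection technique of the kind surveyed above: add a projected sentence vector to a chosen layer's hidden states so decoding can be steered from that representation (the additive placement and all names are assumptions, not the paper's three techniques):

```python
import torch
import torch.nn as nn

class AdditiveInjector(nn.Module):
    """Injects a fixed-length sentence vector into transformer hidden states."""
    def __init__(self, sent_dim: int, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(sent_dim, hidden_dim)

    def forward(self, hidden_states: torch.Tensor, sent_vec: torch.Tensor):
        # hidden_states: (B, T, hidden); sent_vec: (B, sent_dim)
        return hidden_states + self.proj(sent_vec).unsqueeze(1)  # broadcast over T
```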
arXiv Detail & Related papers (2020-08-20T16:03:51Z)
- On Learning Language-Invariant Representations for Universal Machine Translation [33.40094622605891]
Universal machine translation aims to learn to translate between any pair of languages.
We prove certain impossibilities of this endeavor in general, and positive results in the presence of additional (but natural) structure in the data.
We believe our theoretical insights and implications contribute to the future algorithmic design of universal machine translation.
arXiv Detail & Related papers (2020-08-11T04:45:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.