Learning Universal Representations from Word to Sentence
- URL: http://arxiv.org/abs/2009.04656v1
- Date: Thu, 10 Sep 2020 03:53:18 GMT
- Title: Learning Universal Representations from Word to Sentence
- Authors: Yian Li, Hai Zhao
- Abstract summary: This work introduces and explores universal representation learning, i.e., embedding linguistic units of different levels in a uniform vector space.
We present our approach to constructing analogy datasets of words, phrases and sentences.
We empirically verify that well-pre-trained Transformer models, combined with appropriate training settings, can effectively yield universal representations.
- Score: 89.82415322763475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite well-developed, cutting-edge representation learning for language,
most language representation models focus on a specific level of
linguistic unit, which causes great inconvenience when
multiple levels of linguistic objects must be handled in a unified way. This work
therefore introduces and explores universal representation learning, i.e., embeddings
of different levels of linguistic unit in a uniform vector space, through a
task-independent evaluation. We present our approach to constructing analogy
datasets of words, phrases and sentences, and experiment with multiple
representation models to examine geometric properties of the learned vector
space. We then empirically verify that well-pre-trained Transformer models,
combined with appropriate training settings, can effectively yield universal
representations. In particular, our implementation, which fine-tunes ALBERT on NLI and
PPDB datasets, achieves the highest accuracy on analogy tasks at different
language levels. Further experiments on an insurance FAQ task show the
effectiveness of universal representation models in real-world applications.
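The analogy evaluation described in the abstract can be illustrated with a minimal sketch. It assumes the common vector-offset formulation (given a : b :: c : ?, pick the candidate whose embedding is closest by cosine similarity to b - a + c); the abstract does not state the paper's exact scoring protocol, so this is an assumption. The names `cosine`, `solve_analogy`, `analogy_accuracy` and the `toy` vectors are hypothetical and hand-made for illustration. Because the keys are plain strings, the same code applies to words, phrases or sentences once they share one vector space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    emb maps a linguistic unit (word, phrase or sentence string) to its vector.
    The three query units are excluded from the candidates, as is conventional.
    """
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [k for k in emb if k not in (a, b, c)]
    return max(candidates, key=lambda k: cosine(emb[k], target))


def analogy_accuracy(emb, quadruples):
    """Fraction of (a, b, c, d) quadruples for which the predicted unit equals d."""
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in quadruples)
    return correct / len(quadruples)


# Hand-made toy vectors (an assumption for illustration, not model output).
toy = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.2, -1.0],
}

if __name__ == "__main__":
    print(solve_analogy(toy, "man", "woman", "king"))
    print(analogy_accuracy(toy, [("man", "woman", "king", "queen")]))
```

In practice the `emb` dictionary would be filled with vectors produced by the representation model under test, and the quadruples would come from an analogy dataset at the chosen granularity.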
Related papers
- Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
arXiv Detail & Related papers (2023-10-18T12:32:07Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
- Pre-training Universal Language Representation [46.51685959045527]
This work introduces universal language representation learning, i.e., embeddings of different levels of linguistic units or text with quite diverse lengths in a uniform vector space.
We empirically verify that a well-designed pre-training scheme may effectively yield universal language representation.
arXiv Detail & Related papers (2021-05-30T09:29:01Z)
- High-dimensional distributed semantic spaces for utterances [0.2907403645801429]
This paper describes a model for high-dimensional representation of utterance- and text-level data.
It is based on a mathematically principled and behaviourally plausible approach to representing linguistic information.
The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality.
arXiv Detail & Related papers (2021-04-01T12:09:47Z)
- BURT: BERT-inspired Universal Representation from Learning Meaningful Segment [46.51685959045527]
This work introduces and explores universal representation learning, i.e., embedding linguistic units of different levels in a uniform vector space.
We present a universal representation model, BURT, to encode different levels of linguistic unit into the same vector space.
Specifically, we extract and mask meaningful segments based on point-wise mutual information (PMI) to incorporate different granular objectives into the pre-training stage.
arXiv Detail & Related papers (2020-12-28T16:02:28Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT-inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.