Using Paraphrases to Study Properties of Contextual Embeddings
- URL: http://arxiv.org/abs/2207.05553v1
- Date: Tue, 12 Jul 2022 14:22:05 GMT
- Title: Using Paraphrases to Study Properties of Contextual Embeddings
- Authors: Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea
- Abstract summary: We use paraphrases as a unique source of data to analyze contextualized embeddings.
Because paraphrases naturally encode consistent word and phrase semantics, they provide a unique lens for investigating properties of embeddings.
We find that contextual embeddings effectively handle polysemous words, but give synonyms surprisingly different representations in many cases.
- Score: 46.84861591608146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We use paraphrases as a unique source of data to analyze contextualized
embeddings, with a particular focus on BERT. Because paraphrases naturally
encode consistent word and phrase semantics, they provide a unique lens for
investigating properties of embeddings. Using the Paraphrase Database's
alignments, we study words within paraphrases as well as phrase
representations. We find that contextual embeddings effectively handle
polysemous words, but give synonyms surprisingly different representations in
many cases. We confirm previous findings that BERT is sensitive to word order,
but find slightly different patterns than prior work in terms of the level of
contextualization across BERT's layers.
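To make the kind of comparison described in the abstract concrete, here is a minimal sketch (using the Hugging Face transformers library, not the authors' released code) that embeds one aligned word in a paraphrase pair with BERT and measures cosine similarity. The example sentences, the aligned word, and the mean-pooling over wordpieces are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch (not the authors' code): compare BERT's contextual
# embeddings of one aligned word across a paraphrase pair.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def word_embedding(sentence: str, word: str) -> torch.Tensor:
    """Mean-pool the final-layer vectors of `word`'s wordpieces in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    piece_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Naive alignment: locate the word's wordpiece span in the sentence.
    for i in range(len(ids) - len(piece_ids) + 1):
        if ids[i:i + len(piece_ids)] == piece_ids:
            return hidden[i:i + len(piece_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in {sentence!r}")

# Aligned word "purchased" in two paraphrases of the same sentence.
a = word_embedding("She purchased the old house.", "purchased")
b = word_embedding("The old house was purchased by her.", "purchased")
print(torch.cosine_similarity(a, b, dim=0).item())
```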
Related papers
- Span-Aggregatable, Contextualized Word Embeddings for Effective Phrase Mining [0.22499166814992438]
We show that when target phrases reside inside noisy context, representing the full sentence with a single dense vector is not sufficient for effective phrase retrieval.
We show that aggregating contextualized word embeddings over candidate sub-spans is much more effective for phrase mining, yet requires considerable compute to obtain useful span representations.
arXiv Detail & Related papers (2024-05-12T12:08:05Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair (a minimal sketch of this feature appears after the list).
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents [19.035917264711664]
We propose a training strategy for text semantic matching by disentangling keywords from intents.
Our approach can be easily combined with pre-trained language models (PLMs) without influencing their inference efficiency.
arXiv Detail & Related papers (2022-03-06T07:48:24Z)
- Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words [50.11559460111882]
We explore the possibility of developing a BERT-style pretrained model over a vocabulary of words instead of wordpieces.
Results show that, compared to standard wordpiece-based BERT, WordBERT makes significant improvements on cloze tests and machine reading comprehension.
Since the pipeline is language-independent, we also train WordBERT for Chinese and obtain significant gains on five natural language understanding datasets.
arXiv Detail & Related papers (2022-02-24T15:15:48Z)
- Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration [25.159601117722936]
We propose a contrastive fine-tuning objective that enables BERT to produce more powerful phrase embeddings (a generic sketch of this kind of objective follows the list).
Our approach relies on a dataset of diverse phrasal paraphrases, which is automatically generated using a paraphrase generation model.
As a case study, we show that Phrase-BERT embeddings can be easily integrated with a simple autoencoder to build a phrase-based neural topic model.
arXiv Detail & Related papers (2021-09-13T20:31:57Z)
- Improving Paraphrase Detection with the Adversarial Paraphrasing Task [0.0]
Paraphrasing datasets currently rely on a sense of paraphrase based on word overlap and syntax.
We introduce a new adversarial method of dataset creation for paraphrase identification: the Adversarial Paraphrasing Task (APT).
APT asks participants to generate semantically equivalent (in the sense of mutually implicative) but lexically and syntactically disparate paraphrases.
arXiv Detail & Related papers (2021-06-14T18:15:20Z)
- UCPhrase: Unsupervised Context-aware Quality Phrase Tagging [63.86606855524567]
UCPhrase is a novel unsupervised context-aware quality phrase tagger.
We induce high-quality phrase spans as silver labels from consistently co-occurring word sequences (see the sketch after this list).
We show that our design is superior to state-of-the-art pre-trained, unsupervised, and distantly supervised methods.
arXiv Detail & Related papers (2021-05-28T19:44:24Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
arXiv Detail & Related papers (2020-12-30T15:38:26Z)
- Does BERT Understand Sentiment? Leveraging Comparisons Between Contextual and Non-Contextual Embeddings to Improve Aspect-Based Sentiment Models [0.0]
We show that training a model to compare a contextual embedding from BERT with a generic word embedding can be used to infer sentiment (see the sketch after this list).
We also show that fine-tuning a subset of this comparison model's weights achieves state-of-the-art results for polarity detection on Aspect-Based Sentiment Classification datasets.
arXiv Detail & Related papers (2020-11-23T19:12:31Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
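For the Textual Entailment Recognition entry above, the element-wise Manhattan distance feature is simple to state in code. The sketch below is an assumed reading of that feature: given sentence embeddings for the text and hypothesis, the feature vector is their element-wise absolute difference, fed to a small classifier. The embedding source and classifier here are illustrative stand-ins, not the paper's exact setup.

```python
# Hedged sketch of an element-wise Manhattan-distance feature for
# text-hypothesis pairs; encoder and classifier are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def manhattan_feature(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Element-wise |u - v| between two sentence embeddings."""
    return np.abs(u - v)

# Toy stand-in embeddings (in practice: BERT or similar sentence vectors).
rng = np.random.default_rng(0)
text_vecs = rng.normal(size=(100, 64))
hyp_vecs = rng.normal(size=(100, 64))
labels = rng.integers(0, 2, size=100)  # 1 = entailment, 0 = not

features = manhattan_feature(text_vecs, hyp_vecs)
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.score(features, labels))
```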
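The Phrase-BERT entry describes a contrastive fine-tuning objective over phrasal paraphrases. The following is a generic sketch of such an objective, an InfoNCE-style loss with in-batch negatives; it is one common formulation, not necessarily Phrase-BERT's exact loss, and the random tensors stand in for real phrase embeddings.

```python
# Generic contrastive (InfoNCE-style) loss over phrase-paraphrase pairs;
# a sketch of the kind of objective described, not the paper's exact loss.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor: torch.Tensor, paraphrase: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """anchor, paraphrase: (batch, dim) phrase embeddings; positives are
    row-aligned, every other row in the batch serves as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(paraphrase, dim=-1)
    logits = a @ p.T / temperature     # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0))  # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

# Toy usage with random stand-in "phrase embeddings".
print(contrastive_loss(torch.randn(8, 768), torch.randn(8, 768)).item())
```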
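The UCPhrase entry induces silver labels from consistently co-occurring word sequences. A rough, assumed approximation of that idea: count n-grams within a document and keep those that repeat often enough to suggest a quality phrase. This deliberately ignores the paper's context-aware machinery and is only meant to make the notion of silver labels concrete.

```python
# Rough approximation of mining "consistently co-occurring word sequences"
# as silver phrase labels; UCPhrase's actual method is more sophisticated.
from collections import Counter

def silver_phrases(tokens: list[str], max_n: int = 3,
                   min_count: int = 2) -> set[tuple[str, ...]]:
    """Keep n-grams (2..max_n) that occur at least min_count times."""
    counts = Counter()
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {ngram for ngram, c in counts.items() if c >= min_count}

doc = ("contextual embeddings capture meaning , and contextual embeddings "
       "vary across layers").split()
print(silver_phrases(doc))  # {('contextual', 'embeddings')}
```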
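Finally, the "Does BERT Understand Sentiment?" entry infers sentiment by comparing a contextual BERT embedding with a generic, non-contextual word embedding. A hypothetical sketch of what such a comparison model could look like: concatenate the two vectors and their difference, then classify. The feature layout and the classification head are assumptions for illustration.

```python
# Hypothetical comparison head: sentiment from a contextual BERT vector
# versus a non-contextual embedding of the same word (layout assumed).
import torch
import torch.nn as nn

class ComparisonSentimentHead(nn.Module):
    def __init__(self, dim: int = 768, n_classes: int = 3):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(3 * dim, 128), nn.ReLU(), nn.Linear(128, n_classes))

    def forward(self, contextual: torch.Tensor,
                static: torch.Tensor) -> torch.Tensor:
        # Compare the two views of the word: both vectors plus their difference.
        feats = torch.cat([contextual, static, contextual - static], dim=-1)
        return self.ff(feats)

head = ComparisonSentimentHead()
logits = head(torch.randn(4, 768), torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 3])
```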