Disentangling Semantics and Syntax in Sentence Embeddings with Pre-trained Language Models
- URL: http://arxiv.org/abs/2104.05115v1
- Date: Sun, 11 Apr 2021 21:34:46 GMT
- Title: Disentangling Semantics and Syntax in Sentence Embeddings with Pre-trained Language Models
- Authors: James Y. Huang, Kuan-Hao Huang, Kai-Wei Chang
- Abstract summary: ParaBART is a semantic sentence embedding model that learns to disentangle semantics and syntax in sentence embeddings obtained by pre-trained language models.
ParaBART is trained to perform syntax-guided paraphrasing, based on a source sentence that shares semantics with the target paraphrase, and a parse tree that specifies the target syntax.
- Score: 32.003787396501075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models have achieved huge success on a wide range of NLP
tasks. However, contextual representations from pre-trained models contain
entangled semantic and syntactic information, and therefore cannot be directly
used to derive useful semantic sentence embeddings for some tasks. Paraphrase
pairs offer an effective way of learning the distinction between semantics and
syntax, as they naturally share semantics and often vary in syntax. In this
work, we present ParaBART, a semantic sentence embedding model that learns to
disentangle semantics and syntax in sentence embeddings obtained by pre-trained
language models. ParaBART is trained to perform syntax-guided paraphrasing,
based on a source sentence that shares semantics with the target paraphrase,
and a parse tree that specifies the target syntax. In this way, ParaBART learns
disentangled semantic and syntactic representations from their respective
inputs with separate encoders. Experiments in English show that ParaBART
outperforms state-of-the-art sentence embedding models on unsupervised semantic
similarity tasks. Additionally, we show that our approach can effectively
remove syntactic information from semantic sentence embeddings, leading to
better robustness against syntactic variation on downstream semantic tasks.
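For intuition, the sketch below shows a ParaBART-style dual-encoder paraphraser in PyTorch: one encoder reads the source sentence (semantics), a second reads a linearized target parse (syntax), and a decoder generates the paraphrase from both, while only the semantic encoder's pooled output serves as the sentence embedding. Toy nn.Transformer modules stand in for the pre-trained BART components; all class names, sizes, and the pooling choice are illustrative assumptions, not the paper's implementation.

```python
# Illustrative ParaBART-style dual-encoder sketch (not the authors' code).
import torch
import torch.nn as nn

class ToyParaBART(nn.Module):
    def __init__(self, vocab_size=1000, syn_vocab_size=100, d_model=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.syn_emb = nn.Embedding(syn_vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.sem_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)  # semantics from the source sentence
        self.syn_encoder = nn.TransformerEncoder(enc_layer, num_layers=1)  # syntax from the linearized parse
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, src_tokens, parse_tokens, tgt_tokens):
        sem = self.sem_encoder(self.tok_emb(src_tokens))      # (B, S, d)
        syn = self.syn_encoder(self.syn_emb(parse_tokens))    # (B, P, d)
        memory = torch.cat([sem, syn], dim=1)                 # decoder attends to both inputs
        out = self.decoder(self.tok_emb(tgt_tokens), memory)  # causal mask omitted for brevity
        return self.lm_head(out)                              # token logits for the paraphrase

    def sentence_embedding(self, src_tokens):
        # Mean-pool the semantic encoder only: this is the embedding used for similarity tasks.
        return self.sem_encoder(self.tok_emb(src_tokens)).mean(dim=1)

model = ToyParaBART()
src = torch.randint(0, 1000, (2, 10))   # source sentences (token ids)
parse = torch.randint(0, 100, (2, 15))  # linearized target parse trees
tgt = torch.randint(0, 1000, (2, 12))   # target paraphrases
logits = model(src, parse, tgt)
print(logits.shape, model.sentence_embedding(src).shape)
```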
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbations such as typos and word-order shuffling, which resonate with human cognitive patterns and allow the perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
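As a rough illustration of such perturbations (not the paper's actual procedure), the snippet below applies character-swap typos and local word-order shuffling to a sentence; the rates and window sizes are arbitrary.

```python
# Toy text perturbations: character-swap typos and local word-order shuffling.
import random

def add_typo(sentence: str, rate: float = 0.1) -> str:
    """Randomly swap adjacent characters inside words to simulate typos."""
    chars = list(sentence)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and random.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def shuffle_words(sentence: str, window: int = 3) -> str:
    """Shuffle words within small windows, keeping global structure roughly intact."""
    words = sentence.split()
    out = []
    for i in range(0, len(words), window):
        chunk = words[i:i + window]
        random.shuffle(chunk)
        out.extend(chunk)
    return " ".join(out)

s = "pre-trained language models learn contextual representations"
print(add_typo(s))
print(shuffle_words(s))
```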
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Breaking Down Word Semantics from Pre-trained Language Models through Layer-wise Dimension Selection [0.0]
This paper aims to disentangle word-sense information in BERT by applying a binary mask to intermediate outputs across its layers.
The disentangled embeddings are evaluated through binary classification to determine if the target word in two different sentences has the same meaning.
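A hedged sketch of this masking idea, with random tensors standing in for BERT's middle-layer states and a sigmoid relaxation in place of a learned hard binary mask:

```python
# Relaxed binary mask over a middle-layer representation; stand-in tensors, illustrative sizes.
import torch
import torch.nn as nn

hidden_size = 768
mask_logits = nn.Parameter(torch.zeros(hidden_size))  # one learnable mask per layer in practice

def masked(h):
    # Sigmoid relaxation of a binary mask; thresholding (or a straight-through
    # estimator) would give hard 0/1 dimension selection at inference time.
    return h * torch.sigmoid(mask_logits)

# Stand-ins for the target word's middle-layer vectors in two different sentences.
h_word_sent1 = torch.randn(hidden_size)
h_word_sent2 = torch.randn(hidden_size)

score = torch.cosine_similarity(masked(h_word_sent1), masked(h_word_sent2), dim=0)
same_sense = score > 0.5  # threshold is illustrative; the paper trains a classifier instead
print(float(score), bool(same_sense))
```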
arXiv Detail & Related papers (2023-10-08T11:07:19Z)
- On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z)
- Transferring Semantic Knowledge Into Language Encoders [6.85316573653194]
We introduce semantic form mid-tuning, an approach for transferring semantic knowledge from semantic meaning representations into language encoders.
We show that this alignment can be learned implicitly via classification or directly via triplet loss.
Our method yields language encoders that demonstrate improved predictive performance across inference, reading comprehension, textual similarity, and other semantic tasks.
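A minimal sketch of the triplet-loss variant, with random tensors standing in for the encoder's sentence embeddings and the semantic-form embeddings:

```python
# Triplet loss pulling sentence embeddings toward matching semantic forms (stand-in tensors).
import torch
import torch.nn.functional as F

anchor   = torch.randn(8, 256, requires_grad=True)  # sentence embeddings from the language encoder
positive = torch.randn(8, 256)                      # embeddings of the matching semantic forms
negative = torch.randn(8, 256)                      # embeddings of mismatched semantic forms

loss = F.triplet_margin_loss(anchor, positive, negative, margin=1.0)
loss.backward()   # in a real setup, gradients flow back into the encoder
print(float(loss))
```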
arXiv Detail & Related papers (2021-10-14T14:11:12Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
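The toy example below illustrates the second half of that pipeline under invented canonical patterns: a controlled-sublanguage utterance is mapped deterministically to a meaning representation, which is where a constrained language model's paraphrase would feed in.

```python
# Toy mapping from a canonical (controlled English) utterance to a logical form.
import re

def canonical_to_lf(utterance: str) -> str:
    m = re.fullmatch(r"papers whose author is (.+)", utterance)
    if m:
        return f"filter(papers, author == '{m.group(1)}')"
    m = re.fullmatch(r"papers published in (\d{4})", utterance)
    if m:
        return f"filter(papers, year == {m.group(1)})"
    raise ValueError("utterance is not in the canonical sublanguage")

# A constrained LM would paraphrase a free-form question into a canonical form like these.
print(canonical_to_lf("papers whose author is Kai-Wei Chang"))
print(canonical_to_lf("papers published in 2021"))
```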
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
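A minimal graph-convolution step over a toy semantic-dependency graph, as a stand-in for the convolutional graph encoders mentioned above (the adjacency, features, and normalization are illustrative):

```python
# One graph-convolution step mixing token features along a toy semantic-dependency graph.
import torch
import torch.nn as nn

n_tokens, d = 5, 32
A = torch.zeros(n_tokens, n_tokens)
A[0, 2] = A[2, 0] = 1.0        # toy semantic-dependency edges
A[1, 2] = A[2, 1] = 1.0
A[2, 4] = A[4, 2] = 1.0
A = A + torch.eye(n_tokens)    # self-loops
A_hat = A / A.sum(dim=1, keepdim=True)   # simple row normalization

H = torch.randn(n_tokens, d)   # token features (e.g. from a pre-trained encoder)
W = nn.Linear(d, d)
H_graph = torch.relu(A_hat @ W(H))       # semantics-infused token representations
print(H_graph.shape)
```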
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Cross-lingual Word Sense Disambiguation using mBERT Embeddings with Syntactic Dependencies [0.0]
Cross-lingual word sense disambiguation (WSD) tackles the challenge of disambiguating ambiguous words across languages given context.
The BERT embedding model has proven effective at capturing the contextual information of words.
This project investigates how syntactic information can be added to BERT embeddings to obtain word embeddings that incorporate both semantics and syntax.
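One simple, hypothetical way to realize such a combination is to concatenate a learned dependency-relation embedding to the contextual vector of the target word, as sketched below with stand-in tensors and an invented relation inventory:

```python
# Concatenating a dependency-relation embedding to a (stand-in) mBERT word vector.
import torch
import torch.nn as nn

DEP_RELATIONS = ["nsubj", "obj", "amod", "nmod", "root"]  # toy relation inventory
dep_emb = nn.Embedding(len(DEP_RELATIONS), 32)

bert_vec = torch.randn(768)                               # stand-in for the mBERT word vector
rel_id = DEP_RELATIONS.index("nsubj")                     # the word's dependency relation
rel_vec = dep_emb(torch.tensor([rel_id]))[0]              # (32,)
combined = torch.cat([bert_vec, rel_vec], dim=-1)         # semantics- and syntax-aware vector
print(combined.shape)                                     # torch.Size([800])
```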
arXiv Detail & Related papers (2020-12-09T20:22:11Z)
- On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited.
We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks.
We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
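The paper learns normalizing flows; as a much simpler illustration of the same goal, the sketch below whitens a synthetic anisotropic embedding cloud into a roughly isotropic one (the approach later popularized as BERT-whitening). This is a substitute technique for illustration, not the paper's method.

```python
# Whitening a synthetic anisotropic embedding cloud toward an isotropic Gaussian.
import torch

emb = torch.randn(1000, 768) * torch.linspace(0.1, 5.0, 768)  # anisotropic stand-in embeddings

mu = emb.mean(dim=0, keepdim=True)
cov = torch.cov((emb - mu).T)                                  # (768, 768) covariance
U, S, _ = torch.linalg.svd(cov)
W = U @ torch.diag(S.clamp(min=1e-8).rsqrt())                  # whitening transform
emb_iso = (emb - mu) @ W                                       # ~zero mean, ~identity covariance

print(emb.std(dim=0).max() / emb.std(dim=0).min())             # large ratio: anisotropic
print(emb_iso.std(dim=0).max() / emb_iso.std(dim=0).min())     # ~1: roughly isotropic
```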
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
- Semantics-Aware Inferential Network for Natural Language Understanding [79.70497178043368]
We propose a Semantics-Aware Inferential Network (SAIN) for natural language understanding.
Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues.
Our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
arXiv Detail & Related papers (2020-04-28T07:24:43Z)