Cross-lingual Word Sense Disambiguation using mBERT Embeddings with
Syntactic Dependencies
- URL: http://arxiv.org/abs/2012.05300v1
- Date: Wed, 9 Dec 2020 20:22:11 GMT
- Title: Cross-lingual Word Sense Disambiguation using mBERT Embeddings with
Syntactic Dependencies
- Authors: Xingran Zhu
- Abstract summary: Cross-lingual word sense disambiguation (WSD) tackles the challenge of disambiguating ambiguous words across languages given context.
The pre-trained BERT embedding model has been proven effective at extracting the contextual information of words.
This project investigates how syntactic information can be added to the BERT embeddings to produce word embeddings that incorporate both semantics and syntax.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cross-lingual word sense disambiguation (WSD) tackles the challenge of
disambiguating ambiguous words across languages, given their context. The
pre-trained BERT embedding model has been proven effective at extracting the
contextual information of words, and has been incorporated as a feature into
many state-of-the-art WSD systems. To investigate how syntactic information can
be added to the BERT embeddings to produce word embeddings that incorporate both
semantics and syntax, this project proposes concatenated embeddings, produced by
generating dependency parse trees and encoding the relative relationships of
words into the input embeddings. Two methods are also proposed to reduce the
size of the concatenated embeddings. The experimental results show that the high
dimensionality of the syntax-incorporated embeddings constitutes an obstacle for
the classification task, which needs to be further addressed in future studies.
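As a rough illustration of the idea (not the author's released code), the sketch below concatenates mBERT contextual word vectors with a one-hot encoding of dependency relations and applies PCA as one possible size-reduction step. The dependency label set, the toy sentence and labels, and the use of PCA are illustrative assumptions rather than the paper's exact pipeline.

```python
# Minimal sketch, assuming a one-hot dependency encoding and PCA reduction;
# the paper's actual encoding and reduction methods may differ.
import numpy as np
import torch
from sklearn.decomposition import PCA
from transformers import AutoModel, AutoTokenizer

DEP_LABELS = ["nsubj", "obj", "amod", "advmod", "root"]  # toy subset of UD relations

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed_words(sentence: str) -> np.ndarray:
    """Mean-pool mBERT subword vectors into one contextual vector per word."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]        # (num_subwords, 768)
    word_ids = enc.word_ids()                             # subword -> word index
    n_words = max(i for i in word_ids if i is not None) + 1
    vecs = [hidden[[j for j, w in enumerate(word_ids) if w == i]].mean(0)
            for i in range(n_words)]
    return torch.stack(vecs).numpy()

def dep_one_hot(labels):
    """One-hot encode one dependency relation label per word."""
    vecs = np.zeros((len(labels), len(DEP_LABELS)))
    for i, lab in enumerate(labels):
        vecs[i, DEP_LABELS.index(lab)] = 1.0
    return vecs

# Toy example; in practice the labels would come from a dependency parser.
sentence = "Banks raise interest"
dep_labels = ["nsubj", "root", "obj"]
semantic = embed_words(sentence)                  # (3, 768)
syntactic = dep_one_hot(dep_labels)               # (3, 5)
concatenated = np.hstack([semantic, syntactic])   # semantics + syntax features

# One plausible way to reduce the size of the concatenated embeddings.
reduced = PCA(n_components=2).fit_transform(concatenated)
print(concatenated.shape, reduced.shape)
```

Even this toy version shows why dimensionality becomes an issue: every extra syntactic feature widens the concatenated vector that the downstream classifier has to handle.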
Related papers
- Mitigating Semantic Leakage in Cross-lingual Embeddings via Orthogonality Constraint [6.880579537300643]
Current disentangled representation learning methods suffer from semantic leakage.
We propose a novel training objective, ORthogonAlity Constraint LEarning (ORACLE).
ORACLE builds upon two components: intra-class clustering and inter-class separation.
We demonstrate that training with the ORACLE objective effectively reduces semantic leakage and enhances semantic alignment within the embedding space.
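A minimal PyTorch sketch of an orthogonality-style objective in the spirit of the two components summarized above; the exact ORACLE loss is not reproduced here, and the weighting and random toy data are assumptions.

```python
# Hedged sketch: pull embeddings toward their class centroid (intra-class
# clustering) and push different class centroids toward orthogonality
# (inter-class separation). Illustrative only, not the paper's formulation.
import torch
import torch.nn.functional as F

def orthogonality_objective(embeddings, labels, sep_weight=1.0):
    classes = labels.unique()
    centroids = torch.stack([embeddings[labels == c].mean(0) for c in classes])

    # Intra-class clustering: embeddings should sit close to their own centroid.
    intra = torch.stack([((embeddings[labels == c] - centroids[i]) ** 2).sum(-1).mean()
                         for i, c in enumerate(classes)]).mean()

    # Inter-class separation: drive pairwise centroid cosine similarities to 0.
    normed = F.normalize(centroids, dim=-1)
    sims = normed @ normed.T
    off_diag = sims - torch.eye(len(classes))
    inter = (off_diag ** 2).sum() / max(len(classes) * (len(classes) - 1), 1)

    return intra + sep_weight * inter

# Toy usage with random embeddings and three classes.
emb = torch.randn(12, 16)
lab = torch.randint(0, 3, (12,))
print(orthogonality_objective(emb, lab))
```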
arXiv Detail & Related papers (2024-09-24T02:01:52Z) - Breaking Down Word Semantics from Pre-trained Language Models through
Layer-wise Dimension Selection [0.0]
This paper aims to disentangle semantic sense from BERT by applying a binary mask to middle outputs across the layers.
The disentangled embeddings are evaluated through binary classification to determine if the target word in two different sentences has the same meaning.
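The following sketch only illustrates the masking-and-binary-classification setup described above: a binary mask is applied to a middle-layer BERT representation of the target word, and "same sense / different sense" is decided by thresholding cosine similarity. In the paper the mask is learned per layer and dimension; here it is random, and the layer, threshold, and sentences are assumptions.

```python
# Hedged sketch with a random stand-in for the learned binary mask.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def target_vector(sentence: str, target: str, layer: int = 6) -> torch.Tensor:
    """Hidden state of the first subword of `target` at a middle layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]    # (seq_len, 768)
    target_id = tokenizer.encode(target, add_special_tokens=False)[0]
    pos = (enc["input_ids"][0] == target_id).nonzero()[0, 0]
    return hidden[pos]

mask = (torch.rand(768) > 0.5).float()   # stand-in for a learned binary mask
v1 = target_vector("She sat on the bank of the river.", "bank") * mask
v2 = target_vector("He deposited cash at the bank.", "bank") * mask
same_sense = torch.cosine_similarity(v1, v2, dim=0) > 0.7   # toy threshold
print(bool(same_sense))
```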
arXiv Detail & Related papers (2023-10-08T11:07:19Z) - Word Sense Induction with Knowledge Distillation from BERT [6.88247391730482]
This paper proposes a method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in a context.
Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings.
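A small sketch of the attention-over-senses idea named above: a word is represented as an attention-weighted mixture of candidate sense vectors, with attention computed from a contextual vector. The sense vectors, temperature, and toy data are assumptions; the paper's distillation objective is not reproduced.

```python
# Hedged sketch of attending over candidate sense vectors given context.
import torch
import torch.nn.functional as F

def sense_mixture(context_vec, sense_vecs, temperature=1.0):
    """context_vec: (d,); sense_vecs: (num_senses, d)."""
    scores = sense_vecs @ context_vec / temperature    # (num_senses,)
    attn = F.softmax(scores, dim=0)                    # attention over senses
    return attn, attn @ sense_vecs                     # weights, mixed vector

# Toy usage: three candidate senses for a word, one contextual vector.
senses = torch.randn(3, 8)
context = torch.randn(8)
weights, mixed = sense_mixture(context, senses)
print(weights, mixed.shape)
```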
arXiv Detail & Related papers (2023-04-20T21:05:35Z) - Multilingual Word Sense Disambiguation with Unified Sense Representation [55.3061179361177]
We propose building knowledge-based and supervision-based Multilingual Word Sense Disambiguation (MWSD) systems.
We build unified sense representations for multiple languages and address the annotation scarcity problem for MWSD by transferring annotations from resource-rich languages to lower-resourced ones.
Evaluations on the SemEval-13 and SemEval-15 datasets demonstrate the effectiveness of our methodology.
arXiv Detail & Related papers (2022-10-14T01:24:03Z) - Integrating Language Guidance into Vision-based Deep Metric Learning [78.18860829585182]
We propose to learn metric spaces which encode semantic similarities as embedding space distances.
These spaces should be transferable to classes beyond those seen during training.
However, relying on class labels alone causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes.
arXiv Detail & Related papers (2022-03-16T11:06:50Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions at their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
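The sketch below only illustrates the mask-and-predict step: a word shared by two overlapping texts is masked in each, the masked-LM distributions at that position are read off, and they are compared with a KL divergence. The exact NDD formulation and its aggregation over the longest common sequence are not reproduced; the sentences and the choice of shared word are assumptions.

```python
# Hedged sketch of mask-and-predict with a BERT masked LM.
import torch
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def masked_log_distribution(text: str, word: str) -> torch.Tensor:
    """Log MLM distribution at the (first, masked) occurrence of `word`."""
    word_id = tokenizer.encode(word, add_special_tokens=False)[0]
    enc = tokenizer(text, return_tensors="pt")
    pos = (enc["input_ids"][0] == word_id).nonzero()[0, 0]
    enc["input_ids"][0, pos] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = mlm(**enc).logits[0, pos]
    return F.log_softmax(logits, dim=-1)

# Two highly overlapped texts; compare what each context predicts at a shared word.
log_p = masked_log_distribution("The movie was surprisingly good.", "surprisingly")
log_q = masked_log_distribution("The movie was surprisingly bad.", "surprisingly")
divergence = F.kl_div(log_q, log_p, reduction="sum", log_target=True)  # KL(p || q)
print(float(divergence))
```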
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution [76.615287796753]
We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
arXiv Detail & Related papers (2021-07-11T21:25:56Z) - Disentangling Semantics and Syntax in Sentence Embeddings with
Pre-trained Language Models [32.003787396501075]
ParaBART is a semantic sentence embedding model that learns to disentangle semantics and syntax in sentence embeddings obtained by pre-trained language models.
ParaBART is trained to perform syntax-guided paraphrasing, based on a source sentence that shares semantics with the target paraphrase, and a parse tree that specifies the target syntax.
arXiv Detail & Related papers (2021-04-11T21:34:46Z) - A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z) - Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
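As a schematic of the distillation objective described above, the sketch below matches a student's predicted distribution over words to a teacher distribution with a KL loss. The teacher probabilities here are random placeholders standing in for the syntactic LM's approximate marginal; the paper's marginalization over parses is not shown.

```python
# Hedged sketch of word-level distribution distillation with a KL loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs):
    """KL(teacher || student), averaged over positions.

    student_logits: (positions, vocab) raw scores from the student model.
    teacher_probs:  (positions, vocab) target distributions from the teacher.
    """
    log_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_student, teacher_probs, reduction="batchmean")

# Toy usage with random tensors standing in for real model outputs.
vocab, positions = 100, 4
student = torch.randn(positions, vocab, requires_grad=True)
teacher = F.softmax(torch.randn(positions, vocab), dim=-1)
loss = distillation_loss(student, teacher)
loss.backward()
print(float(loss))
```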
arXiv Detail & Related papers (2020-05-27T16:44:01Z) - Semantic Relatedness for Keyword Disambiguation: Exploiting Different
Embeddings [0.0]
We propose an approach to keyword disambiguation which is grounded in the semantic relatedness between words and senses provided by an external inventory (ontology) that is not known at training time.
Experimental results show that this approach achieves results comparable with the state of the art when applied for Word Sense Disambiguation (WSD) without training for a particular domain.
arXiv Detail & Related papers (2020-02-25T16:44:50Z)