A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models
- URL: http://arxiv.org/abs/2006.00031v1
- Date: Fri, 29 May 2020 18:43:22 GMT
- Title: A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models
- Authors: Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, and Alexander
Panchenko
- Abstract summary: We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
- Score: 117.96628873753123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lexical substitution in context is an extremely powerful technology that can
be used as a backbone of various NLP applications, such as word sense
induction, lexical relation extraction, data augmentation, etc. In this paper,
we present a large-scale comparative study of popular neural language and
masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, XLNet,
applied to the task of lexical substitution. We show that the already
competitive results achieved by SOTA LMs/MLMs can be further improved if
information about the target word is injected properly, and we compare several
target injection methods. In addition, we provide an analysis of the types of
semantic relations between the target and the substitutes generated by
different models, offering insight into what kinds of words are actually
produced by the models or given by annotators as substitutes.
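As a rough illustration of the setup the paper studies, the sketch below generates substitutes for a target word with a Hugging Face masked LM and adds one crude form of target injection by biasing candidates toward the target word's embedding. The model name, the fill-mask procedure, and the cosine-similarity bias are assumptions for illustration, not the specific injection methods compared in the paper.

```python
# Minimal sketch: lexical substitution with a masked LM, plus a crude
# target-injection bias. Illustrative only; not the paper's exact method.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "bert-base-uncased"  # any MLM checkpoint could be swapped in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def substitutes(sentence: str, target: str, top_k: int = 10):
    """Generate substitutes for a single-token target word in context."""
    # Mask the first occurrence of the target and score the vocabulary.
    masked = sentence.replace(target, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]        # (vocab_size,)
        log_probs = torch.log_softmax(logits, dim=-1)

        # Crude target injection (an assumption for illustration): bias the
        # MLM distribution toward vocabulary items whose input embeddings
        # are close to the target word's embedding.
        emb = model.get_input_embeddings().weight           # (vocab, dim)
        target_id = tokenizer.convert_tokens_to_ids(target)
        sim = torch.cosine_similarity(emb, emb[target_id].unsqueeze(0), dim=-1)
        combined = log_probs + sim

    best = torch.topk(combined, top_k + 1).indices.tolist()
    candidates = [tokenizer.convert_ids_to_tokens(i) for i in best]
    return [c for c in candidates if c != target][:top_k]

print(substitutes("The bright girl solved the puzzle quickly.", "bright"))
```

A full setup along the lines of the paper would also handle multi-token targets, lemmatize candidates, and combine target and context signals more carefully; this sketch only conveys the overall shape of the pipeline.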
Related papers
- Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic
Lexical Resources [11.257738983764499]
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.
We enhance "modern" supervised WSD models by exploiting two popular Semantic Lexical Resources (SLRs): WordNet and WordNet Domains.
We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks.
arXiv Detail & Related papers (2024-02-20T13:47:51Z) - Towards Effective Disambiguation for Machine Translation with Large
Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z) - Beyond Contrastive Learning: A Variational Generative Model for
Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - Synonym Detection Using Syntactic Dependency And Neural Embeddings [3.0770051635103974]
We study the role of syntactic dependencies in deriving distributional semantics using the Vector Space Model.
We study the effectiveness of injecting human-compiled semantic knowledge into neural embeddings on computing distributional similarity.
Our results show that syntactically conditioned contexts capture lexical semantics better than unconditioned ones.
arXiv Detail & Related papers (2022-09-30T03:16:41Z) - Always Keep your Target in Mind: Studying Semantics and Improving
Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z) - Multilingual Extraction and Categorization of Lexical Collocations with
Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z) - Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been used to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z) - Reranking Machine Translation Hypotheses with Structured and Web-based
Language Models [11.363601836199331]
Two structured language models are applied for N-best rescoring.
We find that the combination of these language models increases the BLEU score by up to 1.6% absolute on blind test sets (a minimal rescoring sketch is given after this list).
arXiv Detail & Related papers (2021-04-25T22:09:03Z) - Adversarial Subword Regularization for Robust Neural Machine Translation [23.968624881678913]
Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation.
We present adversarial subword regularization (ADVSR) to study whether gradient signals during training can be a substitute criterion for exposing diverse subword segmentations.
arXiv Detail & Related papers (2020-04-29T12:06:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.