A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models
- URL: http://arxiv.org/abs/2006.00031v1
- Date: Fri, 29 May 2020 18:43:22 GMT
- Title: A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models
- Authors: Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, and Alexander
Panchenko
- Abstract summary: We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
- Score: 117.96628873753123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lexical substitution in context is an extremely powerful technology that can
be used as a backbone of various NLP applications, such as word sense
induction, lexical relation extraction, data augmentation, etc. In this paper,
we present a large-scale comparative study of popular neural language and
masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, XLNet,
applied to the task of lexical substitution. We show that the already
competitive results achieved by SOTA LMs/MLMs can be further improved if
information about the target word is injected properly, and we compare several
target injection methods. In addition, we provide an analysis of the types of
semantic relations between the target and the substitutes generated by
different models, offering insight into what kinds of words are actually
produced by the models or given by annotators as substitutes.
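As a rough illustration of the setup the paper studies, the sketch below generates substitutes for a target word with a Hugging Face masked LM and adds one crude form of target injection by biasing candidates toward the target word's embedding. The model name, the fill-mask procedure, and the cosine-similarity bias are assumptions for illustration, not the specific injection methods compared in the paper.

```python
# Minimal sketch: lexical substitution with a masked LM, plus a crude
# target-injection bias. Illustrative only; not the paper's exact method.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "bert-base-uncased"  # any MLM checkpoint could be swapped in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def substitutes(sentence: str, target: str, top_k: int = 10):
    """Generate substitutes for a single-token target word in context."""
    # Mask the first occurrence of the target and score the vocabulary.
    masked = sentence.replace(target, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]        # (vocab_size,)
        log_probs = torch.log_softmax(logits, dim=-1)

        # Crude target injection (an assumption for illustration): bias the
        # MLM distribution toward vocabulary items whose input embeddings
        # are close to the target word's embedding.
        emb = model.get_input_embeddings().weight           # (vocab, dim)
        target_id = tokenizer.convert_tokens_to_ids(target)
        sim = torch.cosine_similarity(emb, emb[target_id].unsqueeze(0), dim=-1)
        combined = log_probs + sim

    best = torch.topk(combined, top_k + 1).indices.tolist()
    candidates = [tokenizer.convert_ids_to_tokens(i) for i in best]
    return [c for c in candidates if c != target][:top_k]

print(substitutes("The bright girl solved the puzzle quickly.", "bright"))
```

A full setup along the lines of the paper would also handle multi-token targets, lemmatize candidates, and combine target and context signals more carefully; this sketch only conveys the overall shape of the pipeline.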
Related papers
- Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic
Lexical Resources [11.257738983764499]
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.
We enhance "modern" supervised WSD models by exploiting two popular Semantic Lexical Resources (SLRs): WordNet and WordNet Domains.
We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks.
arXiv Detail & Related papers (2024-02-20T13:47:51Z) - Towards Effective Disambiguation for Machine Translation with Large
Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z) - Beyond Contrastive Learning: A Variational Generative Model for
Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - Synonym Detection Using Syntactic Dependency And Neural Embeddings [3.0770051635103974]
We study the role of syntactic dependencies in deriving distributional semantics using the Vector Space Model.
We study the effectiveness of injecting human-compiled semantic knowledge into neural embeddings on computing distributional similarity.
Our results show that syntactically conditioned contexts capture lexical semantics better than unconditioned ones.
arXiv Detail & Related papers (2022-09-30T03:16:41Z) - Always Keep your Target in Mind: Studying Semantics and Improving
Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z) - Multilingual Extraction and Categorization of Lexical Collocations with
Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z) - Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have long been used to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
arXiv Detail & Related papers (2022-03-21T01:16:44Z) - Reranking Machine Translation Hypotheses with Structured and Web-based
Language Models [11.363601836199331]
Two structured language models are applied for N-best rescoring.
We find that the combination of these language models increases the BLEU score by up to 1.6% absolute on blind test sets (a minimal rescoring sketch is given after this list).
arXiv Detail & Related papers (2021-04-25T22:09:03Z) - Adversarial Subword Regularization for Robust Neural Machine Translation [23.968624881678913]
Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation.
We present adversarial subword regularization (ADVSR) to study whether gradient signals during training can be a substitute criterion for exposing diverse subword segmentations.
arXiv Detail & Related papers (2020-04-29T12:06:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.