LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution
- URL: http://arxiv.org/abs/2107.05132v1
- Date: Sun, 11 Jul 2021 21:25:56 GMT
- Title: LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution
- Authors: George Michalopoulos, Ian McKillop, Alexander Wong, Helen Chen
- Abstract summary: We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
- Score: 76.615287796753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lexical substitution is the task of generating meaningful substitutes for a
word in a given textual context. Contextual word embedding models have achieved
state-of-the-art results in the lexical substitution task by relying on
contextual information extracted from the replaced word within the sentence.
However, such models do not take into account structured knowledge that exists
in external lexical databases.
We introduce LexSubCon, an end-to-end lexical substitution framework based on
contextual embedding models that can identify highly accurate substitute
candidates. This is achieved by combining contextual information with knowledge
from structured lexical resources. Our approach involves: (i) introducing a
novel mix-up embedding strategy in the creation of the input embedding of the
target word through linearly interpolating the pair of the target input
embedding and the average embedding of its probable synonyms; (ii) considering
the similarity of the sentence-definition embeddings of the target word and its
proposed candidates; and, (iii) calculating the effect of each substitution in
the semantics of the sentence through a fine-tuned sentence similarity model.
Our experiments show that LexSubCon outperforms previous state-of-the-art
methods on LS07 and CoInCo benchmark datasets that are widely used for lexical
substitution tasks.
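The three signals described in the abstract can be sketched in a few lines of plain Python. This is an illustrative reconstruction, not the authors' code: the parameter `alpha` and the equal default `weights` are assumptions for the sketch, not values from the paper.

```python
import math

def mixup_target_embedding(target_emb, synonym_embs, alpha=0.5):
    # (i) Mix-up strategy: linearly interpolate the target word's input
    # embedding with the mean embedding of its probable synonyms.
    # `alpha` is an illustrative interpolation weight, not from the paper.
    dim = len(target_emb)
    mean = [sum(e[i] for e in synonym_embs) / len(synonym_embs) for i in range(dim)]
    return [alpha * t + (1.0 - alpha) * m for t, m in zip(target_emb, mean)]

def cosine(a, b):
    # Cosine similarity, the usual choice for (ii): comparing the
    # sentence-definition embeddings of the target word and a candidate.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def score_candidate(proposal_score, gloss_similarity, sentence_similarity,
                    weights=(1.0, 1.0, 1.0)):
    # Combine signals (i)-(iii) as a weighted sum; equal weights are a
    # placeholder, not the tuned values from the paper.
    w1, w2, w3 = weights
    return (w1 * proposal_score
            + w2 * gloss_similarity
            + w3 * sentence_similarity)
```

For example, mixing a target embedding `[1.0, 0.0]` with synonym embeddings `[[0.0, 1.0], [0.0, 3.0]]` at `alpha=0.5` yields `[0.5, 1.0]`, halfway between the target and the synonym mean.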
Related papers
- Analyzing Semantic Change through Lexical Replacements [2.509907053583601]
We study the effect of unexpected contexts introduced by lexical replacements.
We propose a replacement schema where a target word is substituted with lexical replacements of varying relatedness.
We are the first to evaluate the use of LLaMa for semantic change detection.
arXiv Detail & Related papers (2024-04-29T10:20:41Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further explore the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Unsupervised Lexical Substitution with Decontextualised Embeddings [48.00929769805882]
We propose a new unsupervised method for lexical substitution using pre-trained language models.
Our method retrieves substitutes based on the similarity of contextualised and decontextualised word embeddings.
We conduct experiments in English and Italian, and show that our method substantially outperforms strong baselines.
arXiv Detail & Related papers (2022-09-17T03:51:47Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be substantially improved further if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- Enhanced word embeddings using multi-semantic representation through lexical chains [1.8199326045904998]
We propose two novel algorithms, called Flexible Lexical Chain II and Fixed Lexical Chain II.
These algorithms combine the semantic relations derived from lexical chains, prior knowledge from lexical databases, and the robustness of the distributional hypothesis in word embeddings as building blocks forming a single system.
Our results show that the integration of lexical chains and word embedding representations sustains state-of-the-art results, even against more complex systems.
arXiv Detail & Related papers (2021-01-22T09:43:33Z)
- A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings [42.87769996249732]
We propose a novel generative model to explore both local and global context for joint learning topics and topic-specific word embeddings.
The trained model maps words to topic-dependent embeddings, which naturally addresses the issue of word polysemy.
arXiv Detail & Related papers (2020-08-11T13:54:11Z)
- A Comparative Study of Lexical Substitution Approaches based on Neural Language Models [117.96628873753123]
We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
arXiv Detail & Related papers (2020-05-29T18:43:22Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.