MWE as WSD: Solving Multiword Expression Identification with Word Sense Disambiguation
- URL: http://arxiv.org/abs/2303.06623v2
- Date: Thu, 19 Oct 2023 03:31:53 GMT
- Title: MWE as WSD: Solving Multiword Expression Identification with Word Sense Disambiguation
- Authors: Joshua Tanner and Jacob Hoffman
- Abstract summary: Recent approaches to word sense disambiguation (WSD) utilize encodings of the sense gloss (definition) to improve performance.
In this work we demonstrate that this approach can be adapted for use in multiword expression (MWE) identification by training models which use gloss and context information.
Our approach substantially improves precision, outperforming the state-of-the-art in MWE identification on the DiMSUM dataset by up to 1.9 F1 points and achieving competitive results on the PARSEME 1.1 English dataset.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent approaches to word sense disambiguation (WSD) utilize encodings of the
sense gloss (definition), in addition to the input context, to improve
performance. In this work we demonstrate that this approach can be adapted for
use in multiword expression (MWE) identification by training models which use
gloss and context information to filter MWE candidates produced by a rule-based
extraction pipeline. Our approach substantially improves precision,
outperforming the state-of-the-art in MWE identification on the DiMSUM dataset
by up to 1.9 F1 points and achieving competitive results on the PARSEME 1.1
English dataset. Our models also retain most of their WSD performance, showing
that a single model can be used for both tasks. Finally, building on similar
approaches using Bi-encoders for WSD, we introduce a novel Poly-encoder
architecture which improves MWE identification performance.
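The filtering step described above can be sketched in miniature: encode the context around a rule-extracted MWE candidate and each candidate sense gloss, score them against one another, and keep the candidate only if an idiomatic gloss matches well. This is an illustrative toy, not the paper's model: the bag-of-words `encode`, the example glosses, and the `threshold` are all invented stand-ins for the trained Transformer encoders the authors actually use.

```python
import math
from collections import Counter

def encode(text: str) -> Counter:
    """Toy stand-in for a neural gloss/context encoder: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_gloss(context: str, glosses: list[str]) -> tuple[str, float]:
    """Score each candidate gloss against the context; return the best match."""
    scored = [(g, cosine(encode(context), encode(g))) for g in glosses]
    return max(scored, key=lambda pair: pair[1])

def keep_mwe(context: str, mwe_glosses: list[str], threshold: float = 0.1) -> bool:
    """Filter rule: keep a rule-extracted MWE candidate only if some MWE
    gloss matches the context above a threshold (threshold is illustrative)."""
    _, score = best_gloss(context, mwe_glosses)
    return score >= threshold
```

In the paper's setting the encoder is trained so that the correct gloss scores highest, which is what lets one model serve both WSD and MWE identification; the bi-encoder computes the two encodings independently, while the Poly-encoder variant lets the context attend over multiple gloss representations before scoring.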
Related papers
- Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources [11.257738983764499]
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.
We enhance "modern" supervised WSD models by exploiting two popular semantic lexical resources (SLRs): WordNet and WordNet Domains.
We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks.
arXiv Detail & Related papers (2024-02-20T13:47:51Z)
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as input one-hot encoded ID features of the tabular modality.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes as input sentences of the textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Co-Driven Recognition of Semantic Consistency via the Fusion of Transformer and HowNet Sememes Knowledge [6.184249194474601]
This paper proposes a co-driven semantic consistency recognition method based on the fusion of Transformer and HowNet sememes knowledge.
A BiLSTM is exploited to encode the conceptual semantic information and infer semantic consistency.
arXiv Detail & Related papers (2023-02-21T09:53:19Z)
- Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC [0.9543943371833467]
This paper describes the dma submission to the TempoWiC task, which achieves a macro-F1 score of 77.05%.
For further improvement, we integrate POS information and word semantic representation using a Mixture-of-Experts (MoE) approach.
arXiv Detail & Related papers (2022-11-07T11:28:34Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge [60.616313552585645]
We present models for effective Ambiguity Detection and Coreference Resolution in Conversational AI.
Specifically, we use TOD-BERT and LXMERT based models, compare them to a number of baselines and provide ablation experiments.
Our results show that (1) language models are able to exploit correlations in the data to detect ambiguity; and (2) unimodal coreference resolution models can avoid the need for a vision component.
arXiv Detail & Related papers (2022-02-25T12:10:02Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Referring Image Segmentation via Cross-Modal Progressive Comprehension [94.70482302324704]
Referring image segmentation aims to segment the foreground masks of the entities that match the description given in a natural language expression.
Previous approaches tackle this problem using implicit feature interaction and fusion between visual and linguistic modalities.
We propose a Cross-Modal Progressive Comprehension (CMPC) module and a Text-Guided Feature Exchange (TGFE) module to effectively address this challenging task.
arXiv Detail & Related papers (2020-10-01T16:02:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed above and is not responsible for any consequences of its use.