A Comparative Study of Transformers on Word Sense Disambiguation
- URL: http://arxiv.org/abs/2111.15417v1
- Date: Tue, 30 Nov 2021 14:10:22 GMT
- Title: A Comparative Study of Transformers on Word Sense Disambiguation
- Authors: Avi Chawla and Nidhi Mulay and Vikas Bishnoi and Gaurav Dhama and Dr. Anil Kumar Singh
- Abstract summary: This paper presents a comparative study on the contextualization power of neural network-based embedding systems.
We evaluate their contextualization power using two lexical sample Word Sense Disambiguation (WSD) tasks, SensEval-2 and SensEval-3.
Experimental results show that the proposed techniques achieve results superior to the current state-of-the-art on both WSD tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years of research in Natural Language Processing (NLP) have witnessed
dramatic growth in training large models for generating context-aware language
representations. In this regard, numerous NLP systems have leveraged the power
of neural network-based architectures to incorporate sense information in
embeddings, resulting in Contextualized Word Embeddings (CWEs). Despite this
progress, the NLP community has not witnessed any significant work performing a
comparative study on the contextualization power of such architectures. This
paper presents a comparative study and an extensive analysis of nine widely
adopted Transformer models. These models are BERT, CTRL, DistilBERT,
OpenAI-GPT, OpenAI-GPT2, Transformer-XL, XLNet, ELECTRA, and ALBERT. We
evaluate their contextualization power using two lexical sample Word Sense
Disambiguation (WSD) tasks, SensEval-2 and SensEval-3. We adopt a simple yet
effective approach to WSD that uses a k-Nearest Neighbor (kNN) classification
on CWEs. Experimental results show that the proposed techniques also achieve
superior results over the current state-of-the-art on both the WSD tasks.
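The abstract describes the approach only at a high level, so the following is a minimal sketch of kNN classification over CWEs; the encoder choice (BERT-base), the use of last-layer hidden states, and k=1 are illustrative assumptions not specified above.

```python
# Minimal sketch of kNN-based WSD over contextualized word embeddings (CWEs).
# Assumptions (not specified in the abstract): BERT-base as the encoder,
# last-layer hidden states as embeddings, and k=1 neighbours.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.neighbors import KNeighborsClassifier

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Return the contextual embedding of the target word in the sentence."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    # Locate the target's subword positions and average them.
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError(f"target {target!r} not found in sentence")

# Sense-annotated training occurrences of the ambiguous word "bank".
train = [
    ("He deposited cash at the bank.", "bank%finance"),
    ("The bank approved her loan.", "bank%finance"),
    ("They picnicked on the river bank.", "bank%river"),
]
X = torch.stack([target_embedding(s, "bank") for s, _ in train]).numpy()
y = [sense for _, sense in train]

knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
test_vec = target_embedding("Fish swam near the muddy bank.", "bank").numpy()
print(knn.predict([test_vec])[0])  # expected: bank%river
```

Swapping the encoder for any of the nine Transformers named above only changes the model name passed to `from_pretrained`; the kNN step is model-agnostic.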
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI make it possible to mitigate the limitations of black-box models by leveraging improved explanation methods for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic
Lexical Resources [11.257738983764499]
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks.
We enhance "modern" supervised WSD models by exploiting two popular Semantic Lexical Resources (SLRs): WordNet and WordNet Domains.
We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks.
arXiv Detail & Related papers (2024-02-20T13:47:51Z) - A Cohesive Distillation Architecture for Neural Language Models [0.0]
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size.
This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models.
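For readers unfamiliar with KD, here is a minimal, generic sketch of the standard distillation loss (soft teacher targets mixed with hard labels); it illustrates the technique in general, not this paper's specific architecture, and the temperature and mixing weight are assumed values.

```python
# Generic knowledge-distillation loss (Hinton et al., 2015), shown for
# illustration; NOT the specific distillation architecture of this paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 2.0,            # assumed temperature
                      alpha: float = 0.5) -> torch.Tensor:  # assumed mixing weight
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example: a batch of 4 examples over a 10-class output space.
s = torch.randn(4, 10)
t = torch.randn(4, 10)
y = torch.randint(0, 10, (4,))
print(distillation_loss(s, t, y))
```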
arXiv Detail & Related papers (2023-01-12T08:01:53Z) - INTERACTION: A Generative XAI Framework for Natural Language Inference
Explanations [58.062003028768636]
Current XAI approaches focus only on delivering a single explanation.
This paper proposes a generative XAI framework, INTERACTION (explaIn aNd predicT thEn queRy with contextuAl CondiTional varIational autO-eNcoder).
Our novel framework presents explanations in two steps: (step one) Explanation and Label Prediction; and (step two) Diverse Evidence Generation.
arXiv Detail & Related papers (2022-09-02T13:52:39Z) - Exploring Dimensionality Reduction Techniques in Multilingual
Transformers [64.78260098263489]
This paper gives a comprehensive account of the impact of dimensionality reduction techniques on the performance of state-of-the-art multilingual Siamese Transformers.
It shows that it is possible to achieve an average reduction in the number of dimensions of $91.58\% \pm 2.59\%$ and $54.65\% \pm 32.20\%$, respectively.
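A minimal sketch of one such reduction technique (PCA) applied to multilingual sentence embeddings follows; the model name, the choice of PCA, and the target dimensionality are illustrative assumptions, since the paper compares several techniques.

```python
# Minimal sketch: PCA-based dimensionality reduction of multilingual
# sentence embeddings. Model name, PCA, and target dimensionality are
# assumptions; the paper evaluates several reduction techniques.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
sentences = [
    "The cat sits on the mat.",
    "Le chat est assis sur le tapis.",   # French paraphrase
    "Stocks fell sharply on Monday.",    # unrelated sentence
]
emb = model.encode(sentences)            # shape: (3, 384)

pca = PCA(n_components=2).fit(emb)       # toy target dimensionality
reduced = pca.transform(emb)

# Cross-lingual similarity should survive the reduction far better
# than similarity to the unrelated sentence.
print(cosine_similarity(reduced[:1], reduced[1:]))
```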
arXiv Detail & Related papers (2022-04-18T17:20:55Z) - Obtaining Better Static Word Embeddings Using Contextual Embedding
Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
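To illustrate the general idea of deriving static vectors from a contextual model, here is a sketch of the simple averaging baseline; note that this is explicitly not the paper's CBOW-style distillation objective, and the encoder choice is an assumption.

```python
# Averaging baseline for static-from-contextual embeddings: pool a word's
# contextual vectors over many sentences. NOT the paper's CBOW-based
# distillation method; shown only to illustrate the general idea.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def static_vector(word: str, sentences: list[str]) -> torch.Tensor:
    """Average the word's contextual embeddings across all given sentences."""
    vecs = []
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    for s in sentences:
        enc = tokenizer(s, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        ids = enc["input_ids"][0].tolist()
        for i in range(len(ids) - len(word_ids) + 1):
            if ids[i:i + len(word_ids)] == word_ids:
                vecs.append(hidden[i:i + len(word_ids)].mean(dim=0))
    return torch.stack(vecs).mean(dim=0)

v = static_vector("bank", ["She went to the bank.", "The bank was closed."])
print(v.shape)  # torch.Size([768])
```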
arXiv Detail & Related papers (2021-06-08T12:59:32Z) - Training Bi-Encoders for Word Sense Disambiguation [4.149972584899897]
State-of-the-art approaches in Word Sense Disambiguation combine lexical information with embeddings from pre-trained language models to achieve results comparable to human inter-annotator agreement on standard evaluation benchmarks.
We further the state of the art in Word Sense Disambiguation through our multi-stage pre-training and fine-tuning pipeline.
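A minimal sketch of the bi-encoder setup the title refers to: one encoder embeds the target word in context, another embeds each candidate sense gloss, and senses are ranked by dot product. The encoder choice and pooling are assumptions, and the paper's multi-stage pre-training and fine-tuning pipeline is not reproduced here.

```python
# Bi-encoder WSD scoring sketch: context encoder vs. gloss encoder,
# senses ranked by dot product. Encoders and pooling are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
context_encoder = AutoModel.from_pretrained("bert-base-uncased").eval()
gloss_encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

def encode_gloss(gloss: str) -> torch.Tensor:
    enc = tokenizer(gloss, return_tensors="pt")
    with torch.no_grad():
        return gloss_encoder(**enc).last_hidden_state[0, 0]  # [CLS] token

def encode_target(sentence: str, target: str) -> torch.Tensor:
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = context_encoder(**enc).last_hidden_state[0]
    ids = enc["input_ids"][0].tolist()
    t_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    for i in range(len(ids) - len(t_ids) + 1):
        if ids[i:i + len(t_ids)] == t_ids:
            return hidden[i:i + len(t_ids)].mean(dim=0)
    raise ValueError("target not found")

glosses = {
    "bank%finance": "a financial institution that accepts deposits",
    "bank%river": "sloping land beside a body of water",
}
ctx = encode_target("They fished from the grassy bank.", "bank")
scores = {sense: float(ctx @ encode_gloss(g)) for sense, g in glosses.items()}
print(max(scores, key=scores.get))  # expected: bank%river
```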
arXiv Detail & Related papers (2021-05-21T06:06:03Z) - The NLP Cookbook: Modern Recipes for Transformer based Deep Learning
Architectures [0.0]
Natural Language Processing models have achieved phenomenal success in linguistic and semantic tasks.
Recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes.
Knowledge retrievers have been built to extract explicit data documents from a large corpus of databases with greater efficiency and accuracy.
arXiv Detail & Related papers (2021-03-23T22:38:20Z) - Enriching Non-Autoregressive Transformer with Syntactic and
SemanticStructures for Neural Machine Translation [54.864148836486166]
We propose to incorporate the explicit syntactic and semantic structures of languages into a non-autoregressive Transformer.
Our model achieves significantly faster inference while maintaining translation quality, compared with several state-of-the-art non-autoregressive models.
arXiv Detail & Related papers (2021-01-22T04:12:17Z) - A Comparative Study of Lexical Substitution Approaches based on Neural
Language Models [117.96628873753123]
We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
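A minimal sketch of the basic MLM-based substitution setup follows. The "target and [MASK]" pattern shown is one assumed form of target-word injection; the paper compares several injection strategies, and this sketch does not reproduce any of them exactly.

```python
# MLM-based lexical substitution sketch. The "target and [MASK]" pattern is
# an assumed injection variant, not necessarily one the paper evaluates.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def substitutes(sentence: str, target: str, top_k: int = 5) -> list[str]:
    # Keep the original word in view next to the mask so predictions
    # stay semantically close to it.
    masked = sentence.replace(target, f"{target} and {tokenizer.mask_token}", 1)
    enc = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    top = logits[0, pos].topk(top_k).indices
    return [tokenizer.convert_ids_to_tokens(int(i)) for i in top]

print(substitutes("The movie was terrific.", "terrific"))
```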
arXiv Detail & Related papers (2020-05-29T18:43:22Z) - An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named
Entity Recognition [5.161531917413708]
We propose a transformer-based network with a conditional random field layer that achieves state-of-the-art results.
Our study contributes to the literature that quantifies the impact of transfer learning on processing morphologically rich languages.
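A minimal sketch of the transformer-plus-CRF tagging architecture the summary describes: encoder hidden states feed a linear layer whose emission scores are decoded by a CRF. The encoder name and tag-set size are assumptions, and the CRF layer comes from the third-party `pytorch-crf` package rather than anything specific to this paper.

```python
# Transformer encoder + CRF sequence tagger sketch. Encoder name and number
# of tags are assumptions; requires the `pytorch-crf` package.
import torch
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF

class TransformerCRFTagger(nn.Module):
    def __init__(self, encoder_name: str, num_tags: int):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.emissions = nn.Linear(self.encoder.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        scores = self.emissions(hidden)      # per-token emission scores
        mask = attention_mask.bool()
        if tags is not None:                 # training: negative log-likelihood
            return -self.crf(scores, tags, mask=mask)
        return self.crf.decode(scores, mask=mask)  # inference: best tag paths

# Example instantiation with a multilingual encoder (an assumption; the
# paper evaluates Turkish-specific setups) and a 9-tag BIO scheme.
tagger = TransformerCRFTagger("bert-base-multilingual-cased", num_tags=9)
```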
arXiv Detail & Related papers (2020-05-14T06:54:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.