ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD
- URL: http://arxiv.org/abs/2205.09685v1
- Date: Thu, 19 May 2022 16:47:18 GMT
- Title: ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD
- Authors: Moustafa Al-Hajj, Mustafa Jarrar
- Abstract summary: This paper presents our work to fine-tune BERT models for Arabic Word Sense Disambiguation (WSD).
We constructed a dataset of labeled Arabic context-gloss pairs.
Each pair was labeled as True or False and target words in each context were identified and annotated.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Using pre-trained transformer models such as BERT has proven to be effective
in many NLP tasks. This paper presents our work to fine-tune BERT models for
Arabic Word Sense Disambiguation (WSD). We treated the WSD task as a
sentence-pair binary classification task. First, we constructed a dataset of
labeled Arabic context-gloss pairs (~167k pairs) we extracted from the Arabic
Ontology and the large lexicographic database available at Birzeit University.
Each pair was labeled as True or False and target words in each context were
identified and annotated. Second, we used this dataset for fine-tuning three
pre-trained Arabic BERT models. Third, we experimented with the use of different
supervision signals to emphasize target words in context. Our experiments
achieved promising results (84% accuracy), even though a large set of senses was
used in the experiments.
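As a rough illustration of the sentence-pair setup described above, the sketch below fine-tunes a BERT model on context-gloss pairs, with the target word wrapped in marker tokens as one way of emphasizing it. The model name, the [TGT] marker, the toy Arabic examples, and the hyperparameters are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of context-gloss pair fine-tuning for WSD as sentence-pair
# binary classification. Model name, marker token, data, and hyperparameters
# are illustrative assumptions, not the authors' exact setup.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "aubmindlab/bert-base-arabertv02"  # assumed Arabic BERT; any BERT works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# One way to emphasize the target word: surround it with a marker token.
tokenizer.add_special_tokens({"additional_special_tokens": ["[TGT]"]})

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.resize_token_embeddings(len(tokenizer))

class ContextGlossDataset(Dataset):
    """Each example: a context with the target word marked, a candidate gloss,
    and a binary label (1 = the gloss matches the target word's sense)."""
    def __init__(self, examples):
        self.examples = examples
    def __len__(self):
        return len(self.examples)
    def __getitem__(self, i):
        context, gloss, label = self.examples[i]
        enc = tokenizer(context, gloss, truncation=True, padding="max_length",
                        max_length=128, return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(label)
        return item

# Toy data (hypothetical): the ambiguous word is wrapped in [TGT] markers.
train = ContextGlossDataset([
    ("جلسوا قرب [TGT] عين [TGT] الماء", "ينبوع يتدفق منه الماء", 1),
    ("جلسوا قرب [TGT] عين [TGT] الماء", "عضو الإبصار في الجسم", 0),
])

loader = DataLoader(train, batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for batch in loader:
    out = model(**batch)          # cross-entropy over the True/False labels
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

At inference time, a target word would be paired with each candidate gloss and the gloss scored True with the highest confidence would be taken as its sense.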
Related papers
- MemeMind at ArAIEval Shared Task: Spotting Persuasive Spans in Arabic Text with Persuasion Techniques Identification [0.10120650818458249]
This paper focuses on detecting propagandistic spans and persuasion techniques in Arabic text from tweets and news paragraphs.
Our approach achieved an F1 score of 0.2774, securing 3rd place on the Task 1 leaderboard.
arXiv Detail & Related papers (2024-08-08T15:49:01Z) - A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks [7.72751543977484]
This work investigates the effectiveness of BERT-based contextual embeddings in active learning (AL) tasks on cold-start scenarios.
Our primary contribution is the proposal of a more robust fine-tuning pipeline - DoTCAL.
Our evaluation contrasts BERT-based embeddings with other prevalent text representation paradigms, including Bag of Words (BoW), Latent Semantic Indexing (LSI) and FastText.
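For a concrete sense of that comparison, here is a hedged sketch that embeds the same texts with Bag of Words and with BERT's [CLS] vector, then picks a cold-start seed set by clustering each space. The model name and the clustering-based selection heuristic are assumptions, not DoTCAL's actual pipeline.

```python
# Hedged sketch: comparing text representations for cold-start seed selection.
# The clustering-based selection is a common cold-start heuristic, used here
# only to contrast BoW with BERT embeddings; it is not the paper's pipeline.
import numpy as np
import torch
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from transformers import AutoModel, AutoTokenizer

texts = ["free prize, click now", "meeting moved to 3pm",
         "win cash today", "see you at lunch"]

# Bag-of-Words representation.
bow = CountVectorizer().fit_transform(texts).toarray().astype(float)

# BERT [CLS] embeddings (bert-base-uncased is an illustrative choice).
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()
with torch.no_grad():
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    cls = bert(**enc).last_hidden_state[:, 0, :].numpy()

def select_seed(X, k=2):
    """Pick the example closest to each cluster centre as the initial labelled pool."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    dists = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    return [int(np.where(km.labels_ == c)[0][np.argmin(dists[km.labels_ == c])])
            for c in range(k)]

print("BoW seed:", select_seed(bow))
print("BERT seed:", select_seed(cls))
```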
arXiv Detail & Related papers (2024-07-24T13:50:21Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Arabic aspect based sentiment analysis using BERT [0.0]
This article explores the modeling capabilities of contextual embeddings from pre-trained language models, such as BERT.
We build a simple but effective BERT-based neural baseline to handle this task.
According to the experimental results, our BERT architecture with a simple linear classification layer surpasses previous state-of-the-art work.
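A minimal sketch of the kind of architecture described, a pre-trained BERT with a single linear classification layer over the pooled output of a sentence-aspect pair. The Arabic model name, the label set, and the example sentence are illustrative assumptions.

```python
# Minimal "BERT + linear layer" sketch for aspect-based sentiment: encode the
# sentence and the aspect term as a pair and classify polarity from [CLS].
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "aubmindlab/bert-base-arabertv02"  # assumed; any Arabic BERT works
POLARITIES = ["negative", "neutral", "positive"]

class AspectSentimentClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.bert = AutoModel.from_pretrained(MODEL_NAME)
        self.classifier = nn.Linear(self.bert.config.hidden_size, len(POLARITIES))
    def forward(self, **enc):
        cls = self.bert(**enc).last_hidden_state[:, 0, :]   # [CLS] representation
        return self.classifier(cls)

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AspectSentimentClassifier().eval()

sentence = "الطعام ممتاز لكن الخدمة بطيئة"   # "the food is excellent but the service is slow"
aspect = "الخدمة"                             # aspect term: "the service"
enc = tok(sentence, aspect, return_tensors="pt")
with torch.no_grad():
    pred = model(**enc).argmax(dim=-1).item()
print(POLARITIES[pred])   # untrained here, so the prediction is arbitrary
```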
arXiv Detail & Related papers (2021-07-28T11:34:00Z) - Using BERT Encoding to Tackle the Mad-lib Attack in SMS Spam Detection [0.0]
We investigate whether language models sensitive to the semantics and context of words, such as Google's BERT, may be useful to overcome this adversarial attack.
Using a dataset of 5572 SMS spam messages, we first established a baseline of detection performance.
Then, we built a thesaurus of the vocabulary contained in these messages, and set up a Mad-lib attack experiment.
We found that the classic models achieved a 94% Balanced Accuracy (BA) in the original dataset, whereas the BERT model obtained 96%.
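The substitution step of such a Mad-lib attack can be sketched as follows; WordNet stands in for the paper's message-derived thesaurus, and the substitution probability is an arbitrary choice.

```python
# Hedged sketch of a Mad-lib style attack: replace words in a spam message with
# synonyms drawn from a thesaurus to evade keyword-based filters.
import random
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

def madlib_attack(message, sub_prob=0.5, seed=0):
    """Randomly swap words for WordNet synonyms."""
    rng = random.Random(seed)
    out = []
    for word in message.split():
        synsets = wordnet.synsets(word)
        lemmas = {l.name().replace("_", " ") for s in synsets for l in s.lemmas()}
        lemmas.discard(word)
        if lemmas and rng.random() < sub_prob:
            out.append(rng.choice(sorted(lemmas)))
        else:
            out.append(word)
    return " ".join(out)

original = "win a free prize now call this number to claim your cash reward"
print(madlib_attack(original))
# A context-sensitive encoder such as BERT can then embed the perturbed message;
# the paper's premise is that such models should be less affected by these
# substitutions than keyword-based classifiers.
```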
arXiv Detail & Related papers (2021-07-13T21:17:57Z) - R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
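A simplified sketch of the global/local encoding idea: BERT supplies a global sentence-pair representation while a small CNN over the token states captures local keyword and phrase features. Kernel sizes, pooling, and the English model are assumptions, and the self-supervised relation-of-relation objective is omitted.

```python
# Simplified global/local matcher in the spirit of R^2-Net: [CLS] as the global
# view, max-pooled CNN features over token states as the local view.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class GlobalLocalMatcher(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        h = self.bert.config.hidden_size
        self.convs = nn.ModuleList(
            [nn.Conv1d(h, 128, kernel_size=k, padding=k // 2) for k in (2, 3, 4)]
        )
        self.classifier = nn.Linear(h + 3 * 128, num_labels)
    def forward(self, **enc):
        hidden = self.bert(**enc).last_hidden_state        # (batch, seq, h)
        global_feat = hidden[:, 0, :]                       # [CLS]: global view
        local_in = hidden.transpose(1, 2)                   # (batch, h, seq)
        local_feat = torch.cat(
            [conv(local_in).amax(dim=-1) for conv in self.convs], dim=-1
        )                                                   # max-pooled n-gram features
        return self.classifier(torch.cat([global_feat, local_feat], dim=-1))

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = GlobalLocalMatcher().eval()
enc = tok("A man is playing a guitar.", "Someone plays an instrument.", return_tensors="pt")
with torch.no_grad():
    print(model(**enc).softmax(dim=-1))   # untrained scores over match / no-match
```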
arXiv Detail & Related papers (2020-12-16T13:11:30Z) - Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
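One common form of an explicit sentence-level alignment objective on parallel data is an in-batch contrastive loss over pooled representations, sketched below. AMBER's actual sentence- and word-level objectives differ in detail, and the multilingual model name and temperature are assumptions.

```python
# Illustrative sentence-level alignment on parallel data: pull pooled
# representations of translation pairs together with an in-batch contrastive
# loss. This conveys the general idea of explicit alignment, not AMBER itself.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"   # stand-in multilingual encoder
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def mean_pool(hidden, mask):
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-6)

def sentence_alignment_loss(src_sents, tgt_sents):
    """Each source sentence should be closest to its own translation in the batch."""
    src = tok(src_sents, padding=True, truncation=True, return_tensors="pt")
    tgt = tok(tgt_sents, padding=True, truncation=True, return_tensors="pt")
    src_vec = mean_pool(encoder(**src).last_hidden_state, src["attention_mask"])
    tgt_vec = mean_pool(encoder(**tgt).last_hidden_state, tgt["attention_mask"])
    sim = (F.normalize(src_vec, dim=-1) @ F.normalize(tgt_vec, dim=-1).T) / 0.05
    labels = torch.arange(sim.size(0))
    return F.cross_entropy(sim, labels)

loss = sentence_alignment_loss(
    ["The cat sleeps.", "I like coffee."],
    ["Die Katze schläft.", "Ich mag Kaffee."],
)
loss.backward()   # gradients flow into the encoder as during alignment training
```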
arXiv Detail & Related papers (2020-10-15T18:34:13Z) - Does Chinese BERT Encode Word Structure? [17.836131968160917]
Contextualized representations give significantly improved results for a wide range of NLP tasks.
Much work has been dedicated to analyzing the features captured by representative models such as BERT.
We investigate Chinese BERT using both attention weight distribution statistics and probing tasks, finding that (1) word information is captured by BERT; (2) word-level features are mostly in the middle representation layers; (3) downstream tasks make different use of word features in BERT.
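The attention-weight analysis can be illustrated with a small probe that measures, layer by layer, how much attention stays inside known word spans of a Chinese sentence. The sentence, its hand-specified segmentation, and the metric itself are illustrative choices rather than the paper's exact statistics.

```python
# Probe sketch: per-layer share of attention that stays within word spans,
# using a hand-segmented Chinese sentence.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-chinese"
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_attentions=True).eval()

sentence = "北京大学在中国"                 # characters are tokenized individually
words = [(1, 5), (5, 6), (6, 8)]            # token spans (after [CLS]) for 北京大学 / 在 / 中国

enc = tok(sentence, return_tensors="pt")
with torch.no_grad():
    attentions = model(**enc).attentions     # one (1, heads, seq, seq) tensor per layer

for layer, att in enumerate(attentions, start=1):
    att = att[0].mean(0)                     # average over heads -> (seq, seq)
    within = sum(att[s:e, s:e].sum().item() for s, e in words)
    total = sum(att[s:e, :].sum().item() for s, e in words)
    print(f"layer {layer:2d}: within-word attention share = {within / total:.3f}")
```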
arXiv Detail & Related papers (2020-10-15T12:40:56Z) - Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
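The distillation term can be sketched as a KL divergence between the student's masked-token distribution and a teacher distribution over the same positions; here a frozen copy of BERT stands in for the syntactic language model, purely so the sketch runs.

```python
# Distillation-term sketch: push the student's masked-token distribution toward
# a teacher distribution (in the paper, a syntactic LM's approximate marginal
# over words in context). A frozen BERT is only a stand-in teacher.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "bert-base-uncased"
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
student = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
teacher = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()   # stand-in teacher

text = "the chef who ran to the store [MASK] out of food"
enc = tok(text, return_tensors="pt")
mask_pos = (enc["input_ids"] == tok.mask_token_id).nonzero(as_tuple=True)

student_logits = student(**enc).logits[mask_pos]          # (num_masks, vocab)
with torch.no_grad():
    teacher_probs = teacher(**enc).logits[mask_pos].softmax(dim=-1)

# KL(teacher || student) over the masked positions; in pretraining this term
# would be added to the usual MLM objective.
distill_loss = F.kl_div(
    student_logits.log_softmax(dim=-1), teacher_probs, reduction="batchmean"
)
distill_loss.backward()
print(float(distill_loss))
```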
arXiv Detail & Related papers (2020-05-27T16:44:01Z) - BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
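A hedged sketch of a Siamese (twin) encoder that maps inputs of any granularity to fixed-size vectors and scores them by cosine similarity; mean pooling, the English model, and the example pairs are assumptions and not necessarily BURT's choices.

```python
# Twin-encoder sketch: one shared encoder, fixed-size vectors for words,
# phrases, or sentences, compared by cosine similarity.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME).eval()   # shared twin encoder

def embed(text):
    """Fixed-size representation via mean pooling over the last hidden layer."""
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]
    return hidden.mean(dim=0)

pairs = [
    ("a man is playing guitar", "someone plays an instrument"),   # sentences
    ("big", "large"),                                             # words
]
for a, b in pairs:
    sim = F.cosine_similarity(embed(a), embed(b), dim=0).item()
    print(f"{a!r} vs {b!r}: cosine = {sim:.3f}")
```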
arXiv Detail & Related papers (2020-04-29T04:01:52Z) - Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
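The fusion idea in the entry above can be illustrated with a toy layer in which NMT encoder states additionally attend to frozen BERT features of the same source sentence; the single fusion point, the averaging scheme, and the random NMT states are simplifications, not the paper's full BERT-fused model.

```python
# Toy fusion sketch: a frozen BERT extracts source representations and an NMT
# encoder layer attends to them alongside its own self-attention output.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()   # frozen feature extractor

d_model = 768
nmt_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
bert_attention = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)

source = "the quick brown fox jumps over the lazy dog"
enc = tok(source, return_tensors="pt")
with torch.no_grad():
    bert_states = bert(**enc).last_hidden_state               # (1, seq, 768)

# Toy NMT encoder states for the same source (normally from its own embeddings).
nmt_states = torch.randn(1, bert_states.size(1), d_model)
self_encoded = nmt_layer(nmt_states)                           # standard self-attention path
fused, _ = bert_attention(self_encoded, bert_states, bert_states)  # attend to BERT features
output = 0.5 * (self_encoded + fused)                          # simple average fusion
print(output.shape)                                            # (1, seq, 768)
```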
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.