A Sequence-to-Sequence Approach for Arabic Pronoun Resolution
- URL: http://arxiv.org/abs/2305.11529v1
- Date: Fri, 19 May 2023 08:53:41 GMT
- Title: A Sequence-to-Sequence Approach for Arabic Pronoun Resolution
- Authors: Hanan S. Murayshid, Hafida Benhidour, Said Kerrache
- Abstract summary: This paper proposes a sequence-to-sequence learning approach for Arabic pronoun resolution.
The proposed approach is evaluated on the AnATAr dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a sequence-to-sequence learning approach for Arabic
pronoun resolution, which explores the effectiveness of using advanced natural
language processing (NLP) techniques, specifically Bi-LSTM and the BERT
pre-trained Language Model, in solving the pronoun resolution problem in
Arabic. The proposed approach is evaluated on the AnATAr dataset, and its
performance is compared to several baseline models, including traditional
machine learning models and handcrafted feature-based models. Our results
demonstrate that the proposed model outperforms the baseline models, which
include KNN, logistic regression, and SVM, across all metrics. In addition, we
explore the effectiveness of various modifications to the model, including
concatenating the anaphor text beside the paragraph text as input, adding a
mask to focus on candidate scores, and filtering candidates based on gender and
number agreement with the anaphor. Our results show that these modifications
significantly improve the model's performance, achieving up to 81% MRR and
71% F1 score, while also demonstrating higher precision, recall, and
accuracy. These findings suggest that the proposed model is an effective
approach to Arabic pronoun resolution and highlight the potential benefits of
leveraging advanced neural NLP models.
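To make the described setup concrete, below is a minimal PyTorch sketch of one plausible reading of the architecture: a pretrained BERT encoder over the concatenated anaphor and paragraph text, a Bi-LSTM over its outputs, and per-token candidate scores restricted by a candidate mask (which can also encode the gender/number filter). The encoder checkpoint, tensor shapes, and module sizes are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; the checkpoint name and hyperparameters are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class PronounResolver(nn.Module):
    def __init__(self, encoder_name="aubmindlab/bert-base-arabertv2", hidden=256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # assumed Arabic BERT
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size, hidden,
                              batch_first=True, bidirectional=True)
        self.scorer = nn.Linear(2 * hidden, 1)  # one score per token position

    def forward(self, input_ids, attention_mask, candidate_mask):
        # input_ids encode "[CLS] anaphor [SEP] paragraph [SEP]" (anaphor concatenation).
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        states, _ = self.bilstm(states)
        scores = self.scorer(states).squeeze(-1)  # (batch, seq_len)
        # Suppress positions that are not candidates or that disagree with the
        # anaphor in gender/number (the mask is assumed to encode that filter).
        scores = scores.masked_fill(candidate_mask == 0, float("-inf"))
        return torch.softmax(scores, dim=-1)  # distribution over candidate positions
```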
Related papers
- Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition [4.059708117119894]
This study addresses whether explicit language models, specifically n-gram models, still contribute to the performance of state-of-the-art deep learning architectures in the field of handwriting recognition.
We evaluate two prominent neural network architectures, PyLaia and DAN, with and without the integration of explicit n-gram language models.
The results show that incorporating character or subword n-gram models significantly improves the performance of ATR models on all datasets.
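As a rough illustration of how an explicit n-gram model can complement a neural recognizer, the sketch below rescores an n-best list from the text recognition model with a smoothed character bigram LM. This is a generic rescoring illustration under assumed interfaces, not the integration actually used with PyLaia or DAN.

```python
# Hedged sketch: combine neural hypothesis scores with a character bigram LM by rescoring.
import math
from collections import Counter

def train_char_bigram(corpus):
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        chars = ["<s>"] + list(line)
        unigrams.update(chars)
        bigrams.update(zip(chars, chars[1:]))
    return unigrams, bigrams

def bigram_logprob(text, unigrams, bigrams, vocab_size):
    chars = ["<s>"] + list(text)
    # Add-one smoothing keeps unseen bigrams from zeroing out a hypothesis.
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
               for a, b in zip(chars, chars[1:]))

def rescore(nbest, unigrams, bigrams, vocab_size, lm_weight=0.5):
    # nbest: list of (hypothesis_text, neural_log_prob) pairs from the recognizer.
    return max(nbest, key=lambda h: h[1] + lm_weight *
               bigram_logprob(h[0], unigrams, bigrams, vocab_size))
```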
arXiv Detail & Related papers (2024-04-30T07:37:48Z)
- ArabianGPT: Native Arabic GPT-based Large Language Model [2.8623940003518156]
This paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic.
The AraNizer tokenizer, integral to these models, addresses the unique morphological aspects of Arabic script.
For sentiment analysis, the fine-tuned ArabianGPT-0.1B model achieved a remarkable accuracy of 95%, a substantial increase from the base model's 56%.
arXiv Detail & Related papers (2024-02-23T13:32:47Z)
- Large Language Models as Annotators: Enhancing Generalization of NLP Models at Minimal Cost [6.662800021628275]
We study the use of large language models (LLMs) for annotating inputs and improving the generalization of NLP models.
We propose a sampling strategy based on the difference in prediction scores between the base model and the finetuned NLP model.
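A minimal sketch of the selection idea as summarized above: route to the LLM annotator the unlabeled examples on which the base model and the fine-tuned model disagree most. The disagreement measure and budget below are illustrative assumptions.

```python
# Hedged sketch of disagreement-based sampling for LLM annotation.
import numpy as np

def select_for_llm_annotation(base_scores, finetuned_scores, budget):
    """base_scores / finetuned_scores: (n_examples, n_classes) predicted probabilities."""
    disagreement = np.abs(np.asarray(base_scores) - np.asarray(finetuned_scores)).sum(axis=1)
    return np.argsort(-disagreement)[:budget]  # indices of the most disputed inputs

# selected = select_for_llm_annotation(p_base, p_finetuned, budget=500)
# These examples would then be labeled by the LLM and added to the training set.
```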
arXiv Detail & Related papers (2023-06-27T19:29:55Z)
- Rethinking Masked Language Modeling for Chinese Spelling Correction [70.85829000570203]
We study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.
We find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
We demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens in the input sequence during fine-tuning, is sufficient for learning a much better language model without sacrificing the error model.
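A small, hedged sketch of the masking trick described above, assuming token-level error positions are known during fine-tuning; names and rates are illustrative.

```python
# Randomly replace 20% of the non-error tokens with [MASK] so the model keeps
# learning the language model while the error positions stay intact.
import random

def mask_non_error_tokens(tokens, error_positions, mask_token="[MASK]", rate=0.2):
    masked = list(tokens)
    candidates = [i for i in range(len(tokens)) if i not in error_positions]
    for i in random.sample(candidates, k=int(rate * len(candidates))):
        masked[i] = mask_token
    return masked

# Example: mask_non_error_tokens(list("我今天很高兴"), error_positions={2})
```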
arXiv Detail & Related papers (2023-05-28T13:19:12Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
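The following is a simplified, hedged sketch of the noise-stability idea: perturb an intermediate representation with Gaussian noise and penalize how much the output representation changes. LNSR's actual layerwise formulation is more elaborate; the noise scale and the choice of layer here are assumptions.

```python
# Simplified noise-stability penalty: clean pass vs. pass with noisy input embeddings.
import torch

def noise_stability_loss(model, input_ids, attention_mask, sigma=0.01):
    clean = model(input_ids=input_ids, attention_mask=attention_mask,
                  output_hidden_states=True).hidden_states[-1]
    embeds = model.get_input_embeddings()(input_ids)
    noisy_embeds = embeds + sigma * torch.randn_like(embeds)
    noisy = model(inputs_embeds=noisy_embeds, attention_mask=attention_mask,
                  output_hidden_states=True).hidden_states[-1]
    return ((clean - noisy) ** 2).mean()

# total_loss = task_loss + lambda_reg * noise_stability_loss(model, ids, mask)
```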
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- ANNA: Enhanced Language Representation for Question Answering [5.713808202873983]
We show how these approaches affect performance individually and when they are considered jointly in pre-training models.
We propose an extended pre-training task, and a new neighbor-aware mechanism that attends neighboring tokens more to capture the richness of context for pre-training language modeling.
Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing pre-trained language models such as RoBERTa, ALBERT, ELECTRA, and XLNet.
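The neighbor-aware mechanism is only loosely described here; as a generic illustration of biasing self-attention toward nearby tokens, one can add a distance-based bonus to the attention logits, as in the sketch below. The window size and bias value are arbitrary assumptions, not ANNA's design.

```python
# Generic neighbor-biased attention: boost logits for positions within a small window.
import torch

def neighbor_biased_attention(q, k, v, window=2, bias=1.0):
    # q, k, v: (batch, seq_len, dim)
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d ** 0.5            # (batch, seq, seq)
    idx = torch.arange(q.size(1), device=q.device)
    near = (idx[None, :] - idx[:, None]).abs() <= window   # (seq, seq) neighbor mask
    logits = logits + bias * near.to(logits.dtype)          # favor nearby tokens
    return torch.softmax(logits, dim=-1) @ v
```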
arXiv Detail & Related papers (2022-03-28T05:26:52Z)
- Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
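A hedged sketch of the factorization idea: the prediction side is split into a blank branch and a vocabulary branch, so the vocabulary branch behaves like a standalone language model that can be adapted on text-only data. Layer sizes are assumptions and the joiner with the acoustic encoder is omitted.

```python
# Prediction network with separate blank and vocabulary branches (simplified).
import torch
import torch.nn as nn

class FactorizedPredictor(nn.Module):
    def __init__(self, vocab_size, embed=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.blank_rnn = nn.LSTM(embed, hidden, batch_first=True)  # predicts blank score
        self.vocab_rnn = nn.LSTM(embed, hidden, batch_first=True)  # acts as internal LM
        self.blank_out = nn.Linear(hidden, 1)
        self.vocab_out = nn.Linear(hidden, vocab_size)

    def forward(self, prev_tokens):
        x = self.embed(prev_tokens)
        blank_logit = self.blank_out(self.blank_rnn(x)[0])   # (B, U, 1)
        vocab_logits = self.vocab_out(self.vocab_rnn(x)[0])  # (B, U, V)
        # A joiner would combine these with the acoustic encoder output (omitted).
        return torch.cat([blank_logit, vocab_logits], dim=-1)
```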
arXiv Detail & Related papers (2021-09-27T15:04:00Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
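As a rough sketch of the Dynamic Blocking idea, the function below discourages verbatim copying of the source: when the last generated token also occurs in the source, the source's successor token is blocked for the next decoding step. The paper's actual procedure is richer than this single-step heuristic.

```python
# Simplified surface-form blocking for paraphrase decoding.
import torch

def dynamic_blocking(logits, source_ids, last_token_id, block_value=float("-inf")):
    # logits: (vocab_size,) next-token scores; source_ids: list of source token ids.
    blocked = logits.clone()
    for i, tok in enumerate(source_ids[:-1]):
        if tok == last_token_id:
            blocked[source_ids[i + 1]] = block_value  # block the source successor
    return blocked
```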
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
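A minimal sketch of the kNN-representation idea: cache a representation for every training example and explain a test-time prediction by its nearest training neighbors. Cosine similarity and [CLS]-style embeddings are illustrative choices, not necessarily the paper's.

```python
# Retrieve the k training examples closest to a test representation.
import numpy as np

def knn_explanations(train_reps, test_rep, k=5):
    """train_reps: (n_train, dim); test_rep: (dim,). Returns k nearest indices."""
    train = train_reps / np.linalg.norm(train_reps, axis=1, keepdims=True)
    test = test_rep / np.linalg.norm(test_rep)
    sims = train @ test
    return np.argsort(-sims)[:k]

# neighbors = knn_explanations(train_cls_embeddings, test_cls_embedding)
# Inspecting these examples can surface spurious associations the model relies on.
```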
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
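A hedged sketch of the general selection-and-weighting idea: at each back-translation round, sample synthetic sentence pairs with probability proportional to a per-pair usefulness score. The actual scoring criteria and curriculum schedule in the paper differ from this simplification.

```python
# Score-weighted sampling of back-translated pairs for the next training round.
import numpy as np

def sample_synthetic_pairs(pairs, scores, n, temperature=1.0):
    """pairs: list of (src, tgt); scores: higher means more useful this round."""
    scores = np.asarray(scores, dtype=float) / temperature
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    idx = np.random.choice(len(pairs), size=n, replace=False, p=probs)
    return [pairs[i] for i in idx]
```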
arXiv Detail & Related papers (2020-04-07T19:49:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.