Retrieval and Generative Approaches for a Pregnancy Chatbot in Nepali
with Stemmed and Non-Stemmed Data : A Comparative Study
- URL: http://arxiv.org/abs/2311.06898v1
- Date: Sun, 12 Nov 2023 17:16:46 GMT
- Title: Retrieval and Generative Approaches for a Pregnancy Chatbot in Nepali
with Stemmed and Non-Stemmed Data : A Comparative Study
- Authors: Sujan Poudel, Nabin Ghimire, Bipesh Subedi, Saugat Singh
- Abstract summary: The performance of both stemmed and non-stemmed datasets in the Nepali language has been analyzed for each approach.
BERT-based pre-trained models perform well on non-stemmed data, whereas transformer models trained from scratch perform better on stemmed data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The field of Natural Language Processing, which involves the use of
artificial intelligence to support human languages, has seen tremendous growth
due to its high-quality features. Its applications, such as language
translation, chatbots, virtual assistants, search autocomplete, and autocorrect,
are widely used in various domains including healthcare, advertising, customer
service, and targeted advertising. A health-domain chatbot has been proposed to
provide pregnancy-related information, and this work explores two different
NLP-based approaches for developing it. The first approach is a multiclass
classification-based retrieval approach using the BERT-based models multilingual
BERT and multilingual DistilBERT, while the other approach employs a
transformer-based generative chatbot for pregnancy-related information. The
performance of both stemmed and non-stemmed datasets in the Nepali language has
been analyzed for each approach. The experimental results indicate that
BERT-based pre-trained models perform well on non-stemmed data, whereas
transformer models trained from scratch perform better on stemmed data. Among
the models tested, the DistilBERT model achieved the highest training and
validation accuracy, with a testing accuracy of 0.9165, for the retrieval-based
architecture on the non-stemmed dataset. Similarly, the transformer-based
generative approach achieved 1-gram and 2-gram BLEU scores of 0.3570 and 0.1413,
respectively.
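To make the reported 1-gram and 2-gram BLEU figures easier to interpret, here is a minimal sketch of how such scores can be computed. It is not the authors' evaluation code. It assumes a single reference per response, whitespace tokenization, uniform n-gram weights, and cumulative BLEU (the abstract does not say whether the scores are cumulative or individual). The function names and the English example sentences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(reference, candidate, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(reference, candidate, max_n):
    """Cumulative BLEU up to max_n with uniform weights and a brevity penalty."""
    precisions = [modified_precision(reference, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Candidates shorter than the reference are penalized.
    brevity = 1.0 if c >= r else math.exp(1 - r / c)
    return brevity * math.exp(log_mean)


# Hypothetical example pair (assumption: not taken from the paper's dataset).
reference = "the cat sat on the mat".split()
candidate = "the cat is on the mat".split()
bleu_1 = bleu(reference, candidate, 1)
bleu_2 = bleu(reference, candidate, 2)
print(f"BLEU-1 = {bleu_1:.4f}, BLEU-2 = {bleu_2:.4f}")
```

Because higher-order n-grams match less often, the 2-gram score is typically lower than the 1-gram score, which is the same pattern as the 0.3570 and 0.1413 reported above.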
Related papers
- Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work presents, to our knowledge, the first systematic study of formality detection methods based on statistical, neural, and Transformer-based machine learning methods.
We conducted three types of experiments -- monolingual, multilingual, and cross-lingual.
The study shows that the Char BiLSTM model outperforms Transformer-based ones on the monolingual and multilingual formality classification tasks.
arXiv Detail & Related papers (2022-04-19T16:23:07Z)
- KinyaBERT: a Morphology-aware Kinyarwanda Language Model [1.2183405753834562]
Unsupervised sub-word tokenization methods are sub-optimal at handling morphologically rich languages.
We propose a simple yet effective two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality.
We evaluate our proposed method on the low-resource morphologically rich Kinyarwanda language, naming the proposed model architecture KinyaBERT.
arXiv Detail & Related papers (2022-03-16T08:36:14Z)
- Neural Models for Offensive Language Detection [0.0]
Offensive language detection is an ever-growing natural language processing (NLP) application.
We believe that improving and comparing different machine learning models to fight such harmful content is an important and challenging goal for this thesis.
arXiv Detail & Related papers (2021-05-30T13:02:45Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- ParsBERT: Transformer-based Model for Persian Language Understanding [0.7646713951724012]
This paper proposes a monolingual BERT for the Persian language (ParsBERT).
It shows its state-of-the-art performance compared to other architectures and multilingual models.
ParsBERT obtains higher scores in all datasets, including existing ones as well as composed ones.
arXiv Detail & Related papers (2020-05-26T05:05:32Z)
- Cross-lingual Information Retrieval with BERT [8.052497255948046]
We explore the use of the popular bidirectional language model, BERT, to model and learn the relevance between English queries and foreign-language documents.
A deep relevance matching model based on BERT is introduced and trained by finetuning a pretrained multilingual BERT model with weak supervision.
Experimental results of the retrieval of Lithuanian documents against short English queries show that our model is effective and outperforms the competitive baseline approaches.
arXiv Detail & Related papers (2020-04-24T23:32:13Z)
- What the [MASK]? Making Sense of Language-Specific BERT Models [39.54532211263058]
This paper presents the current state of the art in language-specific BERT models.
Our aim is to provide an overview of the commonalities and differences between language-specific BERT models and mBERT models.
arXiv Detail & Related papers (2020-03-05T20:42:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.