Bengali Intent Classification with Generative Adversarial BERT
- URL: http://arxiv.org/abs/2312.10679v1
- Date: Sun, 17 Dec 2023 10:45:50 GMT
- Title: Bengali Intent Classification with Generative Adversarial BERT
- Authors: Mehedi Hasan, Mohammad Jahid Ibna Basher, and Md. Tanvir Rouf Shawon
- Abstract summary: We introduce BNIntent30, a comprehensive Bengali intent classification dataset containing 30 intent classes.
The dataset is excerpted and translated from the CLINC150 dataset, which contains a diverse range of user intents categorized into 150 classes.
To evaluate the proposed dataset, we propose GAN-BnBERT, a novel approach for Bengali intent classification using Generative Adversarial BERT.
- Score: 0.24578723416255746
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Intent classification is a fundamental task in natural language
understanding, aiming to categorize user queries or sentences into predefined
classes to understand user intent. The most challenging aspect of this
particular task lies in effectively incorporating all possible classes of
intent into a dataset while ensuring adequate linguistic variation. Plenty of
research has been conducted in the related domains in rich-resource languages
like English. In this study, we introduce BNIntent30, a comprehensive Bengali
intent classification dataset containing 30 intent classes. The dataset is
excerpted and translated from the CLINC150 dataset containing a diverse range
of user intents categorized over 150 classes. Furthermore, we propose a novel
approach for Bengali intent classification using Generative Adversarial BERT to
evaluate the proposed dataset, which we call GAN-BnBERT. Our approach leverages
the power of BERT-based contextual embeddings to capture salient linguistic
features and contextual information from the text data, while the generative
adversarial network (GAN) component complements the model's ability to learn
diverse representations of existing intent classes through generative modeling.
Our experimental results demonstrate that the GAN-BnBERT model achieves
superior performance on the newly introduced BNIntent30 dataset, surpassing the
existing Bi-LSTM and the stand-alone BERT-based classification model.
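The abstract describes the GAN-BnBERT architecture only at a high level. The sketch below is a minimal, hypothetical illustration of how a GAN-BERT-style intent classifier can be assembled: a BERT encoder supplies [CLS] embeddings, a generator maps noise to fake embeddings, and a discriminator classifies inputs into the 30 BNIntent30 intents plus one "generated" class. The class names, the Bengali BERT checkpoint, and the loss terms shown are illustrative assumptions, not the authors' released implementation.

```python
# Minimal, hypothetical sketch of a GAN-BERT-style intent classifier for
# BNIntent30. Class names, the checkpoint, and the losses are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

NUM_INTENTS = 30   # BNIntent30 has 30 intent classes
HIDDEN = 768       # hidden size of a base BERT encoder
NOISE_DIM = 100    # dimensionality of the generator's noise input

class Generator(nn.Module):
    """Maps random noise to a fake 'BERT-like' sentence representation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, HIDDEN),
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Classifies a representation into 30 intents plus one 'generated' class."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HIDDEN, HIDDEN), nn.LeakyReLU(0.2), nn.Dropout(0.1),
            nn.Linear(HIDDEN, NUM_INTENTS + 1),  # last logit marks "generated"
        )

    def forward(self, h):
        return self.net(h)

# Any Bengali-capable BERT checkpoint could be plugged in; this name is a placeholder.
CHECKPOINT = "sagorsarker/bangla-bert-base"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
encoder = AutoModel.from_pretrained(CHECKPOINT)
G, D = Generator(), Discriminator()

def adversarial_step(texts, labels):
    """One step: D learns real intents and fake detection; G learns to fool D.
    `labels` is a LongTensor of intent indices in [0, NUM_INTENTS)."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    real_h = encoder(**enc).last_hidden_state[:, 0]          # [CLS] embeddings
    fake_h = G(torch.randn(len(texts), NOISE_DIM))

    fake_class = torch.full((len(texts),), NUM_INTENTS)      # index of the extra class
    d_loss = (nn.functional.cross_entropy(D(real_h), labels)
              + nn.functional.cross_entropy(D(fake_h.detach()), fake_class))

    # The generator is rewarded when its samples are NOT flagged as generated.
    p_generated = torch.softmax(D(fake_h), dim=-1)[:, NUM_INTENTS]
    g_loss = -torch.log(1.0 - p_generated + 1e-8).mean()
    return d_loss, g_loss
```

In GAN-BERT-style training, unlabeled examples and a feature-matching term for the generator are also commonly used; both are omitted here for brevity.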
Related papers
- Universal Cross-Lingual Text Classification [0.3958317527488535]
This research proposes a novel perspective on Universal Cross-Lingual Text Classification.
Our approach involves blending supervised data from different languages during training to create a universal model.
The primary goal is to enhance label and language coverage, aiming for a label set that represents a union of labels from various languages.
arXiv Detail & Related papers (2024-06-16T17:58:29Z)
- Expanding the Vocabulary of BERT for Knowledge Base Construction [6.412048788884728]
"Knowledge Base Construction from Pretrained Language Models" challenge was held at International Semantic Web Conference 2023.
Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion.
We present Vocabulary Expandable BERT for knowledge base construction, which expands the language model's vocabulary while preserving semantic embeddings (a generic sketch of this kind of vocabulary expansion follows this entry).
arXiv Detail & Related papers (2023-10-12T12:52:46Z)
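The vocabulary-expansion idea above can be illustrated with a generic Hugging Face transformers recipe: add the new tokens, resize the embedding matrix, and initialize each new row from the average of the subword embeddings the token previously decomposed into. This is a sketch under assumed token names and checkpoint, not the paper's exact procedure.

```python
# Generic sketch of vocabulary expansion for a BERT-style model; the checkpoint
# and the new tokens are assumptions, and this is not the paper's exact method.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

new_tokens = ["Chattogram", "Rajshahi"]  # hypothetical entity names to add

# Record how each new token splits into existing subword pieces *before* adding it.
pieces = {t: tokenizer.encode(t, add_special_tokens=False) for t in new_tokens}

old_size = len(tokenizer)
tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))

# Initialize each new embedding from the mean of its old subword embeddings,
# so the expanded vocabulary roughly preserves the original semantics.
emb = model.get_input_embeddings().weight
with torch.no_grad():
    for t in new_tokens:
        new_id = tokenizer.convert_tokens_to_ids(t)
        if new_id >= old_size and pieces[t]:
            emb[new_id] = emb[pieces[t]].mean(dim=0)
```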
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias [92.41919689753051]
Large language models (LLMs) have been recently leveraged as training data generators for various natural language processing (NLP) tasks.
We investigate training data generation with diversely attributed prompts, which have the potential to yield diverse and attributed generated data.
We show that attributed prompts outperform simple class-conditional prompts in terms of the resulting model's performance.
arXiv Detail & Related papers (2023-06-28T03:31:31Z)
- Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets [5.3361357265365035]
We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.
We evaluate our model on two multi-turn datasets for which we are the first to conduct joint slot-filling and intent detection.
arXiv Detail & Related papers (2023-05-28T13:59:58Z)
- DeepStruct: Pretraining of Language Models for Structure Prediction [64.84144849119554]
We pretrain language models on a collection of task-agnostic corpora to generate structures from text.
Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks.
We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets.
arXiv Detail & Related papers (2022-05-21T00:58:22Z)
- Deep Learning for Bias Detection: From Inception to Deployment [4.51073220028236]
We propose a deep learning model with a transfer-learning-based language model that learns from manually tagged documents to automatically identify bias in enterprise content.
We first pretrain a deep learning-based language model using Wikipedia, then fine-tune the model with a large unlabelled dataset related to various types of enterprise content.
The trained model is thoroughly evaluated on independent datasets to ensure a general application.
arXiv Detail & Related papers (2021-10-12T13:57:54Z)
- Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation [49.89831914386982]
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text.
Our approach outperforms plain-text pre-training while using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
- ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning.
ALICE learns to first use active learning to select the most informative pairs of label classes to elicit contrastive natural language explanations.
It then extracts knowledge from these explanations using a semantic parser.
arXiv Detail & Related papers (2020-09-22T01:02:07Z)
- Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining): (i) we introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) by distilling LESA-BERT to smaller variants, we aim to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
arXiv Detail & Related papers (2020-06-22T03:39:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.