A Persian Benchmark for Joint Intent Detection and Slot Filling
- URL: http://arxiv.org/abs/2303.00408v1
- Date: Wed, 1 Mar 2023 10:57:21 GMT
- Title: A Persian Benchmark for Joint Intent Detection and Slot Filling
- Authors: Masoud Akbari, Amir Hossein Karimi, Tayyebeh Saeedi, Zeinab Saeidi,
Kiana Ghezelbash, Fatemeh Shamsezat, Mohammad Akbari, Ali Mohades
- Abstract summary: Natural Language Understanding (NLU) is important in today's technology as it enables machines to comprehend and process human language.
This paper highlights the significance of advancing the field of NLU for low-resource languages.
We create a Persian benchmark for joint intent detection and slot filling based on the ATIS dataset.
- Score: 3.633817600744528
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural Language Understanding (NLU) is important in today's technology as it
enables machines to comprehend and process human language, leading to improved
human-computer interactions and advancements in fields such as virtual
assistants, chatbots, and language-based AI systems. This paper highlights the
significance of advancing the field of NLU for low-resource languages. With
intent detection and slot filling being crucial tasks in NLU, the widely used
datasets ATIS and SNIPS have been utilized in the past. However, these datasets
cover only English and do not support other languages. In this
work, we aim to address this gap by creating a Persian benchmark for joint
intent detection and slot filling based on the ATIS dataset. To evaluate the
effectiveness of our benchmark, we employ state-of-the-art methods for intent
detection and slot filling.
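For readers unfamiliar with the task, the sketch below illustrates the joint modelling setup commonly used for intent detection and slot filling: a shared pretrained encoder with a sentence-level intent head and a token-level slot head. This is a minimal illustrative example, not the authors' implementation; the encoder checkpoint, label counts, and head design are assumptions.

```python
# Illustrative sketch of a joint intent-detection / slot-filling model
# (shared encoder, two heads). Not the paper's code; checkpoint name and
# label counts below are placeholders for demonstration only.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class JointIntentSlotModel(nn.Module):
    def __init__(self, encoder_name: str, num_intents: int, num_slot_labels: int):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.intent_head = nn.Linear(hidden, num_intents)    # sentence-level intent
        self.slot_head = nn.Linear(hidden, num_slot_labels)  # token-level slot tags

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state                   # (batch, seq_len, hidden)
        intent_logits = self.intent_head(tokens[:, 0])   # [CLS] summarizes the utterance
        slot_logits = self.slot_head(tokens)             # one prediction per token
        return intent_logits, slot_logits

if __name__ == "__main__":
    # "bert-base-multilingual-cased" is only a placeholder encoder; a
    # Persian-specific model could be substituted. Label counts are arbitrary.
    name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = JointIntentSlotModel(name, num_intents=26, num_slot_labels=120)
    batch = tokenizer(["نمایش پروازهای تهران به شیراز"], return_tensors="pt")
    intent_logits, slot_logits = model(batch["input_ids"], batch["attention_mask"])
    # Joint training typically sums the two cross-entropy losses:
    # loss = ce(intent_logits, y_intent) + ce(slot_logits.flatten(0, 1), y_slots.flatten())
```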
Related papers
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models for dialectic datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- Improving Natural Language Inference in Arabic using Transformer Models and Linguistically Informed Pre-Training [0.34998703934432673]
This paper addresses the classification of Arabic text data in the field of Natural Language Processing (NLP).
To overcome the scarcity of suitable Arabic datasets, we create a dedicated dataset from publicly available resources.
We find that a language-specific model (AraBERT) performs competitively with state-of-the-art multilingual approaches.
arXiv Detail & Related papers (2023-07-27T07:40:11Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Spoken Language Understanding for Conversational AI: Recent Advances and Future Direction [5.829344935864271]
This tutorial will discuss how the joint task is set up and introduce Spoken Language Understanding/Natural Language Understanding (SLU/NLU) with Deep Learning techniques.
We will describe how the machine uses the latest NLP and Deep Learning techniques to address the joint task.
arXiv Detail & Related papers (2022-12-21T02:47:52Z)
- Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models [70.82705830137708]
We introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL).
We utilize semi-supervised language labels, leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data.
DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.
arXiv Detail & Related papers (2022-11-21T18:56:00Z)
- Bridging the Domain Gap for Stance Detection for the Zulu language [6.509758931804479]
Existing AI-based approaches to fighting misinformation in the literature suggest automatic stance detection as an integral first step to success.
We propose a black-box non-intrusive method that utilizes techniques from Domain Adaptation to reduce the domain gap.
This allows us to rapidly achieve results for stance detection in Zulu, the target language of this work, similar to those found for English.
arXiv Detail & Related papers (2022-05-06T11:44:35Z)
- Intent Classification Using Pre-Trained Embeddings For Low Resource Languages [67.40810139354028]
Building Spoken Language Understanding systems that do not rely on language-specific Automatic Speech Recognition is an important yet less explored problem in language processing.
We present a comparative study aimed at employing a pre-trained acoustic model to perform Spoken Language Understanding in low resource scenarios.
We perform experiments across three different languages: English, Sinhala, and Tamil each with different data sizes to simulate high, medium, and low resource scenarios.
arXiv Detail & Related papers (2021-10-18T13:06:59Z)
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource languages to languages with low resources.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
- ParsiNLU: A Suite of Language Understanding Challenges for Persian [23.26176232463948]
This work focuses on the Persian language, one of the widely spoken languages in the world.
There are few NLU datasets available for this rich language.
ParsiNLU is the first benchmark in Persian language that includes a range of high-level tasks.
arXiv Detail & Related papers (2020-12-11T06:31:42Z)