Related papers: NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

URL: http://arxiv.org/abs/2103.08466v1
Date: Thu, 4 Mar 2021 04:59:37 GMT
Title: NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task
Authors: Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash
Abstract summary: This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2). The dataset covers a total of 100 provinces from 21 Arab countries, collected from the Twitter domain. A total of 53 teams from 23 countries registered to participate in the tasks, thus reflecting the interest of the community in this area.
Score: 20.34810224205086
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present the findings and results of the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2). The shared task dataset covers a total of 100 provinces from 21 Arab countries, collected from the Twitter domain. A total of 53 teams from 23 countries registered to participate in the tasks, thus reflecting the interest of the community in this area. We received 16 submissions for Subtask 1.1 from five teams, 27 submissions for Subtask 1.2 from eight teams, 12 submissions for Subtask 2.1 from four teams, and 13 submissions for Subtask 2.2 from four teams.

Related papers

NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task [34.40587614887153]
We present the findings of the sixth Nuanced Arabic Dialect Identification (NADI 2025) Shared Task. It focused on Arabic speech dialect processing across three subtasks: spoken dialect identification, speech recognition, and diacritic restoration. The best-performing systems achieved 79.8% accuracy on Subtask 1, 35.68/12.20 WER/CER (overall average) on Subtask 2, and 55/13 WER/CER on Subtask 3.
arXiv Detail & Related papers (2025-09-02T07:28:51Z)
GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human [71.42669028683741]
We present a shared task on binary machine generated text detection conducted as a part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
Findings of the IWSLT 2024 Evaluation Campaign [102.7608597658451]
The paper reports on the shared tasks organized by the 21st IWSLT Conference. The shared tasks address 7 scientific challenges in spoken language translation.
arXiv Detail & Related papers (2024-11-07T19:11:55Z)
NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task [28.40134178913119]
We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024). NADI 2024 targeted both dialect identification cast as a multi-label task and identification of the Arabic level of dialectness. Winning teams achieved 50.57 F1 on Subtask 1, 0.1403 RMSE on Subtask 2, and 20.44 BLEU on Subtask 3, respectively.
arXiv Detail & Related papers (2024-07-06T01:18:58Z)
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text [41.3267575540348]
We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference co-located with EMNLP 2023. ArAIEval offers two tasks over Arabic text: (i) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (ii) disinformation detection in binary and multiclass setups over tweets. A total of 20 teams participated in the final evaluation phase, with 14 and 16 teams participating in Tasks 1 and 2, respectively.
arXiv Detail & Related papers (2023-11-06T15:21:19Z)
NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task [28.986040897360336]
We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023). NADI 2023 targeted both dialect identification (Subtask 1) and dialect-to-MSA machine translation (Subtask 2 and Subtask 3). We describe the methods employed by the participating teams and briefly offer an outlook for NADI.
arXiv Detail & Related papers (2023-10-24T18:41:24Z)
Findings of the WMT 2022 Shared Task on Translation Suggestion [63.457874930232926]
We report the results of the first edition of the WMT shared task on Translation Suggestion. The task aims to provide alternatives for specific words or phrases given the entire documents generated by machine translation (MT). It consists of two sub-tasks, namely, naive translation suggestion and translation suggestion with hints.
arXiv Detail & Related papers (2022-11-30T03:48:36Z)
NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task [16.688997360734472]
The Third Nuanced Arabic Dialect Identification Shared Task (NADI 2022) targeted both dialect identification (Subtask 1) and dialectal sentiment analysis (Subtask 2) at the country level. The winning team achieved 27.06 F1 on Subtask 1 and 75.16 F1 on Subtask 2.
arXiv Detail & Related papers (2022-10-18T04:31:05Z)
Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks on abusive and threatening language detection for the Urdu language. We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening. For both subtasks, an m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning [47.49596196559958]
This paper introduces the SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM). Given a passage and the corresponding question, a participating system is expected to choose the correct answer from five candidates of abstract concepts. Subtask 1 aims to evaluate how well a system can model concepts that cannot be directly perceived in the physical world. Subtask 2 focuses on models' ability to comprehend nonspecific concepts located high in a hypernym hierarchy. Subtask 3 aims to provide some insights into models' generalizability over the two types of abstractness.
arXiv Detail & Related papers (2021-05-31T11:04:17Z)
Dialect Identification in Nuanced Arabic Tweets Using Farasa Segmentation and AraBERT [0.0]
This paper presents our approach to the EACL WANLP-2021 Shared Task 1: Nuanced Arabic Dialect Identification (NADI). The task is aimed at developing a system that identifies the geographical location (country/province) from which an Arabic tweet, written in Modern Standard Arabic or dialect, originates.
arXiv Detail & Related papers (2021-02-19T05:39:21Z)
NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task [18.23153068720659]
We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). Data for the shared task covers a total of 100 provinces from 21 Arab countries and is collected from the Twitter domain. NADI is the first shared task to target naturally-occurring fine-grained dialectal text at the sub-country level.
arXiv Detail & Related papers (2020-10-21T22:14:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.