Adapting MARBERT for Improved Arabic Dialect Identification: Submission
to the NADI 2021 Shared Task
- URL: http://arxiv.org/abs/2103.01065v1
- Date: Mon, 1 Mar 2021 15:19:56 GMT
- Title: Adapting MARBERT for Improved Arabic Dialect Identification: Submission
to the NADI 2021 Shared Task
- Authors: Badr AlKhamissi, Mohamed Gabr, Muhammad ElNokrashy, Khaled Essam
- Abstract summary: We tackle the Nuanced Arabic Dialect Identification (NADI) shared task.
The tasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels.
Our final model is an ensemble of variants built on top of MARBERT that achieves an F1-score of 34.03% for DA on the country-level development set.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we tackle the Nuanced Arabic Dialect Identification (NADI)
shared task (Abdul-Mageed et al., 2021) and demonstrate state-of-the-art
results on all four of its subtasks. The tasks are to identify the geographic
origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances
at both the country and province levels. Our final model is an ensemble of
variants built on top of MARBERT that achieves an F1-score of 34.03% for DA on
the country-level development set -- an improvement of 7.63% over previous work.
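The approach the abstract describes -- several model variants combined into an ensemble and scored with F1 -- can be sketched as soft voting over per-variant class probabilities followed by macro-F1 scoring. This is a hypothetical illustration, not the authors' code: the probability vectors below are stand-ins for the outputs of the MARBERT variants.

```python
def ensemble_predict(prob_lists):
    """Soft-voting ensemble: average the class-probability vectors from
    several model variants, then take the argmax class per utterance."""
    preds = []
    for per_model in zip(*prob_lists):  # one utterance's probs from each variant
        n_classes = len(per_model[0])
        avg = [sum(p[c] for p in per_model) / len(per_model)
               for c in range(n_classes)]
        preds.append(max(range(n_classes), key=lambda c: avg[c]))
    return preds

def macro_f1(y_true, y_pred, n_classes):
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    f1s = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / n_classes

# Toy example: 3 dialect classes, two hypothetical model variants.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
model_b = [[0.5, 0.4, 0.1], [0.4, 0.4, 0.2], [0.2, 0.1, 0.7]]
preds = ensemble_predict([model_a, model_b])
print(preds)                          # -> [0, 1, 2]
print(macro_f1([0, 1, 2], preds, 3))  # -> 1.0
```

In practice one would use `sklearn.metrics.f1_score(..., average='macro')` for scoring; the manual version above just makes the averaging explicit.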
Related papers
- ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks [10.679081563761793]
This paper describes Elyadata & LIA's joint submission to the NADI 2025 multi-dialectal Arabic speech processing shared task.
Our submission ranked first for the ADI subtask and second for the multi-dialectal Arabic ASR subtask among all participants.
arXiv Detail & Related papers (2025-11-13T08:44:39Z) - DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models [54.10223256792762]
We present DialectalArabicMMLU, a new benchmark for evaluating the performance of large language models (LLMs) across Arabic dialects.
We extend the MMLU-Redux framework through manual translation and adaptation of 3K multiple-choice question-answer pairs into five major dialects.
arXiv Detail & Related papers (2025-10-31T15:17:06Z) - The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties [107.57160730151975]
We construct a new test suite consisting of data from 200+ languages, accents, and dialects to evaluate SOTA multilingual speech models.
The best-performing submission achieved an absolute improvement in LID accuracy of 23% and a reduction in CER of 18%.
On accented and dialectal data, the best submission obtained 30.2% lower CER and 15.7% higher LID accuracy.
arXiv Detail & Related papers (2025-09-08T18:42:36Z) - LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models [59.0256377330646]
Lens is a benchmark with 3.4K contemporary images and 60K+ human-authored questions covering eight tasks and 12 daily scenarios.
The dataset intrinsically supports evaluating MLLMs on image-invariant prompts, from basic perception to compositional reasoning.
We evaluate 15+ frontier MLLMs such as Qwen2.5-VL-72B, InternVL3-78B, GPT-4o and two reasoning models, QVQ-72B-preview and Kimi-VL.
arXiv Detail & Related papers (2025-05-21T15:06:59Z) - Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks [112.7791602217381]
We present Dynamic-SUPERB Phase-2, an open benchmark for the comprehensive evaluation of instruction-based universal speech models.
Building upon the first generation, this second version incorporates 125 new tasks, expanding the benchmark to a total of 180 tasks.
Evaluation results indicate that none of the models performed well universally.
arXiv Detail & Related papers (2024-11-08T06:33:22Z) - NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task [28.40134178913119]
We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024)
NADI 2024 targeted both dialect identification cast as a multi-label task and identification of the Arabic level of dialectness.
Winning teams achieved 50.57 F1 on Subtask 1, 0.1403 RMSE on Subtask 2, and 20.44 BLEU on Subtask 3, respectively.
arXiv Detail & Related papers (2024-07-06T01:18:58Z) - SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages [64.10040374077994]
We introduce SEACrowd, a collaborative initiative that consolidates standardized corpora in nearly 1,000 languages across three modalities.
We assess the quality of AI models on 36 indigenous languages across 13 tasks, offering valuable insights into the current AI landscape in SEA.
arXiv Detail & Related papers (2024-06-14T15:23:39Z) - ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic [51.922112625469836]
We present ArabicMMLU, the first multi-task language understanding benchmark for the Arabic language.
Our data comprises 40 tasks and 14,575 multiple-choice questions in Modern Standard Arabic (MSA) and is carefully constructed by collaborating with native speakers in the region.
Our evaluations of 35 models reveal substantial room for improvement, particularly among the best open-source models.
arXiv Detail & Related papers (2024-02-20T09:07:41Z) - LyricSIM: A novel Dataset and Benchmark for Similarity Detection in
Spanish Song Lyrics [52.77024349608834]
We present a new dataset and benchmark tailored to the task of semantic similarity in song lyrics.
Our dataset, originally consisting of 2775 pairs of Spanish songs, was annotated in a collective annotation experiment by 63 native annotators.
arXiv Detail & Related papers (2023-06-02T07:48:20Z) - Dialect Identification in Nuanced Arabic Tweets Using Farasa
Segmentation and AraBERT [0.0]
This paper presents our approach to address the EACL WANLP-2021 Shared Task 1: Nuanced Arabic Dialect Identification (NADI)
The task is aimed at developing a system that identifies the geographical location (country/province) from which an Arabic tweet, written in Modern Standard Arabic or a dialect, originates.
arXiv Detail & Related papers (2021-02-19T05:39:21Z) - Arabic Speech Recognition by End-to-End, Modular Systems and Human [56.96327247226586]
We perform a comprehensive benchmarking of end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition.
For ASR, the end-to-end systems achieved WERs of 12.5%, 27.5%, and 23.8% -- new performance milestones for the MGB2, MGB3, and MGB5 challenges, respectively.
Our results suggest that human performance on Arabic is still considerably better than that of the machines, with an absolute WER gap of 3.6% on average.
arXiv Detail & Related papers (2021-01-21T05:55:29Z) - Arabic Dialect Identification Using BERT-Based Domain Adaptation [0.0]
Arabic is one of the world's most important and fastest-growing languages.
With the rise of social media platforms such as Twitter, Arabic spoken dialects have come into wider written use.
arXiv Detail & Related papers (2020-11-13T15:52:51Z) - Multi-Dialect Arabic BERT for Country-Level Dialect Identification [1.2928709656541642]
We present the experiments conducted, and the models developed by our competing team, Mawdoo3 AI.
The dialect identification subtask provides 21,000 country-level labeled tweets covering all 21 Arab countries.
We publicly release the pre-trained language model component of our winning solution under the name of Multi-dialect-Arabic-BERT model.
arXiv Detail & Related papers (2020-07-10T21:11:46Z) - Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text
Dataset [73.66530509749305]
This paper presents an exploration of end-to-end automatic speech recognition systems (ASR) for the largest open-source Russian language data set -- OpenSTT.
We evaluate different existing end-to-end approaches such as joint CTC/Attention, RNN-Transducer, and Transformer.
For the three available validation sets (phone calls, YouTube, and books), our best end-to-end model achieves word error rate (WER) of 34.8%, 19.1%, and 18.1%, respectively.
arXiv Detail & Related papers (2020-06-15T10:35:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.