Related papers: Towards spoken dialect identification of Irish

URL: http://arxiv.org/abs/2307.07436v1
Date: Fri, 14 Jul 2023 16:03:09 GMT
Title: Towards spoken dialect identification of Irish
Authors: Liam Lonergan, Mengjie Qian, Neasa Ní Chiaráin, Christer Gobl, Ailbhe Ní Chasaide
Abstract summary: The Irish language is rich in its diversity of dialects and accents. A recent study investigating dialect bias in Irish ASR found that performance for the Ulster dialect was consistently worse than for the Connacht or Munster dialects. The present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline.
Score: 5.1121440213561335
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Irish language is rich in its diversity of dialects and accents. This compounds the difficulty of creating a speech recognition system for the low-resource language, as such a system must contend with a high degree of variability with limited corpora. A recent study investigating dialect bias in Irish ASR found that balanced training corpora gave rise to unequal dialect performance, with performance for the Ulster dialect being consistently worse than for the Connacht or Munster dialects. Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline. Two acoustic classification models are tested, XLS-R and ECAPA-TDNN, in conjunction with a text-based classifier using a pretrained Irish-language BERT model. The ECAPA-TDNN, particularly a model pretrained for language identification on the VoxLingua107 dataset, performed best overall, with an accuracy of 73%. This was further improved to 76% by fusing the model's outputs with the text-based model. The Ulster dialect was most accurately identified, with an accuracy of 94%; however, the model struggled to disambiguate between the Connacht and Munster dialects, suggesting a more nuanced approach may be necessary to robustly distinguish between the dialects of Irish.
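
Illustrative sketch (not from the paper): the abstract states that the acoustic model's outputs were fused with the text-based model's, but it does not say how. One common option is late fusion by a weighted average of the two classifiers' class posteriors, sketched below. The function names, the weight of 0.7, the label order, and the example probabilities are all hypothetical and chosen only for illustration.

```python
# Late fusion of two dialect classifiers by weighted averaging of posteriors.
# ASSUMPTION: this is a generic technique, not the fusion method used in the paper.

DIALECTS = ("Ulster", "Connacht", "Munster")  # label order is an assumption


def fuse_posteriors(acoustic_probs, text_probs, acoustic_weight=0.7):
    """Return the renormalised weighted average of two posterior vectors."""
    if len(acoustic_probs) != len(text_probs):
        raise ValueError("both classifiers must score the same set of classes")
    if not 0.0 <= acoustic_weight <= 1.0:
        raise ValueError("acoustic_weight must lie in [0, 1]")
    fused = [
        acoustic_weight * a + (1.0 - acoustic_weight) * t
        for a, t in zip(acoustic_probs, text_probs)
    ]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalise so the result is a probability distribution again.
    return [p / total for p in fused]


def predict_dialect(acoustic_probs, text_probs, acoustic_weight=0.7):
    """Pick the dialect with the highest fused posterior."""
    fused = fuse_posteriors(acoustic_probs, text_probs, acoustic_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return DIALECTS[best]


if __name__ == "__main__":
    # Made-up posteriors: the acoustic model narrowly prefers Connacht,
    # while the text model clearly prefers Munster.
    acoustic = [0.10, 0.46, 0.44]
    text = [0.05, 0.25, 0.70]
    print(predict_dialect(acoustic, text))
```

In this made-up case the acoustic model alone would pick Connacht, while the fused decision is Munster, which shows how a second modality can tip a close Connacht/Munster call. In practice the weight would be tuned on held-out data.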

Related papers

Should LLMs, $\textit{like}$, Generate How Users Talk? Building Dialect-Accurate Dialog[ue]s Beyond the American Default with MDial [13.016574005932311]
More than 80% of the 1.6 billion English speakers do not use Standard American English. We introduce $\textbf{MDial}$, the first large-scale framework for generating multi-dialectal conversational data.
arXiv Detail & Related papers (2026-01-30T12:08:08Z)
DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation [111.94720088481614]
Can multimodal generative models effectively produce content given dialectal textual input? We construct a new large-scale benchmark spanning six common English dialects. We design a general encoder-based mitigation strategy for multimodal generative models.
arXiv Detail & Related papers (2025-10-16T17:56:55Z)
IRLBench: A Multi-modal, Culturally Grounded, Parallel Irish-English Benchmark for Open-Ended LLM Reasoning Evaluation [3.9530780161144667]
We present IRLBench, a benchmark presented in parallel English and Irish. Our benchmark consists of 12 representative subjects developed from the 2024 Irish Leaving Certificate exams. We show that models produce valid Irish responses less than 80% of the time, and answer correctly 55.8% of the time compared to 76.2% in English for the best-performing model.
arXiv Detail & Related papers (2025-05-16T00:02:05Z)
Task-Agnostic Low-Rank Adapters for Unseen English Dialects [52.88554155235167]
Large Language Models (LLMs) are trained on corpora disproportionally weighted in favor of Standard American English. By disentangling dialect-specific and cross-dialectal information, HyperLoRA improves generalization to unseen dialects in a task-agnostic fashion.
arXiv Detail & Related papers (2023-11-02T01:17:29Z)
Towards dialect-inclusive recognition in a low-resource language: are balanced corpora the answer? [5.1121440213561335]
This study is a diagnostic to quantify the effect of the speaker's dialect on recognition performance. 12 ASR systems were trained using dialect-balanced training corpora and modified versions of the baseline corpora. Results indicate that dialect-balanced corpora do not yield a similar performance across the dialects. There is a close relationship between the Connacht (Co) and Munster (Mu) dialects, but one that is not symmetrical.
arXiv Detail & Related papers (2023-07-14T12:18:38Z)
DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules [64.93179829965072]
DADA is a modular approach to imbue SAE-trained models with multi-dialectal robustness. We show that DADA is effective for both single-task and instruction-finetuned language models.
arXiv Detail & Related papers (2023-05-22T18:43:31Z)
Multi-VALUE: A Framework for Cross-Dialectal English NLP [49.55176102659081]
Multi-VALUE is a controllable rule-based translation system spanning 50 English dialects. Stress tests reveal significant performance disparities for leading models on non-standard dialects. We partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task.
arXiv Detail & Related papers (2022-12-15T18:17:01Z)
A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition [80.87085897419982]
We propose a novel acoustic modeling technique for accurate multi-dialect speech recognition with a single AM. Our proposed AM is dynamically adapted based on both dialect information and its internal representation, which results in a highly adaptive AM for handling multiple dialects simultaneously. The experimental results on large scale speech datasets show that the proposed AM outperforms all the previous ones, reducing word error rates (WERs) by 8.11% relative compared to a single all-dialects AM and by 7.31% relative compared to dialect-specific AMs.
arXiv Detail & Related papers (2022-05-06T06:07:09Z)
Automatic Dialect Density Estimation for African American English [74.44807604000967]
We explore automatic prediction of dialect density of the African American English (AAE) dialect. Dialect density is defined as the percentage of words in an utterance that contain characteristics of the non-standard dialect. We show a significant correlation between our predicted and ground truth dialect density measures for AAE speech in this database.
arXiv Detail & Related papers (2022-04-03T01:34:48Z)
English Accent Accuracy Analysis in a State-of-the-Art Automatic Speech Recognition System [3.4888132404740797]
We evaluate a state-of-the-art automatic speech recognition model, using unseen data from a corpus with a wide variety of labeled English accents. We show that there is indeed an accuracy bias in terms of accentual variety, favoring the accents most prevalent in the training corpus.
arXiv Detail & Related papers (2021-05-09T08:24:33Z)
Leveraging neural representations for facilitating access to untranscribed speech from endangered languages [10.61744395262441]
We use data selected from 7 Australian Aboriginal languages and a regional variety of Dutch. We find that representations from the middle layers of the wav2vec 2.0 Transformer offer large gains in task performance. While features extracted using the pre-trained English model yielded improved detection on all the evaluation languages, better detection performance was associated with the evaluation language's phonological similarity to English.
arXiv Detail & Related papers (2021-03-26T16:44:08Z)
Learning to Recognize Dialect Features [21.277962038423123]
We introduce the task of dialect feature detection, and present two multitask learning approaches. We train our models on a small number of minimal pairs, building on how linguists typically define dialect features.
arXiv Detail & Related papers (2020-10-23T23:25:00Z)
Unsupervised Cross-lingual Representation Learning for Speech Recognition [63.85924123692923]
XLSR learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages. We build on wav2vec 2.0 which is trained by solving a contrastive task over masked latent speech representations. Experiments show that cross-lingual pretraining significantly outperforms monolingual pretraining.
arXiv Detail & Related papers (2020-06-24T18:25:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.