Related papers: North Sámi Dialect Identification with Self-supervised Speech Models

URL: http://arxiv.org/abs/2305.11864v1
Date: Fri, 19 May 2023 17:53:12 GMT
Title: North Sámi Dialect Identification with Self-supervised Speech Models
Authors: Sofoklis Kakouros and Katri Hiovain-Asikainen
Abstract summary: The North Sámi (NS) language encapsulates four primary dialectal variants that are related but have differences in their phonology, morphology, and vocabulary. We investigate an extensive set of acoustic features, including MFCCs and prosodic features, for the automatic detection of the four NS variants. Our results show that NS dialects are influenced by the state language and that the four dialects are separable, reaching high classification accuracy.
Score: 1.1548853370822343
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary. The unique geopolitical location of NS speakers means that in many cases they are bilingual in Sámi as well as in the dominant state language: Norwegian, Swedish, or Finnish. This enables us to study the NS variants both with respect to the spoken state language and their acoustic characteristics. In this paper, we investigate an extensive set of acoustic features, including MFCCs and prosodic features, as well as state-of-the-art self-supervised representations, namely, XLS-R, WavLM, and HuBERT, for the automatic detection of the four NS variants. In addition, we examine how the majority state language is reflected in the dialects. Our results show that NS dialects are influenced by the state language and that the four dialects are separable, reaching high classification accuracy, especially with the XLS-R model.

Related papers

A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation [19.535404632372042]
Betthupferl is an evaluation dataset containing four hours of read speech in three dialect groups spoken in Southeast Germany. We provide both dialectal and Standard German transcriptions, and analyze the linguistic differences between them. We benchmark several multilingual state-of-the-art ASR models on speech translation into Standard German, and find differences between how much the output resembles the dialectal vs. standardized transcriptions.
arXiv Detail & Related papers (2025-06-03T14:02:52Z)
Literary and Colloquial Dialect Identification for Tamil using Acoustic Features [0.0]
Speech technology plays a role in preserving various dialects of a language from going extinct. The current work proposes a way to identify two popular and broadly classified Tamil dialects.
arXiv Detail & Related papers (2024-08-27T09:00:27Z)
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects [72.18753241750964]
Yorùbá is an African language with roughly 47 million speakers. Recent efforts to develop NLP technologies for African languages have focused on their standard dialects. We take steps towards bridging this gap by introducing a new high-quality parallel text and speech corpus.
arXiv Detail & Related papers (2024-06-27T22:38:04Z)
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios [76.11409260727459]
This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system. We demonstrate that the similarity in phonetics between the pre-training and target languages, as well as the language category, affects the target language's adaptation performance.
arXiv Detail & Related papers (2024-06-13T08:16:52Z)
What Do Dialect Speakers Want? A Survey of Attitudes Towards Language Technology for German Dialects [60.8361859783634]
We survey speakers of dialects and regional languages related to German. We find that respondents are especially in favour of potential NLP tools that work with dialectal input.
arXiv Detail & Related papers (2024-02-19T09:15:28Z)
Multi-VALUE: A Framework for Cross-Dialectal English NLP [49.55176102659081]
Multi-VALUE is a controllable rule-based translation system spanning 50 English dialects. Stress tests reveal significant performance disparities for leading models on non-standard dialects. We partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task.
arXiv Detail & Related papers (2022-12-15T18:17:01Z)
A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition [80.87085897419982]
We propose a novel acoustic modeling technique for accurate multi-dialect speech recognition with a single AM. Our proposed AM is dynamically adapted based on both dialect information and its internal representation, which results in a highly adaptive AM for handling multiple dialects simultaneously. The experimental results on large scale speech datasets show that the proposed AM outperforms all the previous ones, reducing word error rates (WERs) by 8.11% relative compared to a single all-dialects AM and by 7.31% relative compared to dialect-specific AMs.
arXiv Detail & Related papers (2022-05-06T06:07:09Z)
Quantifying Language Variation Acoustically with Few Resources [4.162663632560141]
Deep acoustic models might have learned linguistic information that transfers to low-resource languages. We compute pairwise pronunciation differences averaged over 10 words for over 100 individual dialects from four (regional) languages. Our results show that acoustic models outperform the (traditional) transcription-based approach without requiring phonetic transcriptions.
arXiv Detail & Related papers (2022-05-05T15:00:56Z)
Automatic Dialect Density Estimation for African American English [74.44807604000967]
We explore automatic prediction of dialect density of the African American English (AAE) dialect. Dialect density is defined as the percentage of words in an utterance that contain characteristics of the non-standard dialect. We show a significant correlation between our predicted and ground truth dialect density measures for AAE speech in this database.
arXiv Detail & Related papers (2022-04-03T01:34:48Z)
Cross-lingual Low Resource Speaker Adaptation Using Phonological Features [2.8080708404213373]
We train a language-agnostic multispeaker model conditioned on a set of phonologically derived features common across different languages. With as few as 32 and 8 utterances of target speaker data, we obtain high speaker similarity scores and naturalness comparable to the corresponding literature.
arXiv Detail & Related papers (2021-11-17T12:33:42Z)
Learning to Recognize Dialect Features [21.277962038423123]
We introduce the task of dialect feature detection, and present two multitask learning approaches. We train our models on a small number of minimal pairs, building on how linguists typically define dialect features.
arXiv Detail & Related papers (2020-10-23T23:25:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.