Analysis of French Phonetic Idiosyncrasies for Accent Recognition
- URL: http://arxiv.org/abs/2110.09179v1
- Date: Mon, 18 Oct 2021 10:50:50 GMT
- Title: Analysis of French Phonetic Idiosyncrasies for Accent Recognition
- Authors: Pierre Berjon, Avishek Nag, and Soumyabrata Dev
- Abstract summary: Differences in pronunciation, in accent and intonation of speech in general, create one of the most common problems of speech recognition.
We use traditional machine learning techniques and convolutional neural networks, and show that the classical techniques are not sufficiently efficient to solve this problem.
In this paper, we focus our attention on the French accent. We also identify its limitation by understanding the impact of French idiosyncrasies on its spectrograms.
- Score: 0.8602553195689513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech recognition systems have made tremendous progress over the last few
decades. They have improved significantly at identifying the speech of the
speaker. However, there is scope for improvement in how speech recognition systems
identify the nuances and accents of a speaker. It is known that any
specific natural language may possess at least one accent. Even when a word has
an identical phonemic composition, pronouncing it in different accents
produces sound waves that differ from each other. Differences in
pronunciation, in accent and intonation of speech in general, create one of the
most common problems of speech recognition. If a language has many accents,
a separate acoustic model should be created for each one. We carry out
a systematic analysis of the problem of accurately classifying accents.
We use traditional machine learning techniques and convolutional neural
networks, and show that the classical techniques are not sufficiently efficient
to solve this problem. Using spectrograms of speech signals, we propose a
multi-class classification framework for accent recognition. In this paper, we
focus our attention on the French accent. We also identify its limitation by
understanding the impact of French idiosyncrasies on its spectrograms.
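The abstract does not say how the spectrograms are computed. As a rough illustration only, the sketch below builds a magnitude spectrogram with a naive short-time DFT in pure Python. The function names (`hann`, `spectrogram`), the frame length, the hop size, and the synthetic test tone are all assumptions made for this example and are not taken from the paper; a real pipeline would more likely use an FFT-based library routine.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes
    (the non-negative frequency bins).
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Taper the frame to reduce spectral leakage at its edges.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (not from the paper).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = spectrogram(tone)

# Index of the strongest frequency bin in each frame.
peak_bins = [frame.index(max(frame)) for frame in spec]
```

Each frame of `spec` is one column of the time-frequency image. A classifier such as a convolutional network would typically take a stack of these columns as its input.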
Related papers
- Accent conversion using discrete units with parallel data synthesized from controllable accented TTS [56.18382038512251]
The goal of accent conversion (AC) is to convert speech accents while preserving content and speaker identity.
Previous methods either required reference utterances during inference, did not preserve speaker identity well, or used one-to-one systems that could only be trained for each non-native accent.
This paper presents a promising AC model that can convert many accents into native to overcome these issues.
arXiv Detail & Related papers (2024-09-30T19:52:10Z)
- Literary and Colloquial Dialect Identification for Tamil using Acoustic Features [0.0]
Speech technology plays a role in preserving various dialects of a language from going extinct.
The current work proposes a way to identify two popular and broadly classified Tamil dialects.
arXiv Detail & Related papers (2024-08-27T09:00:27Z)
- Accented Speech Recognition With Accent-specific Codebooks [53.288874858671576]
Speech accents pose a significant challenge to state-of-the-art automatic speech recognition (ASR) systems.
Degradation in performance across underrepresented accents is a severe deterrent to the inclusive adoption of ASR.
We propose a novel accent adaptation approach for end-to-end ASR systems using cross-attention with a trainable set of codebooks.
arXiv Detail & Related papers (2023-10-24T16:10:58Z)
- Voice-preserving Zero-shot Multiple Accent Conversion [14.218374374305421]
An accent conversion system changes a speaker's accent but preserves that speaker's voice identity.
We use adversarial learning to disentangle accent dependent features while retaining other acoustic characteristics.
Our model generates audio that sounds closer to the target accent and like the original speaker.
arXiv Detail & Related papers (2022-11-23T19:51:16Z)
- Accented Speech Recognition under the Indian context [0.0]
Accent forms an integral part of identifying cultures, emotions, behaviors, etc.
People often perceive each other in a different manner due to their accent.
The accent itself can be a conveyor of status, pride, and other emotional information which can be captured through speech itself.
arXiv Detail & Related papers (2022-09-08T12:59:14Z)
- Self-Supervised Speech Representation Learning: A Review [105.1545308184483]
Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains.
Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods.
This review presents approaches for self-supervised speech representation learning and their connection to other research areas.
arXiv Detail & Related papers (2022-05-21T16:52:57Z)
- Accented Speech Recognition Inspired by Human Perception [0.0]
This paper explores methods that are inspired by human perception to evaluate possible performance improvements for recognition of accented speech.
We explore four methodologies: pre-exposure to multiple accents, grapheme and phoneme-based pronunciations, dropout, and the identification of the layers in the neural network that can specifically be associated with accent modeling.
Our results indicate that methods based on human perception are promising in reducing WER and understanding how accented speech is modeled in neural networks for novel accents.
arXiv Detail & Related papers (2021-04-09T22:35:09Z)
- Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification [81.80311855996584]
We propose a novel intent classification framework that employs acoustic features extracted from a pretrained speech recognition system and linguistic features learned from a pretrained language model.
We achieve 90.86% and 99.07% accuracy on ATIS and Fluent speech corpus, respectively.
arXiv Detail & Related papers (2021-02-15T07:20:06Z)
- Deep Discriminative Feature Learning for Accent Recognition [14.024346215923972]
We adopt a Convolutional Recurrent Neural Network as the front-end encoder and integrate local features using a Recurrent Neural Network to make an utterance-level accent representation.
We show that our proposed network with a discriminative training method is significantly ahead of the baseline system on the accent classification track in the Accented English Speech Recognition Challenge 2020.
arXiv Detail & Related papers (2020-11-25T00:46:47Z)
- An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation [57.68765353264689]
Speech enhancement and speech separation are two related tasks.
Traditionally, these tasks have been tackled using signal processing and machine learning techniques.
Deep learning has been exploited to achieve strong performance.
arXiv Detail & Related papers (2020-08-21T17:24:09Z)
- "Notic My Speech" -- Blending Speech Patterns With Multimedia [65.91370924641862]
We propose a view-temporal attention mechanism to model both the view dependence and the visemic importance in speech recognition and understanding.
Our proposed method outperformed the existing work by 4.99% in terms of the viseme error rate.
We show that there is a strong correlation between our model's understanding of multi-view speech and human perception.
arXiv Detail & Related papers (2020-06-12T06:51:55Z)