BiPhone: Modeling Inter Language Phonetic Influences in Text
- URL: http://arxiv.org/abs/2307.03322v1
- Date: Thu, 6 Jul 2023 22:31:55 GMT
- Title: BiPhone: Modeling Inter Language Phonetic Influences in Text
- Authors: Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James
S. Ren, Ambarish Jash, Sukhdeep S. Sodhi, and Aravindan Raghuveer
- Abstract summary: A large number of people are forced to use the Web in a language they have low literacy in due to technology asymmetries.
Written text in the second language (L2) from such users often contains a large number of errors that are influenced by their native language (L1).
We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2.
These confusions are then plugged into a generative model (Bi-Phone) for synthetically producing corrupted L2 text.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A large number of people are forced to use the Web in a language they have
low literacy in due to technology asymmetries. Written text in the second
language (L2) from such users often contains a large number of errors that are
influenced by their native language (L1). We propose a method to mine phoneme
confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of
L1 and L2. These confusions are then plugged into a generative model (Bi-Phone)
for synthetically producing corrupted L2 text. Through human evaluations, we
show that Bi-Phone generates plausible corruptions that differ across L1s and
also have widespread coverage on the Web. We also corrupt the popular language
understanding benchmark SuperGLUE with our technique (FunGLUE for Phonetically
Noised GLUE) and show that SoTA language understanding models perform poorly. We
also introduce a new phoneme prediction pre-training task which helps byte
models to recover performance close to SuperGLUE. Finally, we also release the
FunGLUE benchmark to promote further research in phonetically robust language
models. To the best of our knowledge, FunGLUE is the first benchmark to
introduce L1-L2 interactions in text.
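The corruption step the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's actual Bi-Phone model: the confusion table entries and probabilities below are invented examples of the kind of L1-dependent phoneme confusions the method mines, and the corruption is a simple independent per-phoneme substitution.

```python
import random

# Hypothetical mined confusion table for one illustrative L1-L2 pair:
# each L2 phoneme maps to (confusable phoneme, substitution probability)
# pairs. These entries are invented for illustration only.
CONFUSIONS = {
    "IH": [("IY", 0.3)],   # e.g. "ship" perceived as "sheep"
    "V":  [("W", 0.25)],
    "Z":  [("S", 0.2)],
}

def corrupt(phonemes, confusions, rng):
    """Independently replace each phoneme with a confusable one,
    sampling according to the mined substitution probabilities."""
    out = []
    for p in phonemes:
        for alt, prob in confusions.get(p, []):
            if rng.random() < prob:
                p = alt
                break
        out.append(p)
    return out

rng = random.Random(0)
print(corrupt(["SH", "IH", "P"], CONFUSIONS, rng))
```

A full pipeline would additionally map the corrupted phoneme sequence back to a plausible spelling to produce noised text.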
Related papers
- Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Native Language Identification with Large Language Models [60.80452362519818]
We show that GPT models are proficient at NLI classification, with GPT-4 setting a new performance record of 91.7% on the benchmark test set in a zero-shot setting.
We also show that unlike previous fully-supervised settings, LLMs can perform NLI without being limited to a set of known classes.
arXiv Detail & Related papers (2023-12-13T00:52:15Z) - On Bilingual Lexicon Induction with Large Language Models [81.6546357879259]
We examine the potential of the latest generation of Large Language Models for the development of bilingual lexicons.
We study 1) zero-shot prompting for unsupervised BLI and 2) few-shot in-context prompting with a set of seed translation pairs.
Our work is the first to demonstrate strong BLI capabilities of text-to-text mLLMs.
arXiv Detail & Related papers (2023-10-21T12:43:27Z) - L1-aware Multilingual Mispronunciation Detection Framework [10.15106073866792]
This paper introduces a novel multilingual MDD architecture, L1-MultiMDD, enriched with L1-aware speech representation.
An end-to-end speech encoder is trained on the input signal and its corresponding reference phoneme sequence.
Experiments demonstrate the effectiveness of the proposed L1-MultiMDD framework on both seen (L2-ARCTIC, LATIC, and AraVoiceL2v2) and unseen (EpaDB and Speechocean762) datasets.
arXiv Detail & Related papers (2023-09-14T13:53:17Z) - The Effects of Input Type and Pronunciation Dictionary Usage in Transfer
Learning for Low-Resource Text-to-Speech [1.1852406625172218]
We compare phone labels and articulatory features as input for cross-lingual transfer learning in text-to-speech for low-resource languages (LRLs).
Experiments with FastSpeech 2 and the LRL West Frisian show that using articulatory features outperformed using phone labels in both intelligibility and naturalness.
arXiv Detail & Related papers (2023-06-01T10:42:56Z) - SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with
BERT [0.0]
Cross-linguistic transfer is the influence of linguistic structure of a speaker's native language on the successful acquisition of a foreign language.
We find that NLP literature has not given enough attention to the phenomenon of negative transfer.
Our findings call for further research using our novel Transformer-based SLA models.
arXiv Detail & Related papers (2023-05-31T06:22:07Z) - Improving Automatic Speech Recognition for Non-Native English with
Transfer Learning and Language Model Decoding [6.68194398006805]
We investigate fine-tuning of a pre-trained wav2vec 2.0 model under a rich set of L1 and L2 training conditions.
We find that while the large self-trained wav2vec 2.0 may be internalizing sufficient decoding knowledge for clean L1 speech, this does not hold for L2 speech.
arXiv Detail & Related papers (2022-02-10T18:13:32Z) - Towards Language Modelling in the Speech Domain Using Sub-word
Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z) - Universal Phone Recognition with a Multilingual Allophone System [135.2254086165086]
We propose a joint model of language-independent phone and language-dependent phoneme distributions.
In multilingual ASR experiments over 11 languages, we find that this model improves testing performance by 2% phoneme error rate absolute.
Our recognizer achieves phone accuracy improvements of more than 17%, moving a step closer to speech recognition for all languages in the world.
arXiv Detail & Related papers (2020-02-26T21:28:57Z) - Towards Zero-shot Learning for Automatic Phonemic Transcription [82.9910512414173]
A more challenging problem is to build phonemic transcribers for languages with zero training data.
Our model is able to recognize unseen phonemes in the target language without any training data.
It achieves 7.7% better phoneme error rate on average over a standard multilingual model.
arXiv Detail & Related papers (2020-02-26T20:38:42Z)
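Several of the papers above report phoneme error rate (PER). As a reference for that metric, a minimal implementation is the Levenshtein (edit) distance between reference and hypothesis phoneme sequences, normalized by reference length; the example sequences below are illustrative only.

```python
def phoneme_error_rate(ref, hyp):
    """PER: minimum number of substitutions, insertions, and deletions
    turning hyp into ref, divided by the reference length."""
    m, n = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[m][n] / m

# One substitution out of three reference phonemes.
print(phoneme_error_rate(["K", "AE", "T"], ["K", "AH", "T"]))
```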
This list is automatically generated from the titles and abstracts of the papers in this site.