PhoniTale: Phonologically Grounded Mnemonic Generation for Typologically Distant Language Pairs
- URL: http://arxiv.org/abs/2507.05444v3
- Date: Mon, 13 Oct 2025 06:35:48 GMT
- Title: PhoniTale: Phonologically Grounded Mnemonic Generation for Typologically Distant Language Pairs
- Authors: Sana Kang, Myeongseok Gwon, Su Young Kwon, Jaewook Lee, Andrew Lan, Bhiksha Raj, Rita Singh
- Abstract summary: Large language models (LLMs) have been used to generate keyword mnemonics by leveraging similar keywords from a learner's first language (L1) to aid in acquiring L2 vocabulary. We present PhoniTale, a novel cross-lingual mnemonic generation system that performs IPA-based phonological adaptation and syllable-aware alignment to retrieve L1 keyword sequences. Our findings show that PhoniTale consistently outperforms previous automated approaches and achieves quality comparable to human-written mnemonics.
- Score: 51.745816131869674
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vocabulary acquisition poses a significant challenge for second-language (L2) learners, especially when learning typologically distant languages such as English and Korean, where phonological and structural mismatches complicate vocabulary learning. Recently, large language models (LLMs) have been used to generate keyword mnemonics by leveraging similar keywords from a learner's first language (L1) to aid in acquiring L2 vocabulary. However, most methods still rely on direct IPA-based phonetic matching or employ LLMs without phonological guidance. In this paper, we present PhoniTale, a novel cross-lingual mnemonic generation system that performs IPA-based phonological adaptation and syllable-aware alignment to retrieve L1 keyword sequences and uses LLMs to generate verbal cues. We evaluate PhoniTale through automated metrics and a short-term recall test with human participants, comparing its output to human-written and prior automated mnemonics. Our findings show that PhoniTale consistently outperforms previous automated approaches and achieves quality comparable to human-written mnemonics.
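The paper's pipeline is not reproduced here, but its retrieval step (matching an L2 word's pronunciation against an L1 keyword lexicon in IPA) can be sketched as a toy similarity search. All lexicon entries, transcriptions, and function names below are illustrative assumptions, not taken from the paper; the actual system uses syllable-aware alignment rather than a flat string ratio.

```python
from difflib import SequenceMatcher

# Toy L1 (Korean) keyword lexicon with rough IPA transcriptions.
# Entries and transcriptions are illustrative only.
L1_KEYWORDS = {
    "마음": "maɯm",   # "mind"
    "버터": "pʌtʰʌ",  # "butter" (loanword)
    "나비": "nabi",   # "butterfly"
}

def phonological_similarity(ipa_a: str, ipa_b: str) -> float:
    """Crude similarity over IPA symbol sequences, in [0, 1]."""
    return SequenceMatcher(None, ipa_a, ipa_b).ratio()

def retrieve_keyword(l2_ipa: str) -> str:
    """Pick the L1 keyword whose pronunciation best matches the L2 word."""
    return max(L1_KEYWORDS,
               key=lambda w: phonological_similarity(L1_KEYWORDS[w], l2_ipa))

# For English "butter", the closest-sounding toy entry is the loanword.
print(retrieve_keyword("bʌtʌr"))
```

A real system would score syllable-aligned IPA segments with phonological feature distances instead of raw symbol overlap, but the retrieval shape is the same.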
Related papers
- WordCraft: Scaffolding the Keyword Method for L2 Vocabulary Learning with Multimodal LLMs [23.902522302562634]
We introduce WordCraft, a learner-centered interactive tool powered by Multimodal Large Language Models (MLLMs). WordCraft scaffolds the keyword method by guiding learners through keyword selection, association construction, and image formation. Two user studies demonstrate that WordCraft not only preserves the generation effect but also achieves high levels of effectiveness and usability.
arXiv Detail & Related papers (2026-01-31T14:59:43Z) - ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models [70.56468982313834]
We propose ProsodyLM, which introduces a simple tokenization scheme amenable to learning prosody. We find that ProsodyLM can learn surprisingly diverse emerging prosody processing capabilities through pre-training alone.
arXiv Detail & Related papers (2025-07-27T00:59:01Z) - Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically [58.019484208091534]
Cross-lingual alignment in pretrained language models (LMs) has enabled efficient transfer in text-based LMs. It remains an open question whether findings and methods from text-based cross-lingual alignment apply to speech.
arXiv Detail & Related papers (2025-05-26T07:21:20Z) - Are BabyLMs Second Language Learners? [48.85680614529188]
This paper describes a linguistically-motivated approach to the 2024 edition of the BabyLM Challenge.
Rather than pursuing a first language learning (L1) paradigm, we approach the challenge from a second language (L2) learning perspective.
arXiv Detail & Related papers (2024-10-28T17:52:15Z) - Exploring Automated Keyword Mnemonics Generation with Large Language Models via Overgenerate-and-Rank [4.383205675898942]
Keyword mnemonics are a technique for memorizing vocabulary through memorable associations with a target word via a verbal cue.
We propose a novel overgenerate-and-rank method via prompting large language models to generate verbal cues.
Results show that LLM-generated mnemonics are comparable to human-generated ones in terms of imageability, coherence, and perceived usefulness.
arXiv Detail & Related papers (2024-09-21T00:00:18Z) - PhonologyBench: Evaluating Phonological Skills of Large Language Models [57.80997670335227]
Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research.
We present PhonologyBench, a novel benchmark consisting of three diagnostic tasks designed to explicitly test the phonological skills of LLMs.
We observe significant gaps of 17% and 45% on rhyme word generation and syllable counting, respectively, when compared to humans.
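Syllable counting, one of the tasks where the benchmark reports a large model-human gap, has a simple heuristic baseline worth keeping in mind: count vowel groups and discount a silent final "e". The function below is a rough illustrative sketch, not the benchmark's scoring method, and it fails on irregular words.

```python
import re

def count_syllables(word: str) -> int:
    """Naive heuristic: count vowel groups, dropping a silent final 'e'.

    Works for regular words like 'banana'; fails on irregular ones
    (e.g. 'rhythm'), which is exactly where such heuristics break down.
    """
    w = word.lower()
    if w.endswith("e") and not w.endswith(("le", "ee")):
        w = w[:-1]  # silent final 'e', as in 'make'
    return max(1, len(re.findall(r"[aeiouy]+", w)))

for w in ("cat", "table", "banana"):
    print(w, count_syllables(w))
```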
arXiv Detail & Related papers (2024-04-03T04:53:14Z) - Introducing Syllable Tokenization for Low-resource Languages: A Case Study with Swahili [29.252250069388687]
Tokenization allows for the words to be split based on characters or subwords, creating word embeddings that best represent the structure of the language.
We propose a syllable tokenizer and adopt an experiment-centric approach to validate the proposed tokenizer based on the Swahili language.
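Swahili's phonotactics make it a natural fit for syllable tokenization: syllables are overwhelmingly open (CV or CCV). The greedy regex split below is an illustrative approximation of that idea, not the tokenizer proposed in the paper, and it ignores syllabic nasals and digraphs.

```python
import re

# Greedy onset-maximal split: zero or more consonants followed by a
# vowel, with any trailing consonants kept as a final chunk.
# An approximation only; real Swahili syllabification also handles
# syllabic nasals (m, n) and digraphs (ch, ng', sh).
SYLLABLE = re.compile(r"[^aeiou]*[aeiou]|[^aeiou]+$", re.IGNORECASE)

def syllabify(word: str) -> list[str]:
    """Split a Swahili word into approximate syllables."""
    return SYLLABLE.findall(word)

print(syllabify("habari"))   # greeting, "news"
print(syllabify("mwalimu"))  # "teacher"
```

Tokens produced this way stay aligned with the language's CV structure, which is the intuition behind syllable-level embeddings for Swahili.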
arXiv Detail & Related papers (2024-03-26T17:26:50Z) - Decomposed Prompting: Probing Multilingual Linguistic Structure Knowledge in Large Language Models [54.58989938395976]
We introduce a decomposed prompting approach for sequence labeling tasks. We test our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages.
arXiv Detail & Related papers (2024-02-28T15:15:39Z) - Information-Theoretic Characterization of Vowel Harmony: A Cross-Linguistic Study on Word Lists [18.138642719651994]
We define an information-theoretic measure of harmonicity based on predictability of vowels in a natural language lexicon.
We estimate this harmonicity using phoneme-level language models (PLMs)
Our work demonstrates that word lists are a valuable resource for typological research.
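The core idea, that vowel harmony makes the next vowel in a word more predictable, can be illustrated with a toy conditional-entropy estimate over the vowel tier. The word lists, vowel inventory, and bigram estimator below are invented for illustration; the paper uses phoneme-level language models over real lexicons.

```python
from collections import Counter
from math import log2

# Toy lexicons: the "harmonic" list keeps front/back vowels consistent
# within a word (loosely Finnish-like); the control mixes them.
# Both lists are fabricated for illustration.
HARMONIC = ["kala", "kylä", "talo", "pöytä", "sana", "yötä"]
CONTROL  = ["kalö", "kyla", "tälo", "poytä", "säna", "yota"]
VOWELS = set("aeiouyäö")

def vowel_bigram_entropy(words) -> float:
    """Conditional entropy H(V_i | V_{i-1}) over vowel-tier bigrams."""
    pairs = Counter()
    for w in words:
        tier = [c for c in w if c in VOWELS]  # keep only the vowels
        pairs.update(zip(tier, tier[1:]))
    total = sum(pairs.values())
    ctx = Counter()
    for (prev, _), n in pairs.items():
        ctx[prev] += n
    # -sum p(prev, cur) * log2 p(cur | prev)
    return -sum(n / total * log2(n / ctx[prev])
                for (prev, _), n in pairs.items())

# A harmonic lexicon should yield lower entropy (more predictable vowels).
print(vowel_bigram_entropy(HARMONIC), vowel_bigram_entropy(CONTROL))
```

A phoneme-level LM generalizes this bigram estimate to longer contexts, but the harmonicity signal measured is the same: lower surprisal on vowels in harmonic lexicons.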
arXiv Detail & Related papers (2023-08-09T11:32:16Z) - Multilingual context-based pronunciation learning for Text-to-Speech [13.941800219395757]
Phonetic information and linguistic knowledge are an essential component of a Text-to-speech (TTS) front-end.
We showcase a multilingual unified front-end system that addresses any pronunciation related task, typically handled by separate modules.
We find that the multilingual model is competitive across languages and tasks; however, some trade-offs exist when compared to equivalent monolingual solutions.
arXiv Detail & Related papers (2023-07-31T14:29:06Z) - SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues [2.8047215329139976]
We propose an end-to-end pipeline for auto-generating verbal and visual cues for keyword mnemonics. Our approach can automatically generate highly memorable cues.
arXiv Detail & Related papers (2023-05-11T20:58:10Z) - Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training [66.64843711515341]
Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text.
We call attention to a new setting named multilingual keyphrase generation.
We propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages.
arXiv Detail & Related papers (2022-05-21T00:45:21Z) - Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z) - Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks [58.24134321728942]
We compare and identify cognitive aspects of deep neural-based visual lip-reading models.
We observe a strong correlation between theories of critical learning periods in cognitive psychology and our modeling.
arXiv Detail & Related papers (2021-10-13T05:30:50Z) - Applying Phonological Features in Multilingual Text-To-Speech [2.567123525861164]
We present a mapping of ARPABET/pinyin to SAMPA/SAMPA-SC and then to phonological features.
We tested whether this mapping could lead to the successful generation of native, non-native, and code-switched speech in the two languages.
arXiv Detail & Related papers (2021-10-07T16:37:01Z) - Automatically Identifying Language Family from Acoustic Examples in Low
Resource Scenarios [48.57072884674938]
We propose a method to analyze language similarity using deep learning.
Namely, we train a model on the Wilderness dataset and investigate how its latent space compares with classical language family findings.
arXiv Detail & Related papers (2020-12-01T22:44:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.