A Computational Model for the Assessment of Mutual Intelligibility Among
Closely Related Languages
- URL: http://arxiv.org/abs/2402.02915v1
- Date: Mon, 5 Feb 2024 11:32:13 GMT
- Title: A Computational Model for the Assessment of Mutual Intelligibility Among
Closely Related Languages
- Authors: Jessica Nieder and Johann-Mattis List
- Abstract summary: Closely related languages show linguistic similarities that allow speakers of one language to understand speakers of another language without having actively learned it.
Mutual intelligibility varies in degree and is typically tested in psycholinguistic experiments.
We propose a computer-assisted method using the Linear Discriminative Learner to approximate the cognitive processes by which humans learn languages.
- Score: 1.5773159234875098
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Closely related languages show linguistic similarities that allow speakers of
one language to understand speakers of another language without having actively
learned it. Mutual intelligibility varies in degree and is typically tested in
psycholinguistic experiments. To study mutual intelligibility computationally,
we propose a computer-assisted method using the Linear Discriminative Learner,
a computational model developed to approximate the cognitive processes by which
humans learn languages, which we expand with multilingual semantic vectors and
multilingual sound classes. We test the model on cognate data from German,
Dutch, and English, three closely related Germanic languages. We find that our
model's comprehension accuracy depends on 1) the automatic trimming of
inflections and 2) the language pair for which comprehension is tested. Our
multilingual modelling approach not only offers new methodological findings
for automatic testing of mutual intelligibility across languages but also
extends the use of Linear Discriminative Learning to multilingual settings.
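The comprehension step described in the abstract can be illustrated with a minimal sketch of the linear form-to-meaning mapping that Linear Discriminative Learning is built on. This is not the authors' implementation: the word lists, the letter-bigram cues, and the random semantic vectors are all toy assumptions made here for illustration, whereas the paper uses multilingual sound classes and multilingual semantic vectors. The mapping is fitted on forms from one language and then applied to cognate forms from another.

```python
import numpy as np

def bigrams(word):
    """Split a word into letter bigrams, with '#' marking word boundaries."""
    padded = f"#{word}#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]

def cue_matrix(words, cues):
    """Binary words-by-cues matrix; cues unseen in training are ignored."""
    index = {cue: j for j, cue in enumerate(cues)}
    C = np.zeros((len(words), len(cues)))
    for i, word in enumerate(words):
        for cue in bigrams(word):
            if cue in index:
                C[i, index[cue]] = 1.0
    return C

# Toy data (assumption): orthographic forms stand in for sound-class strings.
train_words = ["wasser", "haus", "buch", "hand"]   # listener's language
meanings = ["WATER", "HOUSE", "BOOK", "HAND"]
test_words = ["water", "huis", "boek", "hand"]     # cognates in another language

cues = sorted({cue for word in train_words for cue in bigrams(word)})
C = cue_matrix(train_words, cues)

# Toy semantic vectors (assumption): random stand-ins for real embeddings.
rng = np.random.default_rng(0)
S = rng.normal(size=(len(meanings), 8))

# Comprehension mapping F: least-squares solution of C @ F = S.
F = np.linalg.pinv(C) @ S

def comprehend(words):
    """Map forms to predicted semantic vectors; return the closest meaning."""
    S_hat = cue_matrix(words, cues) @ F
    a = S_hat / (np.linalg.norm(S_hat, axis=1, keepdims=True) + 1e-12)
    b = S / np.linalg.norm(S, axis=1, keepdims=True)
    return [meanings[k] for k in (a @ b.T).argmax(axis=1)]

train_predictions = comprehend(train_words)
test_predictions = comprehend(test_words)
accuracy = np.mean([p == m for p, m in zip(test_predictions, meanings)])
```

In this toy setup, `accuracy` is the share of cognate forms mapped to the intended meaning. How closely such a number tracks human mutual intelligibility is the question the paper studies, and nothing here should be read as reproducing its results.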
Related papers
- Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z)
- The Role of Language Imbalance in Cross-lingual Generalisation: Insights from Cloned Language Experiments [57.273662221547056]
In this study, we investigate an unintuitive novel driver of cross-lingual generalisation: language imbalance.
We observe that the existence of a predominant language during training boosts the performance of less frequent languages.
As we extend our analysis to real languages, we find that infrequent languages still benefit from frequent ones, yet whether language imbalance causes cross-lingual generalisation there is not conclusive.
arXiv Detail & Related papers (2024-04-11T17:58:05Z)
- The Less the Merrier? Investigating Language Representation in Multilingual Models [8.632506864465501]
We investigate the linguistic representation of different languages in multilingual models.
We observe from our experiments that community-centered models perform better at distinguishing between languages in the same family for low-resource languages.
arXiv Detail & Related papers (2023-10-20T02:26:34Z)
- Are Mutually Intelligible Languages Easier to Translate? [30.41671642147019]
We show that the amount of data needed to train a neural machine translation model is anti-proportional to the languages' mutual intelligibility.
Experiments on the Romance language group reveal that there is indeed strong correlation between the area under a model's learning curve and mutual intelligibility scores obtained by studying human speakers.
arXiv Detail & Related papers (2022-01-31T09:22:23Z)
- Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification [73.5497360800395]
We develop an end-to-end system that supports multiple languages.
We exploit knowledge from a pre-trained multi-lingual natural language processing model.
arXiv Detail & Related papers (2021-09-28T04:43:11Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification [16.369477141866405]
We present a neural model for Slavic language identification in speech signals.
We analyze its emergent representations to investigate whether they reflect objective measures of language relatedness.
arXiv Detail & Related papers (2020-10-22T18:18:19Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.