Language Models as Artificial Learners: Investigating Crosslinguistic Influence
- URL: http://arxiv.org/abs/2601.21587v1
- Date: Thu, 29 Jan 2026 11:53:48 GMT
- Title: Language Models as Artificial Learners: Investigating Crosslinguistic Influence
- Authors: Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis
- Abstract summary: We study the effect of varying the L1 language dominance and the L2 language proficiency. Using cross-linguistic priming, we analyze how activating L1 structures impacts L2 processing.
- Score: 11.168086425477467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the centrality of crosslinguistic influence (CLI) to bilingualism research, human studies often yield conflicting results due to inherent experimental variance. We address these inconsistencies by using language models (LMs) as controlled statistical learners to systematically simulate CLI and isolate its underlying drivers. Specifically, we study the effect of varying the L1 language dominance and the L2 language proficiency, which we manipulate by controlling the L2 age of exposure -- defined as the training step at which the L2 is introduced. Furthermore, we investigate the impact of pretraining on L1 languages with varying syntactic distance from the L2. Using cross-linguistic priming, we analyze how activating L1 structures impacts L2 processing. Our results align with evidence from psycholinguistic studies, confirming that language dominance and proficiency are strong predictors of CLI. We further find that while priming of grammatical structures is bidirectional, the priming of ungrammatical structures is sensitive to language dominance. Finally, we provide mechanistic evidence of CLI in LMs, demonstrating that the L1 is co-activated during L2 processing and directly influences the neural circuitry recruited for the L2. More broadly, our work demonstrates that LMs can serve as a computational framework to inform theories of human CLI.
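No code accompanies this listing, but the priming analysis described in the abstract can be sketched with off-the-shelf tools: compare the surprisal an LM assigns to an L2 target sentence after a structurally congruent L1 prime versus after an unrelated L1 baseline, reading a drop in surprisal as evidence of cross-linguistic priming. This is a minimal sketch under stated assumptions, not the authors' implementation; the checkpoint name and the German/English sentences are illustrative placeholders.

```python
# Minimal sketch (not the paper's code): cross-linguistic structural priming as a
# surprisal difference. The bilingual checkpoint and the sentences are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "path/to/bilingual-lm"  # hypothetical bilingual (L1 + L2) checkpoint
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def surprisal(context: str, target: str) -> float:
    """Mean negative log-probability (nats) of `target` tokens given `context`."""
    ctx = tok(context, return_tensors="pt").input_ids
    tgt = tok(" " + target, return_tensors="pt", add_special_tokens=False).input_ids
    ids = torch.cat([ctx, tgt], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    start = ctx.shape[1]
    lp = log_probs[start - 1 :].gather(1, ids[0, start:].unsqueeze(1))
    return -lp.mean().item()

# Illustrative items: a German passive prime vs. an unrelated German baseline,
# and an English passive target.
l1_prime    = "Das Buch wurde von dem Lehrer gelesen."
l1_baseline = "Der Hund schläft im Garten."
l2_target   = "The letter was written by the student."

effect = surprisal(l1_baseline, l2_target) - surprisal(l1_prime, l2_target)
print(f"priming effect: {effect:+.4f} nats")  # > 0 suggests the L1 structure primes the L2
```

In this framing, the L2 age of exposure manipulated in the paper would enter upstream, as the training step at which L2 batches first appear in the pretraining mix.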
Related papers
- Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models [64.54005959758733]
We introduce code-switching in-context learning (CSICL) as a principled and robust approach for overcoming the translation barrier during inference. We conduct extensive experiments across 4 LLMs, 6 datasets, and 10 languages, spanning both knowledge-intensive and reasoning-oriented domains. Our results demonstrate that CSICL consistently outperforms X-ICL baselines, achieving gains of 3.1%p and 1.9%p in target and unseen languages, respectively.
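The summary names the idea but not the prompt layout. One plausible format, assumed here rather than taken from the paper, keeps the question and answer in the target language while the demonstrated reasoning code-switches into English:

```python
# Illustrative sketch of a code-switched few-shot prompt; the template and field
# names are assumptions, not the CSICL paper's exact format.
def csicl_prompt(demos, question, scaffold_lang="English", target_lang="Swahili"):
    """demos: (question_in_target, reasoning_in_scaffold, answer_in_target) triples."""
    parts = []
    for q, reasoning, a in demos:
        parts.append(
            f"Question ({target_lang}): {q}\n"
            f"Reasoning ({scaffold_lang}): {reasoning}\n"
            f"Answer ({target_lang}): {a}\n"
        )
    parts.append(f"Question ({target_lang}): {question}\nReasoning ({scaffold_lang}):")
    return "\n".join(parts)
```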
arXiv Detail & Related papers (2025-10-07T08:35:42Z)
- Do Self-Supervised Speech Models Exhibit the Critical Period Effects in Language Acquisition? [13.643286736802414]
We ask whether the Critical Period (CP) effects observed in human language acquisition also arise in self-supervised speech models (S3Ms). CP effects refer to greater difficulty in acquiring a second language (L2) with delayed L2 exposure onset, and greater retention of the first language (L1) with delayed L1 exposure offset. We train S3Ms with varying L2 training onsets and L1 training offsets on child-directed speech and evaluate their phone discrimination performance. We find that S3Ms do not exhibit clear evidence of either CP effect in phonological acquisition.
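A toy version of the onset/offset manipulation could look like the sketch below; the step counts and the per-step sampling rule are illustrative assumptions, not the paper's training settings.

```python
# Illustrative data schedule: L2 enters at a chosen onset step, L1 is optionally
# withdrawn at an offset step. Values are placeholders.
def languages_at_step(step: int, l2_onset: int = 50_000, l1_offset: int | None = None):
    """Return which languages the training batch at `step` may be sampled from."""
    langs = []
    if l1_offset is None or step < l1_offset:
        langs.append("L1")
    if step >= l2_onset:
        langs.append("L2")
    return langs

assert languages_at_step(10_000) == ["L1"]
assert languages_at_step(60_000) == ["L1", "L2"]
assert languages_at_step(60_000, l1_offset=55_000) == ["L2"]
```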
arXiv Detail & Related papers (2025-08-28T20:56:16Z)
- When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners [111.50503126693444]
We show that language-specific ablation consistently boosts multilingual reasoning performance. Compared to post-training, our training-free ablation achieves comparable or superior results with minimal computational overhead.
arXiv Detail & Related papers (2025-05-21T08:35:05Z)
- Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases [22.048949559200935]
This study evaluates Large Language Models' ability to simulate non-native-like English use observed in human second language (L2) learners. In dialogue-based interviews, we prompt LLMs to mimic L2 English learners with specific L1s across seven languages. Our analysis examines L1-driven linguistic biases, such as reference word usage and avoidance behaviors, using information-theoretic and distributional density measures.
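As a rough illustration of the kind of information-theoretic comparison mentioned above, one could contrast word-usage distributions in LLM-simulated L2 dialogue against those of human L2 learners; the Jensen-Shannon divergence and the toy inputs below are choices made for this sketch, not the paper's exact measures.

```python
# Sketch: Jensen-Shannon divergence between two unigram distributions, as one
# simple way to compare simulated and human L2 word usage. Inputs are toy data.
from collections import Counter
import math

def js_divergence(tokens_a, tokens_b):
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    vocab = set(ca) | set(cb)
    pa = {w: ca[w] / len(tokens_a) for w in vocab}
    pb = {w: cb[w] / len(tokens_b) for w in vocab}
    m = {w: 0.5 * (pa[w] + pb[w]) for w in vocab}
    kl = lambda p, q: sum(p[w] * math.log2(p[w] / q[w]) for w in vocab if p[w] > 0)
    return 0.5 * kl(pa, m) + 0.5 * kl(pb, m)

# In practice the inputs would be tokenized LLM-simulated and human L2 transcripts.
print(js_divergence("he go to school yesterday".split(),
                    "he went to school yesterday".split()))
```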
arXiv Detail & Related papers (2025-02-20T12:34:46Z)
- Investigating Critical Period Effects in Language Acquisition through Neural Language Models [70.6367059367609]
Second language (L2) acquisition becomes harder after early childhood.
Conversely, ceasing exposure to a first language (L1) after this period (but not before) typically does not lead to substantial loss of L1 proficiency.
It is unknown whether these critical period (CP) effects result from innately determined brain maturation or from a stabilization of neural connections naturally induced by experience.
arXiv Detail & Related papers (2024-07-27T19:17:10Z)
- ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
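A minimal probe of the "exact copying" sub-ability might look like the following; the item format and the `query_model` stub are assumptions for illustration, not the benchmark's actual items.

```python
# Sketch of an exact-copying probe: few-shot demonstrations of copying a random
# string, scored by exact match. `query_model` is a hypothetical LLM call.
import random
import string

def make_copy_item(n_demos: int = 3, length: int = 8, seed: int = 0):
    rng = random.Random(seed)
    rand = lambda: "".join(rng.choices(string.ascii_lowercase + string.digits, k=length))
    demos, query = [rand() for _ in range(n_demos)], rand()
    prompt = "".join(f"Input: {s}\nCopy: {s}\n\n" for s in demos) + f"Input: {query}\nCopy:"
    return prompt, query

def exact_copy_accuracy(items, query_model):
    return sum(query_model(p).strip() == target for p, target in items) / len(items)

items = [make_copy_item(seed=i) for i in range(100)]
# accuracy = exact_copy_accuracy(items, query_model=my_llm)  # my_llm is a placeholder
```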
arXiv Detail & Related papers (2024-06-21T08:06:10Z)
- Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet? [82.02076369811402]
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
We present an extensive and systematic comparison of the three approaches, testing them on 6 high- and low-resource languages, three different NLU tasks, and a myriad of language and domain setups.
Our observations show that supervised instruction tuning has the best trade-off between performance and resource requirements.
arXiv Detail & Related papers (2024-03-04T10:48:13Z)
- Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
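Going by the description above, a LAPE-style score can be sketched as the entropy of a neuron's activation probability across languages, with low entropy flagging language-specific neurons; the normalization and the toy numbers below are illustrative, not the paper's exact procedure.

```python
# Sketch of a LAPE-style score: entropy over a neuron's per-language activation
# probabilities. Real probabilities would come from running the LLM over
# per-language corpora; the numbers below are toy values.
import numpy as np

def lape_scores(act_prob: np.ndarray) -> np.ndarray:
    """act_prob[i, j]: fraction of language-j tokens on which neuron i is active."""
    p = act_prob / act_prob.sum(axis=1, keepdims=True)  # distribution over languages
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)                 # low entropy = language-specific

probs = np.array([[0.60, 0.01, 0.02],    # fires almost only on language 0
                  [0.30, 0.28, 0.31]])   # fires comparably on all three
print(lape_scores(probs))                # the first neuron scores much lower
```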
arXiv Detail & Related papers (2024-02-26T09:36:05Z)
- Self-Augmented In-Context Learning for Unsupervised Word Translation [23.495503962839337]
Large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups.
We propose self-augmented in-context learning (SAIL) for unsupervised BLI.
Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks.
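The self-augmentation loop can be sketched as repeatedly promoting the model's own high-confidence translations into in-context examples; `llm_translate`, the confidence filter, and the round count below are assumptions made for the sketch rather than the paper's exact procedure.

```python
# Sketch of a SAIL-style loop for unsupervised BLI. `llm_translate` is a
# hypothetical stub: it prompts an LLM with the current in-context lexicon and
# returns (translation, confidence) for a source word.
def sail_bli(source_words, llm_translate, n_rounds: int = 3, top_k: int = 200):
    lexicon = []  # (source, target) demonstrations; empty means zero-shot on round 1
    for _ in range(n_rounds):
        scored = []
        for w in source_words:
            target, confidence = llm_translate(w, examples=lexicon)
            scored.append((confidence, w, target))
        scored.sort(reverse=True)
        lexicon = [(w, t) for _, w, t in scored[:top_k]]  # keep high-confidence pairs
    return {w: t for _, w, t in scored}

# usage: bli = sail_bli(source_vocab, llm_translate=my_prompted_llm)  # placeholders
```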
arXiv Detail & Related papers (2024-02-15T15:43:05Z)
- Second Language Acquisition of Neural Language Models [17.356128991925576]
This work sheds light on the second language (L2) acquisition of neural language models (LMs).
We trained bilingual LMs with a scenario similar to human L2 acquisition and analyzed their cross-lingual transfer from linguistic perspectives.
arXiv Detail & Related papers (2023-06-05T14:32:41Z)
- SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with BERT [0.0]
Cross-linguistic transfer is the influence of the linguistic structure of a speaker's native language on the successful acquisition of a foreign language.
We find that the NLP literature has not given enough attention to the phenomenon of negative transfer.
Our findings call for further research using our novel Transformer-based SLA models.
arXiv Detail & Related papers (2023-05-31T06:22:07Z)