Second Language Acquisition of Neural Language Models
- URL: http://arxiv.org/abs/2306.02920v1
- Date: Mon, 5 Jun 2023 14:32:41 GMT
- Title: Second Language Acquisition of Neural Language Models
- Authors: Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe
- Abstract summary: This work sheds light on the second language (L2) acquisition of neural language models (LMs)
We trained bilingual LMs with a scenario similar to human L2 acquisition and analyzed their cross-lingual transfer from linguistic perspectives.
- Score: 17.356128991925576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the success of neural language models (LMs), their language acquisition
has gained much attention. This work sheds light on the second language (L2)
acquisition of LMs, while previous work has typically explored their first
language (L1) acquisition. Specifically, we trained bilingual LMs with a
scenario similar to human L2 acquisition and analyzed their cross-lingual
transfer from linguistic perspectives. Our exploratory experiments demonstrated
that the L1 pretraining accelerated their linguistic generalization in L2, and
language transfer configurations (e.g., the L1 choice, and presence of parallel
texts) substantially affected their generalizations. These clarify their
(non-)human-like L2 acquisition in particular aspects.
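The training scenario described above boils down to two consecutive causal-LM training stages, first on an L1 corpus and then on an L2 corpus. The sketch below is a minimal illustration of that setup, not the authors' code: the GPT-2 tokenizer, the small model configuration, the corpus file names, and all hyperparameters are placeholder assumptions.

```python
# Minimal sketch of the two-stage "L1 then L2" training scenario (not the
# authors' implementation). Tokenizer, model size, corpus paths, and
# hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2Config, GPT2LMHeadModel,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default


def lm_dataset(path):
    """Tokenize a plain-text corpus into fixed-length examples for causal LM training."""
    ds = load_dataset("text", data_files=path)["train"]
    ds = ds.map(
        lambda ex: tokenizer(ex["text"], truncation=True, max_length=128,
                             padding="max_length"),
        batched=True, remove_columns=["text"],
    )
    return ds.map(lambda ex: {"labels": ex["input_ids"]}, batched=True)


def train_stage(model, dataset, out_dir):
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                             per_device_train_batch_size=8)
    Trainer(model=model, args=args, train_dataset=dataset).train()
    return model


# A small randomly initialized causal LM standing in for the bilingual LM.
model = GPT2LMHeadModel(GPT2Config(n_layer=6, n_head=8, n_embd=512))
model = train_stage(model, lm_dataset("l1_corpus.txt"), "stage1_l1")  # L1 pretraining
model = train_stage(model, lm_dataset("l2_corpus.txt"), "stage2_l2")  # L2 exposure
```

Whether the L2 stage replaces or mixes in L1 data, and whether parallel texts are included, correspond to the language-transfer configurations the abstract says were varied.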
Related papers
- Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases [22.048949559200935]
This study evaluates Large Language Models' ability to simulate non-native-like English use observed in human second language (L2) learners.
In dialogue-based interviews, we prompt LLMs to mimic L2 English learners with specific L1s across seven languages.
Our analysis examines L1-driven linguistic biases, such as reference word usage and avoidance behaviors, using information-theoretic and distributional density measures.
arXiv Detail & Related papers (2025-02-20T12:34:46Z) - The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model [59.357993924917]
We study the evolution of multilingual capabilities in large language models (LLMs) during the pre-training process.
We propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities.
We propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs.
arXiv Detail & Related papers (2024-12-10T08:28:57Z) - How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts.
We localize the role of language during recall, finding that subject enrichment is language-independent.
In decoder-only LLMs, FVs compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z) - Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models [11.423589362950812]
Large language models (LLMs) have demonstrated remarkable performance, particularly in multilingual contexts.
Recent studies suggest that LLMs can transfer skills learned in one language to others, but the internal mechanisms behind this ability remain unclear.
This paper provides insights into the internal workings of LLMs, offering a foundation for future improvements in their cross-lingual capabilities.
arXiv Detail & Related papers (2024-10-15T15:49:15Z) - Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhancing the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from the top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons; a toy sketch of this entropy-based scoring idea appears after this list.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with BERT [0.0]
Cross-linguistic transfer is the influence of the linguistic structure of a speaker's native language on the successful acquisition of a foreign language.
We find that NLP literature has not given enough attention to the phenomenon of negative transfer.
Our findings call for further research using our novel Transformer-based SLA models.
arXiv Detail & Related papers (2023-05-31T06:22:07Z) - Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale pretrained language models (LLMs) have shown strong abilities in multilingual translation.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
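Returning to the language-specific neurons entry above: the one-line description of LAPE (language activation probability entropy) suggests an entropy score over per-language neuron activation rates. The sketch below is a toy reading of that description, not the paper's exact procedure; the function name lape_scores and the activation_rates input are assumptions for illustration.

```python
# Toy sketch of an entropy score over per-language neuron activation rates,
# in the spirit of the LAPE summary above (not the paper's exact procedure).
import numpy as np


def lape_scores(activation_rates: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """activation_rates[l, n]: fraction of tokens in language l for which
    neuron n's activation is positive. Returns one entropy per neuron; a low
    value means the neuron fires mostly for a single language."""
    probs = activation_rates / (activation_rates.sum(axis=0, keepdims=True) + eps)
    return -(probs * np.log(probs + eps)).sum(axis=0)


# Three languages, three neurons: neuron 0 fires almost only on language 0.
rates = np.array([[0.90, 0.30, 0.20],
                  [0.05, 0.30, 0.25],
                  [0.05, 0.30, 0.15]])
print(lape_scores(rates).round(3))  # neuron 0 gets the lowest entropy
```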
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.