Second Language Acquisition of Neural Language Models
- URL: http://arxiv.org/abs/2306.02920v1
- Date: Mon, 5 Jun 2023 14:32:41 GMT
- Title: Second Language Acquisition of Neural Language Models
- Authors: Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, Taro Watanabe
- Abstract summary: This work sheds light on the second language (L2) acquisition of neural language models (LMs)
We trained bilingual LMs with a scenario similar to human L2 acquisition and analyzed their cross-lingual transfer from linguistic perspectives.
- Score: 17.356128991925576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the success of neural language models (LMs), their language acquisition
has gained much attention. This work sheds light on the second language (L2)
acquisition of LMs, while previous work has typically explored their first
language (L1) acquisition. Specifically, we trained bilingual LMs with a
scenario similar to human L2 acquisition and analyzed their cross-lingual
transfer from linguistic perspectives. Our exploratory experiments demonstrated
that the L1 pretraining accelerated their linguistic generalization in L2, and
language transfer configurations (e.g., the L1 choice, and presence of parallel
texts) substantially affected their generalizations. These clarify their
(non-)human-like L2 acquisition in particular aspects.
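The training scenario described above boils down to two consecutive causal-LM training stages, first on an L1 corpus and then on an L2 corpus. The sketch below is a minimal illustration of that setup, not the authors' code: the GPT-2 tokenizer, the small model configuration, the corpus file names, and all hyperparameters are placeholder assumptions.

```python
# Minimal sketch of the two-stage "L1 then L2" training scenario (not the
# authors' implementation). Tokenizer, model size, corpus paths, and
# hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2Config, GPT2LMHeadModel,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default


def lm_dataset(path):
    """Tokenize a plain-text corpus into fixed-length examples for causal LM training."""
    ds = load_dataset("text", data_files=path)["train"]
    ds = ds.map(
        lambda ex: tokenizer(ex["text"], truncation=True, max_length=128,
                             padding="max_length"),
        batched=True, remove_columns=["text"],
    )
    return ds.map(lambda ex: {"labels": ex["input_ids"]}, batched=True)


def train_stage(model, dataset, out_dir):
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                             per_device_train_batch_size=8)
    Trainer(model=model, args=args, train_dataset=dataset).train()
    return model


# A small randomly initialized causal LM standing in for the bilingual LM.
model = GPT2LMHeadModel(GPT2Config(n_layer=6, n_head=8, n_embd=512))
model = train_stage(model, lm_dataset("l1_corpus.txt"), "stage1_l1")  # L1 pretraining
model = train_stage(model, lm_dataset("l2_corpus.txt"), "stage2_l2")  # L2 exposure
```

Whether the L2 stage replaces or mixes in L1 data, and whether parallel texts are included, correspond to the language-transfer configurations the abstract says were varied.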
Related papers
- Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases [22.048949559200935]
This study evaluates Large Language Models' ability to simulate non-native-like English use observed in human second language (L2) learners.
In dialogue-based interviews, we prompt LLMs to mimic L2 English learners with specific L1s across seven languages.
Our analysis examines L1-driven linguistic biases, such as reference word usage and avoidance behaviors, using information-theoretic and distributional density measures.
arXiv Detail & Related papers (2025-02-20T12:34:46Z) - The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model [59.357993924917]
We study the evolution of multilingual capabilities in large language models (LLMs) during the pre-training process.
We propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities.
We propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs.
arXiv Detail & Related papers (2024-12-10T08:28:57Z) - How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts.
We localize the role of language during recall, finding that subject enrichment is language-independent.
In decoder-only LLMs, FVs compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z) - Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models [11.423589362950812]
Large language models (LLMs) have demonstrated remarkable performance, particularly in multilingual contexts.
Recent studies suggest that LLMs can transfer skills learned in one language to others, but the internal mechanisms behind this ability remain unclear.
This paper provides insights into the internal workings of LLMs, offering a foundation for future improvements in their cross-lingual capabilities.
arXiv Detail & Related papers (2024-10-15T15:49:15Z) - Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhancing the multilingual capabilities of large language models (LLMs).
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from the top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons; a toy sketch of this entropy-based scoring idea appears after this list.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with BERT [0.0]
Cross-linguistic transfer is the influence of the linguistic structure of a speaker's native language on the successful acquisition of a foreign language.
We find that NLP literature has not given enough attention to the phenomenon of negative transfer.
Our findings call for further research using our novel Transformer-based SLA models.
arXiv Detail & Related papers (2023-05-31T06:22:07Z) - Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale pretrained language models (LLMs) have shown strong abilities in multilingual translation.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
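Returning to the language-specific neurons entry above: the one-line description of LAPE (language activation probability entropy) suggests an entropy score over per-language neuron activation rates. The sketch below is a toy reading of that description, not the paper's exact procedure; the function name lape_scores and the activation_rates input are assumptions for illustration.

```python
# Toy sketch of an entropy score over per-language neuron activation rates,
# in the spirit of the LAPE summary above (not the paper's exact procedure).
import numpy as np


def lape_scores(activation_rates: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """activation_rates[l, n]: fraction of tokens in language l for which
    neuron n's activation is positive. Returns one entropy per neuron; a low
    value means the neuron fires mostly for a single language."""
    probs = activation_rates / (activation_rates.sum(axis=0, keepdims=True) + eps)
    return -(probs * np.log(probs + eps)).sum(axis=0)


# Three languages, three neurons: neuron 0 fires almost only on language 0.
rates = np.array([[0.90, 0.30, 0.20],
                  [0.05, 0.30, 0.25],
                  [0.05, 0.30, 0.15]])
print(lape_scores(rates).round(3))  # neuron 0 gets the lowest entropy
```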
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.