Tracing Multilingual Knowledge Acquisition Dynamics in Domain Adaptation: A Case Study of English-Japanese Biomedical Adaptation
- URL: http://arxiv.org/abs/2510.12115v1
- Date: Tue, 14 Oct 2025 03:34:17 GMT
- Title: Tracing Multilingual Knowledge Acquisition Dynamics in Domain Adaptation: A Case Study of English-Japanese Biomedical Adaptation
- Authors: Xin Zhao, Naoki Yoshinaga, Yuma Tsuta, Akiko Aizawa
- Abstract summary: Multilingual domain adaptation (ML-DA) is widely used to learn new domain knowledge across languages into large language models (LLMs). This work examines the learning dynamics of LLMs during ML-DA. We propose AdaXEval, an adaptive evaluation method that builds multiple-choice QA datasets from the same bilingual domain corpus used for training.
- Score: 25.87847731982824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilingual domain adaptation (ML-DA) is widely used to learn new domain knowledge across languages into large language models (LLMs). Although many methods have been proposed to improve domain adaptation, the mechanisms of multilingual knowledge acquisition, how domain knowledge is learned within a language and transferred across languages, remain underexplored. This gap leads to suboptimal performance, particularly in low-resource settings. This work examines the learning dynamics of LLMs during ML-DA. Because prior ML-DA studies often train and evaluate on datasets with mismatched knowledge coverage, we propose AdaXEval, an adaptive evaluation method that builds multiple-choice QA datasets from the same bilingual domain corpus used for training, thereby directly studying multilingual knowledge acquisition. Through continual training of LLMs with diverse data recipes, we track how LLMs acquire domain facts and pinpoint the mechanism behind the transformation process from domain training data to knowledge. Our experiments on a 13B English-Japanese bilingual LLM reveal that cross-lingual transfer remains challenging despite a high-quality bilingual corpus. The code has been released.
Related papers
- LiveCLKTBench: Towards Reliable Evaluation of Cross-Lingual Knowledge Transfer in Multilingual LLMs [67.09110757873142]
We present LiveCLKTBench, an automated generation pipeline designed to isolate and measure cross-lingual knowledge transfer. Our pipeline identifies self-contained, time-sensitive knowledge entities from real-world domains. The documents of these valid entities are then used to generate factual questions, which are translated into multiple languages.
arXiv Detail & Related papers (2025-11-03T17:06:49Z) - The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model [59.357993924917]
We study the evolution of multilingual capabilities in large language models (LLMs) during the pre-training process. We propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities. We propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs.
arXiv Detail & Related papers (2024-12-10T08:28:57Z) - Multilingual Large Language Models: A Systematic Survey [38.972546467173565]
This paper provides a comprehensive survey of the latest research on multilingual large language models (MLLMs).
We first discuss the architecture and pre-training objectives of MLLMs, highlighting the key components and methodologies that contribute to their multilingual capabilities.
We present a detailed taxonomy and roadmap covering the assessment of MLLMs' cross-lingual knowledge, reasoning, alignment with human values, safety, interpretability and specialized applications.
arXiv Detail & Related papers (2024-11-17T13:21:26Z) - How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts. We localize the role of language during recall, finding that subject enrichment is language-independent. In decoder-only LLMs, function vectors (FVs) compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z) - Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters [3.7273829129985305]
This paper explores the integration of linguistic graph knowledge into multilingual Large Language Models (LLMs). We employ language-specific adapters to improve performance for low-resource languages (LRLs) in sentiment analysis (SA) and named entity recognition (NER). We assess how structured graph knowledge affects the performance of multilingual LLMs for LRLs in SA and NER.
arXiv Detail & Related papers (2024-07-01T15:56:24Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, i.e., be crosslingual? This study evaluates state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.
We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs.
Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z) - X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models [103.75890012041366]
Language models (LMs) have proven surprisingly successful at capturing factual knowledge.
However, studies on LMs' factual representation ability have almost invariably been performed on English.
We create a benchmark of cloze-style probes for 23 typologically diverse languages.
arXiv Detail & Related papers (2020-10-13T05:29:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site. This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.