Can you map it to English? The Role of Cross-Lingual Alignment in Multilingual Performance of LLMs
- URL: http://arxiv.org/abs/2504.09378v2
- Date: Tue, 15 Apr 2025 11:49:34 GMT
- Title: Can you map it to English? The Role of Cross-Lingual Alignment in Multilingual Performance of LLMs
- Authors: Kartik Ravisankar, Hyojung Han, Marine Carpuat
- Abstract summary: Large language models (LLMs) pre-trained predominantly on English text exhibit surprising multilingual capabilities. We introduce cross-lingual alignment metrics to quantify the alignment at an instance level for discriminative tasks. We find that while cross-lingual alignment metrics strongly correlate with task accuracy at the language level, the sample-level alignment often fails to distinguish correct from incorrect predictions.
- Score: 12.334510055293535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) pre-trained predominantly on English text exhibit surprising multilingual capabilities, yet the mechanisms driving cross-lingual generalization remain poorly understood. This work investigates how the alignment of representations for text written in different languages correlates with LLM performance on natural language understanding tasks and translation tasks, both at the language and the instance level. For this purpose, we introduce cross-lingual alignment metrics such as the Discriminative Alignment Index (DALI) to quantify the alignment at an instance level for discriminative tasks. Through experiments on three natural language understanding tasks (Belebele, XStoryCloze, XCOPA), and machine translation, we find that while cross-lingual alignment metrics strongly correlate with task accuracy at the language level, the sample-level alignment often fails to distinguish correct from incorrect predictions, exposing alignment as a necessary but insufficient condition for success.
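The abstract names the Discriminative Alignment Index (DALI) but does not define it. As a rough, hypothetical illustration of what an instance-level alignment measure for a multiple-choice task could look like, the sketch below scores an instance by whether each target-language answer option retrieves its own English counterpart by cosine similarity. This is an illustrative proxy, not the paper's definition.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two representation vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def instance_alignment(en_opts: list[np.ndarray],
                       tgt_opts: list[np.ndarray]) -> float:
    """Fraction of target-language option representations whose nearest
    English option (by cosine) is their own counterpart: a retrieval-style
    proxy for instance-level discriminative alignment, NOT the paper's
    exact DALI definition."""
    hits = 0
    for i, t in enumerate(tgt_opts):
        sims = [cosine(t, e) for e in en_opts]
        hits += int(int(np.argmax(sims)) == i)
    return hits / len(tgt_opts)

# Toy usage: 4 answer options, 16-dim vectors standing in for LLM states.
rng = np.random.default_rng(0)
en = [rng.normal(size=16) for _ in range(4)]
tgt = [v + 0.1 * rng.normal(size=16) for v in en]  # nearly aligned copies
print(instance_alignment(en, tgt))                  # 1.0 when well aligned
```

A language-level score would average this quantity over a dataset; the paper's finding is that the language-level average tracks accuracy while the per-instance value does not separate correct from incorrect predictions.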
Related papers
- The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context [0.9130277390156759]
Alignment tuning has enabled large language models to excel in reasoning, instruction-following, and minimizing harmful generations. Despite their widespread deployment, these models exhibit a monolingual bias, raising concerns about the effectiveness of alignment across languages. Current alignment methods predominantly focus on English, leaving it unclear how alignment mechanisms generalize to multilingual settings.
arXiv Detail & Related papers (2025-04-03T15:46:46Z)
- Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs [20.756383171892608]
Middle layers exhibit the strongest potential for cross-lingual alignment. Experiments on slot filling, machine translation, and structured text generation show consistent improvements in cross-lingual transfer. We show that separately trained alignment modules can be merged with existing task-specific modules, improving cross-lingual capabilities without full retraining.
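The abstract does not give the training objective, so here is only a minimal sketch of what an auxiliary middle-layer alignment loss over parallel sentences might look like; mean pooling and cosine distance are assumed choices, not the paper's actual setup.

```python
import torch
import torch.nn.functional as F

def middle_layer_alignment_loss(hidden_src, hidden_tgt, layer: int):
    """Pull together mean-pooled middle-layer states of parallel
    source/target sentences. hidden_* are per-layer sequences of
    [batch, seq, dim] tensors (e.g. a transformer's hidden-state tuple).
    Pooling and distance are illustrative assumptions."""
    src = hidden_src[layer].mean(dim=1)   # [batch, dim]
    tgt = hidden_tgt[layer].mean(dim=1)   # [batch, dim]
    return (1.0 - F.cosine_similarity(src, tgt, dim=-1)).mean()

# Toy usage: random stand-ins for a 12-layer model's hidden states.
hs_src = [torch.randn(2, 5, 16) for _ in range(13)]
hs_tgt = [torch.randn(2, 7, 16) for _ in range(13)]
print(middle_layer_alignment_loss(hs_src, hs_tgt, layer=6))
```

In practice such a term would be added to the task loss with a weighting coefficient; the paper's exact combination is not specified in the abstract.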
arXiv Detail & Related papers (2025-02-20T18:45:43Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
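The basic mechanics of prompt tuning, trainable soft prompt vectors prepended to a frozen backbone's input embeddings, can be sketched as follows; the alignment-prompt objective itself is not reproduced here, and all names and shapes are illustrative.

```python
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    """Prepends trainable prompt vectors to input embeddings while the
    backbone stays frozen. Only the generic prompt-tuning mechanism is
    shown, not the paper's alignment training."""
    def __init__(self, embed_dim: int, prompt_len: int = 16):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: [batch, seq, dim] from the frozen embedding table
        batch = token_embeds.size(0)
        prompts = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompts, token_embeds], dim=1)

# Usage: only wrapper.prompt receives gradients; the LM stays frozen.
wrapper = SoftPromptWrapper(embed_dim=768)
x = torch.randn(4, 32, 768)     # stand-in for embedded tokens
print(wrapper(x).shape)         # torch.Size([4, 48, 768])
```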
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretraining and finetuning stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
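The abstract does not spell out how code-switched training pairs are built; one common recipe, assumed here purely for illustration, replaces source words with bilingual-dictionary translations and trains the model to restore the original sentence.

```python
import random

def code_switch(tokens: list[str], lexicon: dict[str, str],
                p: float = 0.3, seed: int = 0) -> list[str]:
    """Replace tokens with bilingual-dictionary translations with
    probability p. Training pairs (code_switched -> original) give a
    'restore' objective; the lexicon here is a toy placeholder."""
    rng = random.Random(seed)
    return [lexicon[t] if t in lexicon and rng.random() < p else t
            for t in tokens]

lexicon = {"cat": "chat", "sat": "assis", "mat": "tapis"}  # toy en->fr
src = "the cat sat on the mat".split()
noisy = code_switch(src, lexicon, p=0.5)
print(" ".join(noisy), "->", " ".join(src))  # model learns to restore
```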
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval [51.60862829942932]
We present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved.
However, peak performance is not achieved using general-purpose multilingual text encoders "off-the-shelf", but rather by relying on their variants that have been further specialized for sentence understanding tasks.
arXiv Detail & Related papers (2021-01-21T00:15:38Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids the degenerate case of predicting masked words conditioned only on context in the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
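The abstract only says a cross-attention module is plugged into the encoder. A schematic, heavily hedged reading of that idea in PyTorch follows; the dimensions, norm placement, and head count are all assumptions, not VECO's actual architecture.

```python
import torch
import torch.nn as nn

class CrossLingualBlock(nn.Module):
    """Encoder block with an extra cross-attention over the parallel
    sentence in the other language, so masked-word prediction can
    condition on both languages. Schematic only."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
        x = self.norm1(x + self.self_attn(x, x, x)[0])
        # cross-attend from this language's tokens to the parallel sentence
        return self.norm2(x + self.cross_attn(x, other, other)[0])

block = CrossLingualBlock()
en, fr = torch.randn(2, 10, 256), torch.randn(2, 12, 256)
print(block(en, fr).shape)  # torch.Size([2, 10, 256])
```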
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language.
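As a generic rendering of the KL-divergence self-teaching loss named above (the pseudo-label generation procedure is not shown, and the shapes are assumed):

```python
import torch
import torch.nn.functional as F

def self_teaching_kl(student_logits: torch.Tensor,
                     soft_pseudo_labels: torch.Tensor) -> torch.Tensor:
    """KL(pseudo || student) between auto-generated soft labels for the
    translated target-language text and the model's own predictions on
    it. A generic rendering, not FILTER's exact formulation."""
    log_p = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_p, soft_pseudo_labels, reduction="batchmean")

logits = torch.randn(8, 3)                     # target-language predictions
pseudo = F.softmax(torch.randn(8, 3), dim=-1)  # teacher's soft labels
print(self_teaching_kl(logits, pseudo))
```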
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
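As a concrete, purely hypothetical baseline for the embedding-based word alignment mentioned above, the simplest scheme aligns each source token to the target token with the most similar contextual embedding:

```python
import numpy as np

def greedy_word_alignment(src_vecs: np.ndarray,
                          tgt_vecs: np.ndarray) -> list[tuple[int, int]]:
    """Align each source token to the target token with the most similar
    contextual embedding. The simplest similarity-based scheme, shown only
    to make the task concrete; not the paper's alignment method."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T  # [src_len, tgt_len] cosine similarity matrix
    return [(i, int(j)) for i, j in enumerate(sims.argmax(axis=1))]

src = np.random.randn(5, 32)  # stand-ins for contextual embeddings
tgt = np.random.randn(6, 32)
print(greedy_word_alignment(src, tgt))
```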
arXiv Detail & Related papers (2020-04-09T19:50:32Z)