Related papers: Decomposed Prompting: Probing Multilingual Linguistic Structure Knowledge in Large Language Models

Related papers

PART: Progressive Alignment Representation Training for Multilingual Speech-To-Text with LLMs [58.2469845374385]
We introduce Progressive Alignment Representation Training (PART)<n>PART is a multi-stage and multi-task framework that separates within-language from cross-language alignment.<n>Experiments on CommonVoice 15, Fleurs, Wenetspeech, and CoVoST2 show that PART surpasses conventional approaches.
arXiv Detail & Related papers (2025-09-24T03:54:14Z)
Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback [11.889826908536941]
We present the first large-scale simulation of multilingual tutor-student interactions using large language models (LLMs)<n>A stronger model plays the role of the tutor, generating feedback in the form of hints, while a weaker model simulates the student.<n>Our study examines how student input language, teacher feedback language, model choice, and language resource level jointly influence performance.
arXiv Detail & Related papers (2025-06-05T11:53:04Z)
Linguistic Blind Spots of Large Language Models [14.755831733659699]
We study the performance of recent large language models (LLMs) on linguistic annotation tasks.<n>We find that recent LLMs show limited efficacy in addressing linguistic queries and often struggle with linguistically complex inputs.<n>Our results provide insights to inform future advancements in LLM design and development.
arXiv Detail & Related papers (2025-03-25T01:47:13Z)
PolyPrompt: Automating Knowledge Extraction from Multilingual Language Models with Dynamic Prompt Generation [0.0]
We introduce PolyPrompt, a novel, parameter-efficient framework for enhancing the multilingual capabilities of large language models (LLMs)<n>Our method learns a set of trigger tokens for each language through a gradient-based search, identifying the input query's language and selecting the corresponding trigger tokens which are prepended to the prompt during inference.<n>We perform experiments on two 1 billion parameter models, with evaluations on the global MMLU benchmark across fifteen typologically and resource diverse languages, demonstrating accuracy gains of 3.7%-19.9% compared to naive and translation-pipeline baselines.
arXiv Detail & Related papers (2025-02-27T04:41:22Z)
Prompt and circumstance: A word-by-word LLM prompting approach to interlinear glossing for low-resource languages [6.4977738682502295]
We investigate the effectiveness of a retrieval-based LLM prompting approach to glossing, applied to the seven languages from the SIGMORPHON 2023 shared task.<n>Our system beats the BERT-based shared task baseline for every language in the morpheme-level score category.<n>In a case study on Tsez, we ask the LLM to automatically create and follow linguistic instructions, reducing errors on a confusing grammatical feature.
arXiv Detail & Related papers (2025-02-13T21:23:16Z)
Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs [13.458891794688551]
We evaluate pre-translation strategies across 35 languages covering both low and high-resource languages. Our experiments show the impact of factors as similarity to English, translation quality and the size of pre-trained data, on the model performance with pre-translation.
arXiv Detail & Related papers (2025-02-13T13:49:30Z)
How does a Multilingual LM Handle Multiple Languages? [0.0]
This study critically examines capabilities in multilingual understanding, semantic representation, and cross-lingual knowledge transfer. It assesses semantic similarity by analyzing multilingual word embeddings for consistency using cosine similarity. It examines BLOOM-1.7B and Qwen2 through Named Entity Recognition and sentence similarity tasks to understand their linguistic structures.
arXiv Detail & Related papers (2025-02-06T18:08:14Z)
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model [66.17354128553244]
Most Large Vision-Language Models (LVLMs) to date are trained predominantly on English data. We investigate how different training mixes tip the scale for different groups of languages. We train Centurio, a 100-language LVLM, offering state-of-the-art performance in an evaluation covering 14 tasks and 56 languages.
arXiv Detail & Related papers (2025-01-09T10:26:14Z)
Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models [52.00446751692225]
We present a novel and simple yet effective method called textbfDictionary textbfInsertion textbfPrompting (textbfDIP) When providing a non-English prompt, DIP looks up a word dictionary and inserts words' English counterparts into the prompt for LLMs. It then enables better translation into English and better English model thinking steps which leads to obviously better results.
arXiv Detail & Related papers (2024-11-02T05:10:50Z)
Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs [13.558778781305998]
Large Language Models (LLMs) are predominantly designed with English as the primary language. Even the few that are multilingual tend to exhibit strong English-centric biases. This paper introduces novel automatic corpus-level metrics to assess the lexical and syntactic naturalness of multilingual outputs.
arXiv Detail & Related papers (2024-10-21T12:34:17Z)
How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts. We localize the role of language during recall, finding that subject enrichment is language-independent. In decoder-only LLMs, FVs compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z)
Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance multilingual capabilities of large language models (LLMs) It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs. It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model [14.39119862985503]
We aim to create a multilingual ALT system with available datasets. Inspired by architectures that have been proven effective for English ALT, we adapt these techniques to the multilingual scenario. We evaluate the performance of the multilingual model in comparison to its monolingual counterparts.
arXiv Detail & Related papers (2024-06-25T15:02:32Z)
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, effectively being crosslingual? This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models [79.46179534911019]
Large language models (LLMs) have demonstrated multilingual capabilities; yet, they are mostly English-centric due to imbalanced training corpora. This work extends the evaluation from NLP tasks to real user queries. For culture-related tasks that need deep language understanding, prompting in the native language tends to be more promising.
arXiv Detail & Related papers (2024-03-15T12:47:39Z)
Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations [38.56175462620892]
Large pretrained multilingual language models (ML-LMs) have shown remarkable capabilities of zero-shot cross-lingual transfer. We present a novel view of projecting away language-specific factors from a multilingual embedding space. We show that applying our method consistently leads to improvements over commonly used ML-LMs.
arXiv Detail & Related papers (2024-01-11T09:54:11Z)
How Vocabulary Sharing Facilitates Multilingualism in LLaMA? [19.136382859468693]
Large Language Models (LLMs) often show strong performance on English tasks, while exhibiting limitations on other languages. This study endeavors to examine the multilingual capability of LLMs from the vocabulary sharing perspective.
arXiv Detail & Related papers (2023-11-15T16:13:14Z)
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars. We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English. Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs [16.770697902481107]
Large Language Models (LLMs) have demonstrated exceptional natural language understanding abilities. We propose a systematic way of qualifying the performance disparities of LLMs under multilingual settings. The results show that GPT exhibits highly translating-like behaviour in multilingual settings.
arXiv Detail & Related papers (2023-05-24T02:05:03Z)
Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks. We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT) We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z)
Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks. We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset. To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks. We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking. We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs. We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models. We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt [98.26682501616024]
We propose a novel model that uses a unified prompt for all languages, called UniPrompt. The unified prompt is computation by a multilingual PLM to produce language-independent representation. Our proposed methods can significantly outperform the strong baselines across different languages.
arXiv Detail & Related papers (2022-02-23T11:57:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.