Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback
- URL: http://arxiv.org/abs/2506.04920v1
- Date: Thu, 05 Jun 2025 11:53:04 GMT
- Title: Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback
- Authors: Junior Cedric Tonga, KV Aditya Srivatsa, Kaushal Kumar Maurya, Fajri Koto, Ekaterina Kochmar
- Abstract summary: We present the first large-scale simulation of multilingual tutor-student interactions using large language models (LLMs). A stronger model plays the role of the tutor, generating feedback in the form of hints, while a weaker model simulates the student. Our study examines how student input language, teacher feedback language, model choice, and language resource level jointly influence performance.
- Score: 11.889826908536941
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have demonstrated the ability to generate formative feedback and instructional hints in English, making them increasingly relevant for AI-assisted education. However, their ability to provide effective instructional support across different languages, especially for mathematically grounded reasoning tasks, remains largely unexamined. In this work, we present the first large-scale simulation of multilingual tutor-student interactions using LLMs. A stronger model plays the role of the tutor, generating feedback in the form of hints, while a weaker model simulates the student. We explore 352 experimental settings across 11 typologically diverse languages, four state-of-the-art LLMs, and multiple prompting strategies to assess whether language-specific feedback leads to measurable learning gains. Our study examines how student input language, teacher feedback language, model choice, and language resource level jointly influence performance. Results show that multilingual hints can significantly improve learning outcomes, particularly in low-resource languages when feedback is aligned with the student's native language. These findings offer practical insights for developing multilingual, LLM-based educational tools that are both effective and inclusive.
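As a concrete illustration of the simulation the abstract describes, below is a minimal sketch of a single tutor-student round. The `call_llm` helper, the model and language names, and the one-hint protocol are illustrative assumptions rather than the paper's actual implementation; the paper varies such settings across 11 languages, four LLMs, and multiple prompting strategies (352 configurations in total).

```python
# Minimal sketch of one tutor-student round as described in the abstract.
# `call_llm` is a hypothetical stand-in for any chat-completion client;
# model names, prompts, and the single-hint protocol are illustrative
# guesses, not the paper's actual setup.
from typing import Callable

def simulate_tutoring(
    call_llm: Callable[[str, str], str],  # (model, prompt) -> completion
    problem: str,
    student_model: str = "weak-model",    # hypothetical student LLM
    tutor_model: str = "strong-model",    # hypothetical tutor LLM
    student_lang: str = "Swahili",        # language of the student's input
    feedback_lang: str = "Swahili",       # language of the tutor's hint
) -> dict:
    """One round of hint-based feedback on a math problem."""
    # 1. The weaker model attempts the problem in the student's language.
    first_try = call_llm(
        student_model,
        f"Solve this math problem in {student_lang}:\n{problem}",
    )
    # 2. The stronger model sees the attempt and replies with a hint
    #    (not a full solution) in the chosen feedback language.
    hint = call_llm(
        tutor_model,
        f"A student answered:\n{first_try}\n"
        f"Problem:\n{problem}\n"
        f"Give a short corrective hint in {feedback_lang}. "
        f"Do not reveal the final answer.",
    )
    # 3. The student retries with the hint; comparing the two attempts
    #    against the gold answer gives the learning gain for this setting.
    second_try = call_llm(
        student_model,
        f"Problem:\n{problem}\nHint: {hint}\nTry again in {student_lang}.",
    )
    return {"first": first_try, "hint": hint, "second": second_try}
```

Under this reading, the learning gain for a configuration would be the accuracy difference between the second and first attempts, aggregated over problems.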
Related papers
- The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model [59.357993924917]
We study the evolution of multilingual capabilities in large language models (LLMs) during the pre-training process. We propose the Babel Tower Hypothesis, which describes the entire process of LLMs acquiring new language capabilities. We also propose a novel method to construct an optimized pre-training corpus for multilingual code LLMs.
arXiv Detail & Related papers (2024-12-10T08:28:57Z)
- Improving Bilingual Capabilities of Language Models to Support Diverse Linguistic Practices in Education [3.799331337558008]
Large language models (LLMs) offer promise in generating educational content, providing instructor feedback, and reducing teacher workload on assessments.
We study the effectiveness of multilingual large language models (MLLMs) across monolingual (English-only, Spanish-only) and bilingual (Spanglish) student writing.
arXiv Detail & Related papers (2024-11-06T23:16:25Z)
- Teaching LLMs to Abstain across Languages via Multilingual Feedback [40.84205285309612]
We show that multilingual feedback helps identify knowledge gaps across diverse languages, cultures, and communities. Extensive experiments demonstrate that our multilingual feedback approach outperforms various strong baselines. Further analysis reveals that multilingual feedback is both an effective and a more equitable abstention strategy for serving diverse language speakers (see the sketch after this entry).
arXiv Detail & Related papers (2024-06-22T21:59:12Z)
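As a rough illustration of the abstention idea summarized above, here is a sketch in which the model's answer is critiqued in several languages and a majority of negative verdicts triggers abstention. The feedback languages, prompt wording, and majority-vote rule are assumptions for illustration; `call_llm(model, prompt) -> str` is a hypothetical helper, not the paper's implementation.

```python
# Sketch of abstention guided by multilingual feedback, per the entry
# above. Languages, prompts, and the majority-vote rule are illustrative
# assumptions, not the paper's exact procedure.
from typing import Callable

FEEDBACK_LANGS = ["Spanish", "Swahili", "Hindi"]  # illustrative choice

def answer_or_abstain(call_llm: Callable[[str, str], str],
                      model: str, question: str) -> str:
    proposed = call_llm(model, f"Answer concisely: {question}")
    # Ask for feedback on the proposed answer in several languages:
    # gaps invisible in English feedback may surface in others.
    doubts = 0
    for lang in FEEDBACK_LANGS:
        verdict = call_llm(
            model,
            f"In {lang}, assess whether this answer is reliable. "
            f"End your reply with the single word YES or NO.\n"
            f"Q: {question}\nA: {proposed}",
        )
        if verdict.strip().upper().endswith("NO"):
            doubts += 1
    # Abstain when most of the multilingual feedback flags the answer.
    if doubts > len(FEEDBACK_LANGS) // 2:
        return "I don't know."
    return proposed
```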
- Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners [67.85635044939836]
Large Language Models (LLMs) have shown impressive language capabilities.
In this work, we investigate the spontaneous multilingual alignment improvement of LLMs.
We find that instruction-tuning LLMs on question translation data (i.e., without annotated answers) encourages alignment between English and a wide range of languages.
arXiv Detail & Related papers (2024-05-22T16:46:19Z)
- Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models [79.46179534911019]
Large language models (LLMs) have demonstrated multilingual capabilities, yet they are mostly English-centric due to imbalanced training corpora. We extend the evaluation to real-world user queries and non-English-centric LLMs, offering a broader examination of multilingual performance.
arXiv Detail & Related papers (2024-03-15T12:47:39Z)
- How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism.
The study finds that LLMs initially understand the query, converting multilingual inputs into English for task-solving; in intermediate layers, they think in English while incorporating multilingual knowledge through self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z)
- Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models [12.700783525558721]
English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks.
This paper introduces a decomposed prompting approach to probe these LLMs' understanding of linguistic structure through sequence labeling tasks (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-28T15:15:39Z)
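The decomposed prompting idea above can be illustrated with a minimal sketch that splits sentence-level sequence labeling into one query per word. The prompt wording, the POS-tagging task, and the `call_llm(model, prompt) -> str` helper are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch of decomposed prompting for sequence labeling: the
# sentence-level task is decomposed into one prompt per word. Prompt
# wording and the tag set are illustrative assumptions.
from typing import Callable

def decomposed_pos_tagging(call_llm: Callable[[str, str], str],
                           model: str,
                           sentence: str) -> list[tuple[str, str]]:
    tags = []
    for word in sentence.split():
        # One focused query per token, with the full sentence as context.
        tag = call_llm(
            model,
            f"Sentence: {sentence}\n"
            f"What is the part-of-speech tag of the word '{word}'? "
            f"Answer with a single Universal POS tag (e.g. NOUN, VERB).",
        )
        tags.append((word, tag.strip()))
    return tags
```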
- Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages (see the sketch after this entry).
arXiv Detail & Related papers (2023-12-26T18:38:54Z)
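A hedged sketch of the cross-lingual reranking setup above: an English query, candidate passages in another language, and an LLM judging relevance. The pointwise yes/no scoring is one common recipe, assumed here for illustration; the paper's own prompting scheme may differ, and `call_llm` is again a hypothetical client.

```python
# Sketch of zero-shot cross-lingual reranking: an English query,
# candidate passages in a different (e.g. African) language, and an
# LLM asked for a relevance verdict. Pointwise scoring is an assumed
# simplification, not necessarily the paper's method.
from typing import Callable

def rerank(call_llm: Callable[[str, str], str], model: str,
           query_en: str, passages: list[str]) -> list[str]:
    def relevance(passage: str) -> int:
        verdict = call_llm(
            model,
            f"Query (English): {query_en}\n"
            f"Passage (may be in another language): {passage}\n"
            f"Is the passage relevant to the query? Answer YES or NO.",
        )
        return 1 if verdict.strip().upper().startswith("YES") else 0
    # Stable sort: relevant passages first, with the original retrieval
    # order preserved within each group.
    return sorted(passages, key=relevance, reverse=True)
```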
- Spoken Language Intelligence of Large Language Models for Language Learning [3.1964044595140217]
We focus on evaluating the efficacy of large language models (LLMs) in the realm of education. We introduce a new multiple-choice question dataset to evaluate the effectiveness of LLMs in these scenarios, and we investigate the influence of various prompting techniques such as zero- and few-shot methods. We find that models of different sizes have a good understanding of concepts in phonetics, phonology, and second language acquisition, but show limitations in reasoning about real-world problems.
arXiv Detail & Related papers (2023-08-28T12:47:41Z)
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback [61.83548032416181]
We present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages.
Okapi introduces instruction and response-ranking data in 26 diverse languages to support experiments and the development of future multilingual LLM research.
arXiv Detail & Related papers (2023-07-29T18:01:46Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing a few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translation between English and 13 Indic and 21 African low-resource languages (see the sketch after this entry).
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
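The linguistically-diverse prompting idea above lends itself to a short sketch: few-shot exemplars drawn from several high-resource languages demonstrate the text-to-English mapping, so that the LLM can translate from a language absent from the exemplars. The exemplar pairs and prompt format are illustrative placeholders, not the paper's actual data; `call_llm(model, prompt) -> str` remains a hypothetical helper.

```python
# Sketch of linguistically-diverse prompting for translation into
# English. No exemplar is in the source language; the prompt only
# demonstrates the text -> English mapping across several languages.
from typing import Callable

# Synthetic exemplars from diverse high-resource languages (placeholders).
EXEMPLARS = [
    ("Bonjour le monde", "Hello world"),  # French
    ("Hola mundo", "Hello world"),        # Spanish
    ("Hallo Welt", "Hello world"),        # German
]

def translate_to_english(call_llm: Callable[[str, str], str],
                         model: str, source_text: str) -> str:
    shots = "\n".join(f"Text: {src}\nEnglish: {tgt}"
                      for src, tgt in EXEMPLARS)
    prompt = f"{shots}\nText: {source_text}\nEnglish:"
    return call_llm(model, prompt)
```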
- Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs [16.770697902481107]
Large Language Models (LLMs) have demonstrated exceptional natural language understanding abilities.
We propose a systematic way of qualifying the performance disparities of LLMs under multilingual settings.
The results show that GPT exhibits highly translation-like behaviour in multilingual settings.
arXiv Detail & Related papers (2023-05-24T02:05:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.