Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark
- URL: http://arxiv.org/abs/2602.16811v1
- Date: Wed, 18 Feb 2026 19:15:30 GMT
- Title: Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark
- Authors: Charalampos Mastrokostas, Nikolaos Giarelis, Nikos Karacapilidis,
- Abstract summary: Recent advancements in Natural Language Processing and Deep Learning have enabled the development of Large Language Models (LLMs), which have advanced the state-of-the-art across a wide range of tasks, including Question Answering (QA).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Natural Language Processing and Deep Learning have enabled the development of Large Language Models (LLMs), which have significantly advanced the state-of-the-art across a wide range of tasks, including Question Answering (QA). Despite these advancements, research on LLMs has primarily targeted high-resourced languages (e.g., English), and only recently has attention shifted toward multilingual models. However, these models demonstrate a training data bias towards a small number of popular languages or rely on transfer learning from high- to under-resourced languages; this may lead to a misrepresentation of social, cultural, and historical aspects. To address this challenge, monolingual LLMs have been developed for under-resourced languages; however, their effectiveness remains less studied when compared to multilingual counterparts on language-specific tasks. In this study, we address this research gap in Greek QA by contributing: (i) DemosQA, a novel dataset, which is constructed using social media user questions and community-reviewed answers to better capture the Greek social and cultural zeitgeist; (ii) a memory-efficient LLM evaluation framework adaptable to diverse QA datasets and languages; and (iii) an extensive evaluation of 11 monolingual and multilingual LLMs on 6 human-curated Greek QA datasets using 3 different prompting strategies. We release our code and data to facilitate reproducibility.
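The abstract describes the memory-efficient evaluation framework and prompting strategies only at a high level. Below is a minimal sketch of what such a QA evaluation loop could look like, assuming a Hugging Face transformers stack with 4-bit quantization and a zero-shot prompt; the model identifier, prompt wording, and decoding settings are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch of a memory-efficient Greek QA evaluation loop.
# The model choice, prompt wording, and decoding settings are assumptions,
# not the configuration used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "ilsp/Meltemi-7B-Instruct-v1.5"  # hypothetical pick of a Greek-capable LLM

# 4-bit quantization keeps the memory footprint low enough for a single GPU.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=quant_config, device_map="auto"
)

def answer(question: str) -> str:
    """Generate an answer for a Greek question under a zero-shot prompt."""
    # "Answer the question briefly." / "Question:" / "Answer:" in Greek.
    prompt = f"Απάντησε σύντομα στην ερώτηση.\nΕρώτηση: {question}\nΑπάντηση:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# Usage: loop over a QA dataset, compare predictions against reference answers,
# and swap the prompt template for other prompting strategies as needed.
print(answer("Ποια είναι η πρωτεύουσα της Ελλάδας;"))  # "What is the capital of Greece?"
```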
Related papers
- Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models [1.175067374181304]
Code-switching, the alternation of languages and scripts within a single utterance, remains a fundamental challenge for multilingual NLP.
Most large language models (LLMs) struggle with mixed-language inputs, limited CSW datasets, and evaluation biases.
This survey provides the first comprehensive analysis of CSW-aware LLM research.
arXiv Detail & Related papers (2025-10-08T14:04:14Z)
- SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs [12.60449414234283]
SpokenNativQA is the first multilingual and culturally aligned spoken question-answering dataset.
The dataset comprises approximately 33,000 naturally spoken questions and answers in multiple languages.
arXiv Detail & Related papers (2025-05-25T14:22:18Z)
- Evaluating Large Language Model with Knowledge Oriented Language Specific Simple Question Answering [73.73820209993515]
We introduce KoLasSimpleQA, the first benchmark evaluating the multilingual factual ability of Large Language Models (LLMs).
Inspired by existing research, we created the question set with features such as single knowledge point coverage, absolute objectivity, unique answers, and temporal stability.
Results show significant performance differences between the two domains.
arXiv Detail & Related papers (2025-05-22T12:27:02Z)
- A Survey of Multilingual Reasoning in Language Models [30.140967158580892]
This survey provides the first in-depth review of multilingual reasoning in language models.
We provide an overview of the standard data resources used for training multilingual reasoning in LMs.
We analyze various state-of-the-art methods and their performance on these benchmarks.
arXiv Detail & Related papers (2025-02-13T16:25:16Z)
- How Do Multilingual Language Models Remember Facts? [50.13632788453612]
We show that previously identified recall mechanisms in English largely apply to multilingual contexts.
We localize the role of language during recall, finding that subject enrichment is language-independent.
In decoder-only LLMs, FVs compose these two pieces of information in two separate stages.
arXiv Detail & Related papers (2024-10-18T11:39:34Z)
- A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers [51.8203871494146]
The rapid development of Large Language Models (LLMs) demonstrates remarkable multilingual capabilities in natural language processing.
Despite the breakthroughs of LLMs, the investigation into the multilingual scenario remains insufficient.
This survey aims to help the research community address multilingual problems and provide a comprehensive understanding of the core concepts, key techniques, and latest developments in multilingual natural language processing based on LLMs.
arXiv Detail & Related papers (2024-05-17T17:47:39Z)
- Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language [7.289015788793582]
This work focuses on increasing technological participation for the Sámi language.
We draw the attention of the ML community towards the language modeling problem of Ultra Low Resource (ULR) languages.
We have compiled the available Sámi language resources from the web to create a clean dataset for training language models.
arXiv Detail & Related papers (2024-05-09T13:54:22Z)
- Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers [81.47046536073682]
We present a review and provide a unified perspective to summarize the recent progress as well as emerging trends in multilingual large language models (MLLMs) literature.
We hope our work can provide the community with quick access and spur breakthrough research in MLLMs.
arXiv Detail & Related papers (2024-04-07T11:52:44Z)
- Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages [51.301942056881146]
We investigate how large language models (LLMs) function as rerankers in cross-lingual information retrieval systems for African languages.
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba).
We examine cross-lingual reranking with queries in English and passages in the African languages.
arXiv Detail & Related papers (2023-12-26T18:38:54Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into the training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training (see the schedule sketch after this list).
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
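PolyLM's abstract describes its curriculum only as a shift from 30% to 60% non-English data between pre-training stages. Below is a minimal sketch of such a staged data-mixing schedule, assuming a simple two-stage step function and a toy batch sampler; the stage boundary and batch construction are illustrative assumptions, not PolyLM's actual recipe.

```python
import random

def non_english_proportion(step: int, total_steps: int,
                           stage_boundary: float = 0.5) -> float:
    """Two-stage curriculum: 30% non-English data early, 60% later.

    PolyLM reports 30% in the first stage and 60% in the final stage; where
    the boundary falls (here, halfway through training) is an assumption.
    """
    return 0.30 if step < stage_boundary * total_steps else 0.60

def sample_batch(english_pool: list, non_english_pool: list,
                 step: int, total_steps: int, batch_size: int = 8) -> list:
    """Draw a pre-training batch whose language mix follows the schedule."""
    p = non_english_proportion(step, total_steps)
    n_non_english = round(batch_size * p)
    batch = random.sample(non_english_pool, n_non_english)
    batch += random.sample(english_pool, batch_size - n_non_english)
    random.shuffle(batch)
    return batch
```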
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.