It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
- URL: http://arxiv.org/abs/2601.08500v1
- Date: Tue, 13 Jan 2026 12:36:38 GMT
- Title: It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
- Authors: Cristian Santini, Marieke Van Erp, Mehwish Alam
- Abstract summary: MHEL-LLaMo is an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. We evaluate MHEL-LLaMo on four established benchmarks in six European languages. Results demonstrate that MHEL-LLaMo outperforms state-of-the-art models without requiring fine-tuning.
- Score: 1.6407393639625105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs, and evolving semantic conventions. Existing solutions either require substantial training data or rely on domain-specific rules that limit scalability. In this paper, we present MHEL-LLaMo (Multilingual Historical Entity Linking with Large Language MOdels), an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. MHEL-LLaMo leverages a multilingual bi-encoder (BELA) for candidate retrieval and an instruction-tuned LLM for NIL prediction and candidate selection via prompt chaining. Our system uses the SLM's confidence scores to discriminate between easy and hard samples, applying the LLM only to hard cases. This strategy reduces computational costs while preventing hallucinations on straightforward cases. We evaluate MHEL-LLaMo on four established benchmarks in six European languages (English, Finnish, French, German, Italian and Swedish) from the 19th and 20th centuries. Results demonstrate that MHEL-LLaMo outperforms state-of-the-art models without requiring fine-tuning, offering a scalable solution for low-resource historical EL. The implementation of MHEL-LLaMo is available on GitHub.
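To make the abstract's pipeline concrete, below is a minimal sketch of the confidence-gated routing between the retriever and the LLM. The Candidate dataclass, the link_mention function, the retrieve_fn/llm_fn callables, the prompt wording, and the 0.85 threshold are illustrative assumptions standing in for the BELA bi-encoder, the instruction-tuned LLM, and the paper's prompt chain; they are not taken from the authors' implementation.

```python
# Minimal sketch of confidence-gated SLM/LLM routing for entity linking.
# All names, prompts, and the threshold are illustrative, not the authors' code.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    qid: str      # knowledge-base identifier (e.g. a Wikidata QID)
    label: str    # entity label shown to the LLM
    score: float  # bi-encoder similarity; higher means more confident


def link_mention(
    mention: str,
    context: str,
    retrieve_fn: Callable[[str, str], List[Candidate]],  # BELA-style bi-encoder stand-in
    llm_fn: Callable[[str], str],                         # instruction-tuned LLM stand-in
    threshold: float = 0.85,                              # assumed confidence cut-off
) -> Optional[str]:
    """Return a knowledge-base identifier, or None for NIL mentions."""
    candidates = retrieve_fn(mention, context)
    if not candidates:
        return None  # nothing retrieved: treat as NIL

    top = max(candidates, key=lambda c: c.score)
    # Easy case: the retriever is confident, so accept its top candidate
    # directly and skip the LLM (cheaper, no hallucination risk).
    if top.score >= threshold:
        return top.qid

    # Hard case, step 1 of the prompt chain: NIL prediction.
    cand_list = "\n".join(f"- {c.qid}: {c.label}" for c in candidates)
    nil_prompt = (
        f"Mention: {mention}\nContext: {context}\nCandidates:\n{cand_list}\n"
        "Does the mention refer to one of the candidates? Answer yes or no."
    )
    if llm_fn(nil_prompt).strip().lower().startswith("no"):
        return None

    # Hard case, step 2 of the prompt chain: candidate selection.
    select_prompt = (
        f"Mention: {mention}\nContext: {context}\nCandidates:\n{cand_list}\n"
        "Answer with the identifier of the single best-matching candidate."
    )
    answer = llm_fn(select_prompt).strip()
    return answer if answer in {c.qid for c in candidates} else top.qid
```

The gate is what keeps the approach cheap: the LLM is invoked only when the retriever's top score falls below the threshold, and on confident cases the retriever's answer is returned unchanged.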
Related papers
- MiLorE-SSL: Scaling Multilingual Capabilities in Self-Supervised Models without Forgetting [69.6938830307759]
MiLorE-SSL is a lightweight framework that combines LoRA modules with a soft mixture-of-experts mechanism for efficient continual multilingual training. LoRA provides efficient low-rank adaptation, while soft MoE promotes flexible expert sharing across languages, reducing cross-lingual interference. Experiments on ML-SUPERB demonstrate that MiLorE-SSL achieves strong performance in new languages and improves the ability in existing ones with only 2.14% trainable parameters.
arXiv Detail & Related papers (2026-01-28T06:48:52Z) - Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data [18.87770758217633]
Lemmatization is the task of transforming all words in a given text to their dictionary forms. There is no prior evidence of how effective large language models are in the contextual lemmatization task. This paper empirically investigates the capacity of the latest generation of LLMs to perform in-context lemmatization.
arXiv Detail & Related papers (2025-10-08T18:34:00Z) - Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models [1.175067374181304]
Code-switching, the alternation of languages and scripts within a single utterance, remains a fundamental challenge for multilingual NLP. Most large language models (LLMs) struggle with mixed-language inputs, limited CSW datasets, and evaluation biases. This survey provides the first comprehensive analysis of CSW-aware LLM research.
arXiv Detail & Related papers (2025-10-08T14:04:14Z) - Do LLMs exhibit the same commonsense capabilities across languages? [4.177608674029413]
We introduce MULTICOM, a novel benchmark that extends the COCOTEROS dataset to four languages: English, Spanish, Dutch, and Valencian. The task involves generating a commonsensical sentence that includes a given triplet of words. Results consistently show superior performance in English, with significantly lower performance in less-resourced languages.
arXiv Detail & Related papers (2025-09-08T07:47:00Z) - Few-Shot Multilingual Open-Domain QA from 5 Examples [44.04243892727856]
We introduce a few-shot learning approach to synthesise large-scale multilingual data from large language models (LLMs). Our method begins with large-scale self-supervised pre-training using WikiData, followed by training on high-quality synthetic multilingual data generated by prompting LLMs with few-shot supervision. The final model, FsModQA, significantly outperforms existing few-shot and supervised baselines in MLODQA and cross-lingual and monolingual retrieval.
arXiv Detail & Related papers (2025-02-27T03:24:57Z) - LLMic: Romanian Foundation Language Model [76.09455151754062]
We present LLMic, a foundation language model designed specifically for the Romanian language. We show that fine-tuning LLMic for language translation after the initial pretraining phase outperforms existing solutions in English-to-Romanian translation tasks.
arXiv Detail & Related papers (2025-01-13T22:14:45Z) - Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary for the source text in a different target language. Currently, instruction-tuned large language models (LLMs) excel at various English tasks. Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even with few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z) - Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z) - Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts. We find that Llama Instruct and Mistral models exhibit high degrees of language confusion. We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English (a generic prompt-assembly sketch follows this entry).
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
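As a companion to the last entry above, here is a generic sketch of the linguistically-diverse prompting idea: exemplars from high-resource languages are assembled into one few-shot translate-into-English prompt, which an LLM (passed in as a callable) then completes. The function names, prompt wording, and example sentences are hypothetical and are not taken from that paper.

```python
# Generic illustration of few-shot translate-into-English prompting with
# exemplars drawn from several high-resource languages (hypothetical names).
from typing import Callable, List, Tuple


def build_translation_prompt(
    exemplars: List[Tuple[str, str]],  # (source sentence, English translation) pairs
    source_text: str,                  # sentence in an arbitrary (low-resource) language
) -> str:
    """Assemble a few-shot prompt from mixed-language exemplars."""
    blocks = ["Translate the following sentences into English."]
    for src, en in exemplars:
        blocks.append(f"Source: {src}\nEnglish: {en}")
    blocks.append(f"Source: {source_text}\nEnglish:")
    return "\n\n".join(blocks)


def translate_to_english(
    llm_fn: Callable[[str], str],
    exemplars: List[Tuple[str, str]],
    source_text: str,
) -> str:
    """Run the LLM (a caller-supplied callable) on the assembled prompt."""
    return llm_fn(build_translation_prompt(exemplars, source_text)).strip()


# Illustrative exemplars from two high-resource languages:
demo_exemplars = [
    ("Le chat dort sur le canapé.", "The cat is sleeping on the sofa."),
    ("El mercado abre a las ocho.", "The market opens at eight."),
]
prompt = build_translation_prompt(demo_exemplars, "<low-resource-language sentence>")
```

Because the exemplar languages differ from the target source language, the sketch relies on the LLM's English-dominant abilities to generalize the translation instruction, which is the core idea that entry describes.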