Related papers: Modern Models, Medieval Texts: A POS Tagging Study of Old Occitan

URL: http://arxiv.org/abs/2503.07827v1
Date: Mon, 10 Mar 2025 20:16:01 GMT
Title: Modern Models, Medieval Texts: A POS Tagging Study of Old Occitan
Authors: Matthias Schöffel, Marinus Wiedner, Esteban Garces Arias, Paula Ruppert, Christian Heumann, Matthias Aßenmacher
Abstract summary: Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing. This study examines the performance of open-source LLMs in part-of-speech (POS) tagging for Old Occitan.
Score: 0.1979158763744267
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing, yet their effectiveness in handling historical languages remains largely unexplored. This study examines the performance of open-source LLMs in part-of-speech (POS) tagging for Old Occitan, a historical language characterized by non-standardized orthography and significant diachronic variation. Through comparative analysis of two distinct corpora (hagiographical and medical texts), we evaluate how current models handle the inherent challenges of processing a low-resource historical language. Our findings demonstrate critical limitations in LLM performance when confronted with extreme orthographic and syntactic variability. We provide detailed error analysis and specific recommendations for improving model performance in historical language processing. This research advances our understanding of LLM capabilities in challenging linguistic contexts while offering practical insights for both computational linguistics and historical language studies.

Related papers

Unveiling Factors for Enhanced POS Tagging: A Study of Low-Resource Medieval Romance Languages [0.18846515534317265]
Part-of-speech (POS) tagging remains a foundational component in natural language processing pipelines. This study systematically investigates the central determinants of POS tagging performance across diverse corpora of Medieval Occitan, Medieval Spanish, and Medieval French texts.
arXiv Detail & Related papers (2025-06-21T13:33:07Z)
Language Surgery in Multilingual Large Language Models [32.77326546076424]
Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages. This paper investigates the naturally emerging representation alignment in LLMs, particularly in the middle layers. We propose Inference-Time Language Control (ITLC) to enable precise cross-lingual language control and mitigate language confusion.
arXiv Detail & Related papers (2025-06-14T11:09:50Z)
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models [55.14276067678253]
This paper introduces a novel methodology for efficiently identifying inherent cross-lingual weaknesses in Large Language Models (LLMs). We construct a new dataset of over 6,000 bilingual pairs across 16 languages using this methodology, demonstrating its effectiveness in revealing weaknesses even in state-of-the-art models. Further experiments investigate the relationship between linguistic similarity and cross-lingual weaknesses, revealing that linguistically related languages share similar performance patterns.
arXiv Detail & Related papers (2025-05-24T12:31:27Z)
Comparing LLM Text Annotation Skills: A Study on Human Rights Violations in Social Media Data [2.812898346527047]
This study investigates the capabilities of large language models (LLMs) for zero-shot and few-shot annotation of social media posts in Russian and Ukrainian. To evaluate the effectiveness of these models, their annotations are compared against a gold standard set of human double-annotated labels. The study explores the unique patterns of errors and disagreements exhibited by each model, offering insights into their strengths, limitations, and cross-linguistic adaptability.
arXiv Detail & Related papers (2025-05-15T13:10:47Z)
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation [1.2576388595811496]
We introduce a framework for producing linguistic reasoning problems that reduces the effect of memorisation in model performance estimates. We apply this framework to develop LINGOLY-TOO, a challenging benchmark for linguistic reasoning.
arXiv Detail & Related papers (2025-03-04T19:57:47Z)
Exploring Robustness of LLMs to Sociodemographically-Conditioned Paraphrasing [7.312170216336085]
We take a broader approach to explore a wider range of variations across sociodemographic dimensions. We extend the SocialIQA dataset to create diverse paraphrased sets conditioned on sociodemographic styles. We find that demographic-specific paraphrasing significantly impacts the performance of language models.
arXiv Detail & Related papers (2025-01-14T17:50:06Z)
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition [6.617999710257379]
We propose a three-stage framework to assess the abilities of LMs. We evaluate the generative capacities of LMs using methods from linguistic research.
arXiv Detail & Related papers (2024-10-17T06:31:49Z)
Predicting Emotion Intensity in Polish Political Texts: Comparing Supervised Models and Large Language Models in a Resource-Poor Language [0.0]
This study explores the use of large language models (LLMs) to predict emotion intensity in Polish political texts. The research compares the performance of several LLMs against a supervised model trained on an annotated corpus of 10,000 social media texts.
arXiv Detail & Related papers (2024-07-16T19:53:14Z)
Lessons from the Trenches on Reproducible Evaluation of Language Models [60.522749986793094]
We draw on three years of experience in evaluating large language models to provide guidance and lessons for researchers. We present the Language Model Evaluation Harness (lm-eval), an open source library for independent, reproducible, and extensible evaluation of language models.
arXiv Detail & Related papers (2024-05-23T16:50:49Z)
Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora and report superlative performance on evaluation datasets. This survey delves into an important attribute of these datasets: the dialect of a language. Motivated by the performance degradation of NLP models for dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
Exploring the Potential of Large Language Models in Computational Argumentation [54.85665903448207]
Large language models (LLMs) have demonstrated impressive capabilities in understanding context and generating natural language. This work aims to embark on an assessment of LLMs, such as ChatGPT, Flan models, and LLaMA2 models, in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-11-15T15:12:15Z)
Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs). We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods. In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
Shortcut Learning of Large Language Models in Natural Language Understanding [119.45683008451698]
Large language models (LLMs) have achieved state-of-the-art performance on a series of natural language understanding tasks. They might rely on dataset bias and artifacts as shortcuts for prediction. This has significantly affected their generalizability and adversarial robustness.
arXiv Detail & Related papers (2022-08-25T03:51:39Z)
Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding [1.827510863075184]
Curriculum is a new format of NLI benchmark for evaluation of broad-coverage linguistic phenomena. We show that this linguistic-phenomena-driven benchmark can serve as an effective tool for diagnosing model behavior and verifying model learning quality.
arXiv Detail & Related papers (2022-04-13T10:32:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.