CardiffNLP at CLEARS-2025: Prompting Large Language Models for Plain Language and Easy-to-Read Text Rewriting
- URL: http://arxiv.org/abs/2508.03240v1
- Date: Tue, 05 Aug 2025 09:16:19 GMT
- Title: CardiffNLP at CLEARS-2025: Prompting Large Language Models for Plain Language and Easy-to-Read Text Rewriting
- Authors: Mutaz Ayesh, Nicolás Gutiérrez-Rolón, Fernando Alva-Manchego
- Abstract summary: This paper details the CardiffNLP team's contribution to the CLEARS shared task on Spanish text adaptation. We detail our numerous prompt variations, examples, and experimental results.
- Score: 49.4237054647147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper details the CardiffNLP team's contribution to the CLEARS shared task on Spanish text adaptation, hosted by IberLEF 2025. The shared task contained two subtasks and the team submitted to both. Our team took an LLM-prompting approach with different prompt variations. While we initially experimented with LLaMA-3.2, we adopted Gemma-3 for our final submission, and landed third place in Subtask 1 and second place in Subtask 2. We detail our numerous prompt variations, examples, and experimental results.
Related papers
- MSA at SemEval-2025 Task 3: High Quality Weak Labeling and LLM Ensemble Verification for Multilingual Hallucination Detection [0.0]
This paper describes our submission for SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. The task involves detecting hallucinated spans in text generated by instruction-tuned Large Language Models (LLMs) across multiple languages. Our system ranked 1st in Arabic and Basque, 2nd in German, Swedish, and Finnish, and 3rd in Czech, Farsi, and French.
arXiv Detail & Related papers (2025-05-27T08:26:17Z) - SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z) - Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis [5.262834474543783]
We present the first shared task for detecting and analyzing code-switching in Guarani and Spanish, GUA-SPA at IberLEF 2023.
The challenge consisted of three tasks: identifying the language of a token, NER, and a novel task of classifying the way a Spanish span is used in the code-switched context.
arXiv Detail & Related papers (2023-09-12T12:18:18Z) - Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models [57.10972566048735]
We present the system descriptions for three methods.
We used two multilingual models, namely M2M-100 and mBART50, and one bilingual (one-to-one) -- Helsinki NLP Spanish-English translation model.
We experimented with 11 languages from America and report the setups we used as well as the results we achieved.
arXiv Detail & Related papers (2023-05-27T08:10:40Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification [3.503844033591702]
This paper describes our approach for SemEval-2023 Task 3: Detecting the category, the framing, and the persuasion techniques in online news in a multi-lingual setup.
arXiv Detail & Related papers (2023-03-16T15:54:23Z) - Findings of the WMT 2022 Shared Task on Translation Suggestion [63.457874930232926]
We report the result of the first edition of the WMT shared task on Translation Suggestion.
The task aims to provide alternatives for specific words or phrases, given entire documents generated by machine translation (MT).
It consists of two sub-tasks, namely, the naive translation suggestion and translation suggestion with hints.
arXiv Detail & Related papers (2022-11-30T03:48:36Z) - Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z) - Comparing Approaches to Dravidian Language Identification [4.284178873394113]
This paper describes the submissions by team HWR to the Dravidian Language Identification (DLI) shared task organized at VarDial 2021 workshop.
The DLI training set includes 16,674 YouTube comments written in Roman script containing code-mixed text with English and one of the three South Dravidian languages: Kannada, Malayalam, and Tamil.
Our results reinforce the idea that deep learning methods are not as competitive in language identification related tasks as they are in many other text classification tasks.
arXiv Detail & Related papers (2021-03-09T16:58:55Z)