Overview of ADoBo at IberLEF 2025: Automatic Detection of Anglicisms in Spanish
- URL: http://arxiv.org/abs/2507.21813v1
- Date: Tue, 29 Jul 2025 13:45:08 GMT
- Title: Overview of ADoBo at IberLEF 2025: Automatic Detection of Anglicisms in Spanish
- Authors: Elena Alvarez-Mellado, Jordi Porta-Zamorano, Constantine Lignos, Julio Gonzalo
- Abstract summary: This paper summarizes the main findings of ADoBo 2025, the shared task on anglicism identification in Spanish proposed in the context of IberLEF 2025. Five teams submitted their solutions for the test phase. Proposed systems included LLMs, deep learning models, Transformer-based models and rule-based systems.
- Score: 6.645406082457186
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper summarizes the main findings of ADoBo 2025, the shared task on anglicism identification in Spanish proposed in the context of IberLEF 2025. Participants of ADoBo 2025 were asked to detect English lexical borrowings (or anglicisms) from a collection of Spanish journalistic texts. Five teams submitted their solutions for the test phase. Proposed systems included LLMs, deep learning models, Transformer-based models and rule-based systems. The results range from F1 scores of 0.17 to 0.99, which showcases the variability in performance different systems can have for this task.
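For readers unfamiliar with the metric behind the reported 0.17 to 0.99 range, the following is a minimal sketch of how an F1 score can be computed for a span detection task. It assumes exact-match scoring over spans, which the abstract does not specify; the function name `span_f1`, the `(sentence_id, start_token, end_token)` span encoding, and the example data are all hypothetical and are not taken from the ADoBo evaluation setup.

```python
def span_f1(gold, predicted):
    """Return (precision, recall, F1) over exact-match spans.

    Assumption: a predicted span counts as correct only if it is
    identical to a gold span. The shared task may score differently.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: each span is (sentence_id, start_token, end_token).
gold = [(0, 3, 4), (0, 7, 9), (1, 2, 3)]
pred = [(0, 3, 4), (1, 2, 3), (1, 5, 6), (2, 0, 1)]

precision, recall, f1 = span_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case two of the four predicted spans match two of the three gold spans, so precision is 0.50, recall is about 0.67, and F1 is about 0.57. A system that over-predicts or misses most borrowings would land near the low end of the reported range.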
Related papers
- QU-NLP at CheckThat! 2025: Multilingual Subjectivity in News Articles Detection using Feature-Augmented Transformer Models with Sequential Cross-Lingual Fine-Tuning [0.21756081703275998]
This paper presents our approach to the CheckThat! 2025 Task 1 on subjectivity detection. We propose a feature-augmented transformer architecture that combines contextual embeddings from pre-trained language models with statistical and linguistic features. We evaluated our system in monolingual, multilingual, and zero-shot settings across multiple languages including English, Arabic, German, Italian, and several unseen languages.
arXiv Detail & Related papers (2025-07-01T13:39:59Z)
- Predicting potentially abusive clauses in Chilean terms of services with natural language processing [0.0]
This study addresses the growing concern of information asymmetry in consumer contracts, exacerbated by the proliferation of online services with complex Terms of Service that are rarely even read. We introduce a new methodology and a substantial dataset addressing this gap. We propose a novel annotation scheme with four categories and a total of 20 classes, and apply it on 50 online Terms of Service used in Chile.
arXiv Detail & Related papers (2025-02-02T18:01:39Z)
- 1-800-SHARED-TASKS @ NLU of Devanagari Script Languages: Detection of Language, Hate Speech, and Targets using LLMs [0.0]
This paper presents a detailed system description of our entry for the CHiPSAL 2025 shared task.
We focus on language detection, hate speech identification, and target detection in Devanagari script languages.
arXiv Detail & Related papers (2024-11-11T10:34:36Z)
- HYBRINFOX at CheckThat! 2024 -- Task 2: Enriching BERT Models with the Expert System VAGO for Subjectivity Detection [0.8083061106940517]
The HYBRINFOX method ranked 1st with a macro F1 score of 0.7442 on the evaluation data.
We explain the principles of our hybrid approach, and outline ways in which the method could be improved for other languages besides English.
arXiv Detail & Related papers (2024-07-04T09:29:19Z)
- The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze representation space, generated response and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer based model for word-level language identification in code-mixed Kannada English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
- Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation System for the WMT22 Translation Task [49.916963624249355]
This paper describes Tencent AI Lab - Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task.
We participate in the general translation task on English⇔Livonian.
Our system is based on M2M100 with novel techniques that adapt it to the target language pair.
arXiv Detail & Related papers (2022-10-17T04:34:09Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is a report of the organizers on the first competition of argumentation analysis systems dealing with Russian language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won the first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
- Overview of ADoBo 2021: Automatic Detection of Unassimilated Borrowings in the Spanish Press [8.950918531231158]
This paper summarizes the main findings of the ADoBo 2021 shared task, proposed in the context of IberLef 2021.
In this task, we invited participants to detect lexical borrowings (coming mostly from English) in Spanish newswire texts.
We provided participants with an annotated corpus of lexical borrowings which we split into training, development and test splits.
arXiv Detail & Related papers (2021-10-29T11:07:59Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)