HausaNLP at SemEval-2025 Task 2: Entity-Aware Fine-tuning vs. Prompt Engineering in Entity-Aware Machine Translation
- URL: http://arxiv.org/abs/2503.19702v1
- Date: Tue, 25 Mar 2025 14:29:43 GMT
- Title: HausaNLP at SemEval-2025 Task 2: Entity-Aware Fine-tuning vs. Prompt Engineering in Entity-Aware Machine Translation
- Authors: Abdulhamid Abubakar, Hamidatu Abdulkadir, Ibrahim Rabiu Abdullahi, Abubakar Auwal Khalid, Ahmad Mustapha Wali, Amina Aminu Umar, Maryam Bala, Sani Abdullahi Sani, Ibrahim Said Ahmad, Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Vukosi Marivate
- Abstract summary: This paper presents our findings for SemEval 2025 Task 2, a shared task on entity-aware machine translation (EA-MT). The goal of this task is to develop translation models that can accurately translate English sentences into target languages. In this paper, we describe the different systems we employed, detail our results, and discuss insights gained from our experiments.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents our findings for SemEval 2025 Task 2, a shared task on entity-aware machine translation (EA-MT). The goal of this task is to develop translation models that can accurately translate English sentences into target languages, with a particular focus on handling named entities, which often pose challenges for MT systems. The task covers 10 target languages with English as the source. In this paper, we describe the different systems we employed, detail our results, and discuss insights gained from our experiments.
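The abstract contrasts fine-tuning with prompt engineering for entity-aware MT but does not show either system. A minimal sketch of one common prompt-engineering approach (not the authors' actual method) is to inject known entity translations into the prompt so the model copies them rather than guessing; the prompt wording and entity table below are hypothetical illustrations.

```python
def build_entity_aware_prompt(sentence: str, target_lang: str,
                              entity_map: dict[str, str]) -> str:
    """Compose an LLM translation prompt that pins named-entity translations.

    entity_map maps each source-language entity mention to its required
    target-language rendering.
    """
    hints = "\n".join(f'- "{src}" must be translated as "{tgt}"'
                      for src, tgt in entity_map.items())
    return (f"Translate the following English sentence into {target_lang}.\n"
            f"Use these fixed entity translations:\n{hints}\n\n"
            f"Sentence: {sentence}\nTranslation:")

# Hypothetical example: pin two entity renderings for an English-to-Hausa request.
prompt = build_entity_aware_prompt(
    "The Louvre is in Paris.",
    "Hausa",
    {"The Louvre": "Gidan Tarihi na Louvre", "Paris": "Paris"},
)
print(prompt)
```

The resulting string would be sent to any instruction-tuned translation model; the fine-tuning alternative instead bakes such entity behavior into the model weights via entity-annotated training pairs.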
Related papers
- GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human [71.42669028683741]
We present a shared task on binary machine-generated text detection conducted as part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
- AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness [16.896143197472114]
This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian languages.
We propose using machine translation for data augmentation to address the low-resource challenge of limited training data.
We achieve competitive results in the shared task: our system performs best among all ranked teams in both subtask A (supervised learning) and subtask C (cross-lingual transfer).
arXiv Detail & Related papers (2024-04-01T21:21:15Z)
- UvA-MT's Participation in the WMT23 General Translation Shared Task [7.4336950563281174]
This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation.
We show that by using one model to handle bidirectional tasks, it is possible to achieve results comparable to those of traditional bilingual translation in both directions.
arXiv Detail & Related papers (2023-10-15T20:49:31Z)
- Findings of the WMT 2022 Shared Task on Translation Suggestion [63.457874930232926]
We report the result of the first edition of the WMT shared task on Translation Suggestion.
The task aims to provide alternatives for specific words or phrases given entire documents generated by machine translation (MT).
It consists of two sub-tasks: naive translation suggestion and translation suggestion with hints.
arXiv Detail & Related papers (2022-11-30T03:48:36Z)
- BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Based on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z)
- The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation [53.907805815477126]
This paper presents the first relatively large-scale Amharic-English parallel sentence dataset.
We build bi-directional Amharic-English translation models by fine-tuning the existing Facebook M2M100 pre-trained model.
The results show that the normalization of Amharic homophone characters increases the performance of Amharic-English machine translation in both directions.
arXiv Detail & Related papers (2022-10-27T07:18:53Z)
- Building Machine Translation Systems for the Next Thousand Languages [102.24310122155073]
We describe results in three research domains: building clean, web-mined datasets for 1500+ languages, developing practical MT models for under-served languages, and studying the limitations of evaluation metrics for these languages.
We hope that our work provides useful insights to practitioners working towards building MT systems for currently understudied languages, and highlights research directions that can complement the weaknesses of massively multilingual models in data-sparse settings.
arXiv Detail & Related papers (2022-05-09T00:24:13Z)
- FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task [36.51221186190272]
We describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign.
Our system is built by leveraging transfer learning across modalities, tasks and languages.
arXiv Detail & Related papers (2021-07-14T19:43:44Z)
- SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task [111.91077204077817]
We participated in four translation directions of three language pairs: English-Chinese, English-Polish, and German-Upper Sorbian.
Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques.
In our submissions, the primary systems won first place in the English-to-Chinese, Polish-to-English, and German-to-Upper-Sorbian translation directions.
arXiv Detail & Related papers (2020-10-11T00:40:05Z)
- UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social Media Texts using Transformers and Multi-Task Learning [1.7196613099537055]
We describe the systems developed by our team for SemEval-2020 Task 9.
We aim to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
Our approach achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850.
For the Spanish-English task, we obtained an average F1-score of 0.7064, ranking our team 17th out of 29 participants.
arXiv Detail & Related papers (2020-09-06T17:19:18Z)
- DAVE: Deriving Automatically Verilog from English [18.018512051180714]
Engineers undertake significant efforts to translate specifications for digital systems into programming languages.
We explore the use of state-of-the-art machine learning (ML) to automatically derive Verilog snippets from English.
arXiv Detail & Related papers (2020-08-27T15:25:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.