Findings of the WMT 2024 Shared Task on Chat Translation
- URL: http://arxiv.org/abs/2410.11624v1
- Date: Tue, 15 Oct 2024 14:13:17 GMT
- Title: Findings of the WMT 2024 Shared Task on Chat Translation
- Authors: Wafaa Mohammed, Sweta Agrawal, M. Amin Farajian, Vera Cabarrão, Bryan Eikema, Ana C. Farinha, José G. C. de Souza,
- Abstract summary: This paper presents the findings from the third edition of the Chat Translation Shared Task.
The task involved translating bilingual customer support conversations, specifically focusing on the impact of conversation context in translation quality and evaluation.
We received 22 primary submissions and 32 contrastive submissions from eight teams, with each language pair having participation from at least three teams.
- Score: 4.800626318046925
- Abstract: This paper presents the findings from the third edition of the Chat Translation Shared Task. As with previous editions, the task involved translating bilingual customer support conversations, specifically focusing on the impact of conversation context in translation quality and evaluation. We also include two new language pairs: English-Korean and English-Dutch, in addition to the set of language pairs from previous editions: English-German, English-French, and English-Brazilian Portuguese. We received 22 primary submissions and 32 contrastive submissions from eight teams, with each language pair having participation from at least three teams. We evaluated the systems comprehensively using both automatic metrics and human judgments via a direct assessment framework. The official rankings for each language pair were determined based on human evaluation scores, considering performance in both translation directions--agent and customer. Our analysis shows that while the systems excelled at translating individual turns, there is room for improvement in overall conversation-level translation quality.
Related papers
- GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human [71.42669028683741]
We present a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025.
The task consists of two subtasks: Monolingual (English) and Multilingual.
We provide a comprehensive overview of the data, a summary of the results, detailed descriptions of the participating systems, and an in-depth analysis of submissions.
arXiv Detail & Related papers (2025-01-19T11:11:55Z)
- Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation [75.03292732779059]
We focus on three language directions: Chinese-English, Chinese-German, and Chinese-Russian.
This year, we received a total of 10 submissions from 5 academic and industry teams.
The official ranking of the systems is based on the overall human judgments.
arXiv Detail & Related papers (2024-12-16T12:54:52Z)
- Findings of the IWSLT 2024 Evaluation Campaign [102.7608597658451]
The paper reports on the shared tasks organized by the 21st IWSLT Conference.
The shared tasks address 7 scientific challenges in spoken language translation.
arXiv Detail & Related papers (2024-11-07T19:11:55Z)
- Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation [50.60733773088296]
We conduct a comprehensive human evaluation of the results of several shared tasks from the last International Workshop on Spoken Language Translation (IWSLT 2023).
We propose an effective evaluation strategy based on automatic resegmentation and direct assessment with segment context.
Our analysis revealed that: 1) the proposed evaluation strategy is robust and its scores are well correlated with other types of human judgements; 2) automatic metrics are usually, but not always, well correlated with direct assessment scores; and 3) COMET is a slightly stronger automatic metric than chrF.
arXiv Detail & Related papers (2024-06-06T09:18:42Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models [107.83158521848372]
We present Triangular Document-level Pre-training (TRIP), which is the first in the field to accelerate the conventional monolingual and bilingual objectives into a trilingual objective with a novel method called Grafting.
TRIP achieves several strong state-of-the-art (SOTA) scores on three multilingual document-level machine translation benchmarks and one cross-lingual abstractive summarization benchmark, including consistent improvements by up to 3.11 d-BLEU points and 8.9 ROUGE-L points.
arXiv Detail & Related papers (2022-12-15T12:14:25Z)
- Consistent Human Evaluation of Machine Translation across Language Pairs [21.81895199744468]
We propose a new metric called XSTS that is more focused on semantic equivalence and a cross-lingual calibration method.
We demonstrate the effectiveness of these novel contributions in large scale evaluation studies across up to 14 language pairs.
arXiv Detail & Related papers (2022-05-17T17:57:06Z)
- Ensemble Fine-tuned mBERT for Translation Quality Estimation [0.0]
In this paper, we discuss our submission to the WMT 2021 QE Shared Task.
Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models.
It demonstrates comparable performance in terms of Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs.
arXiv Detail & Related papers (2021-09-08T20:13:06Z)
- FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task [36.51221186190272]
We describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign.
Our system is built by leveraging transfer learning across modalities, tasks and languages.
arXiv Detail & Related papers (2021-07-14T19:43:44Z)
- Lost in Interpreting: Speech Translation from Source or Interpreter? [0.0]
We release 10 hours of recordings and transcripts of European Parliament speeches in English, with simultaneous interpreting into Czech and German.
We evaluate quality and latency of speaker-based and interpreter-based spoken translation systems from English to Czech.
arXiv Detail & Related papers (2021-06-17T09:32:49Z)