Machine Translation Impact in E-commerce Multilingual Search
- URL: http://arxiv.org/abs/2302.00119v1
- Date: Tue, 31 Jan 2023 21:59:35 GMT
- Title: Machine Translation Impact in E-commerce Multilingual Search
- Authors: Bryan Zhang, Amita Misra
- Abstract summary: The performance of cross-lingual information retrieval correlates highly with the quality of Machine Translation.
There may be a threshold beyond which improving query translation quality yields little or no further benefit to retrieval performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work suggests that the performance of cross-lingual
information retrieval correlates highly with the quality of Machine
Translation. However, there may be a threshold beyond which improving query
translation quality yields little or no further benefit to retrieval
performance. This threshold may depend upon multiple factors, including the
source and target languages, the quality of the existing MT system, and the
search pipeline. In order to identify the benefit of improving an MT system for
a given search pipeline, we investigate the sensitivity of retrieval quality to
varying levels of MT quality using experimental datasets collected from actual
traffic. We systematically improve the quality of our MT systems across
language pairs, as measured by MT evaluation metrics including BLEU and chrF,
to determine their impact on search precision metrics and to extract signals
that guide the improvement strategies. Using this information, we develop
techniques to compare query translations for multiple language pairs and to
identify the most promising language pairs to invest in and improve.
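As a rough illustration of the analysis the abstract describes, the sketch below scores several MT system variants with BLEU and chrF via the sacrebleu library and correlates those scores with a search precision metric. This is a minimal sketch, assuming sacrebleu and scipy are installed; every system output and precision figure is an invented placeholder, not a result from the paper.

```python
# Sketch: correlate MT quality metrics with search precision across MT variants.
import sacrebleu
from scipy.stats import pearsonr

# Hypothetical query translations from three MT system variants.
systems = {
    "baseline": ["red running shoes", "wireless head phones"],
    "improved": ["red running shoes", "wireless headphones"],
    "best":     ["red women's running shoes", "wireless headphones"],
}
# One reference stream, aligned with the hypotheses above.
references = [["red women's running shoes", "wireless headphones"]]

# Hypothetical search precision observed for each variant on live traffic.
precision_at_10 = {"baseline": 0.61, "improved": 0.67, "best": 0.68}

bleu, chrf, prec = [], [], []
for name, hyps in systems.items():
    bleu.append(sacrebleu.corpus_bleu(hyps, references).score)
    chrf.append(sacrebleu.corpus_chrf(hyps, references).score)
    prec.append(precision_at_10[name])

# Three points are purely illustrative; the paper works at traffic scale.
print("BLEU vs precision:", pearsonr(bleu, prec))
print("chrF vs precision:", pearsonr(chrf, prec))
```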
Related papers
- Evaluating Automatic Metrics with Incremental Machine Translation Systems [55.78547133890403]
We introduce a dataset comprising commercial machine translations, gathered weekly over six years across 12 translation directions.
We assume commercial systems improve over time, which enables us to evaluate machine translation (MT) metrics based on their preference for more recent translations.
arXiv Detail & Related papers (2024-07-03T17:04:17Z)
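A toy sketch of this evaluation idea: if systems improve over time, a good metric should prefer the more recent of two translations. chrF via sacrebleu stands in for an arbitrary metric, and the sentence triples are invented.

```python
# Count how often a metric prefers the newer of two system translations.
import sacrebleu

# (older translation, newer translation, reference) -- invented examples.
triples = [
    ("he go to store", "he goes to the store", "he goes to the store"),
    ("a red shoes women", "red women's shoes", "red women's shoes"),
]

prefers_newer = sum(
    sacrebleu.sentence_chrf(new, [ref]).score
    > sacrebleu.sentence_chrf(old, [ref]).score
    for old, new, ref in triples
)
print(f"metric prefers the newer translation in {prefers_newer}/{len(triples)} cases")
```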
- Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation [0.846600473226587]
This paper presents a novel methodology for in-context learning (ICL) that relies on a search algorithm guided by domain-specific quality estimation (QE).
Our results demonstrate significant improvements over existing ICL methods and higher translation performance compared to fine-tuning a pre-trained language model (PLM).
arXiv Detail & Related papers (2024-06-12T07:49:36Z)
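A schematic sketch of what QE-guided example selection for ICL might look like: greedily keep the demonstration whose inclusion yields the highest quality estimate for the resulting translation. Both `translate_with_examples` and `qe_score` are toy stand-ins for a real LLM call and a real QE model, not the paper's actual method.

```python
# Greedy, QE-guided selection of in-context demonstrations (toy stand-ins).
def translate_with_examples(source: str, examples: list) -> str:
    # Stand-in for prompting an LLM with (source, target) demonstrations.
    demos = "; ".join(f"{s} -> {t}" for s, t in examples)
    return f"[{demos}] output"

def qe_score(source: str, translation: str) -> float:
    # Stand-in QE model: reward demonstrations sharing words with the input.
    return float(sum(w in translation for w in source.split()))

def greedy_icl(source: str, pool: list, k: int = 2) -> list:
    chosen = []
    for _ in range(k):
        best = max(pool, key=lambda ex: qe_score(
            source, translate_with_examples(source, chosen + [ex])))
        chosen.append(best)
        pool = [ex for ex in pool if ex is not best]
    return chosen

print(greedy_icl("rote schuhe", [("rote jacke", "red jacket"),
                                 ("blaue schuhe", "blue shoes"),
                                 ("grüner hut", "green hat")]))
```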
- Machine Translation Meta Evaluation through Translation Accuracy Challenge Sets [92.38654521870444]
We introduce ACES, a contrastive challenge set spanning 146 language pairs.
This dataset aims to discover whether metrics can identify 68 translation accuracy errors.
We conduct a large-scale study, benchmarking 50 metrics submitted to the WMT 2022 and 2023 metrics shared tasks on ACES.
arXiv Detail & Related papers (2024-01-29T17:17:42Z)
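A minimal sketch of how a contrastive challenge set like ACES is scored: a metric passes an example if it rates the good translation above the one containing an accuracy error. The example is invented, and chrF via sacrebleu stands in for the many metrics the study benchmarks.

```python
# Score one contrastive example: does the metric prefer the correct translation?
import sacrebleu

reference = "the patient should take two tablets daily"
good      = "the patient should take two tablets each day"
incorrect = "the patient should take ten tablets each day"  # accuracy error

good_score = sacrebleu.sentence_chrf(good, [reference]).score
bad_score  = sacrebleu.sentence_chrf(incorrect, [reference]).score
print("metric passes this example:", good_score > bad_score)
```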
- Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation [64.5862977630713]
This study investigates how Large Language Models (LLMs) leverage source and reference data in the machine translation evaluation task.
We find that reference information significantly enhances evaluation accuracy, while, surprisingly, source information is sometimes counterproductive.
arXiv Detail & Related papers (2024-01-12T13:23:21Z)
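A sketch of the contrast that finding implies: prompt an LLM to rate a translation given the reference, the source, or both, and compare which variant tracks human judgments. The template is illustrative, and the LLM call itself is omitted.

```python
# Build evaluation prompts with or without source/reference context.
from typing import Optional

def build_prompt(hypothesis: str, source: Optional[str], reference: Optional[str]) -> str:
    parts = ["Rate the quality of this translation from 1 to 5."]
    if source is not None:
        parts.append(f"Source: {source}")
    if reference is not None:
        parts.append(f"Reference: {reference}")
    parts.append(f"Translation: {hypothesis}")
    return "\n".join(parts)

# Per the finding above, the reference-only variant may be the safer default.
print(build_prompt("he goes to the store", None, "he goes to the store"))
```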
- Quality Estimation of Machine Translated Texts based on Direct Evidence from Training Data [0.0]
We show that the parallel corpus used as training data for training the MT system holds direct clues for estimating the quality of translations produced by the MT system.
Our experiments show that this simple and direct method holds promise for quality estimation of translations produced by any purely data driven machine translation system.
arXiv Detail & Related papers (2023-06-27T11:52:28Z)
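A rough sketch of that intuition: if the training corpus contains source sentences similar to the input, their target sides provide direct evidence about what the output should look like. The Jaccard-overlap heuristic and two-sentence corpus below are illustrative stand-ins, not the paper's method.

```python
# Estimate translation quality from the nearest training example's target side.
def overlap(a: str, b: str) -> float:
    # Word-level Jaccard similarity, standing in for a real similarity measure.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

# Hypothetical parallel training data: (source, target).
train = [
    ("rote laufschuhe", "red running shoes"),
    ("kabellose kopfhoerer", "wireless headphones"),
]

def estimate_quality(src: str, mt_output: str) -> float:
    # Find the most similar training source, then compare the MT output
    # against that example's target side.
    best_src, best_tgt = max(train, key=lambda pair: overlap(src, pair[0]))
    return overlap(mt_output, best_tgt)

print(estimate_quality("rote laufschuhe damen", "red running shoes for women"))
```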
- Translation-Enhanced Multilingual Text-to-Image Generation [61.41730893884428]
Research on text-to-image generation (TTI) still predominantly focuses on the English language.
In this work, we thus investigate multilingual TTI and the current potential of neural machine translation (NMT) to bootstrap mTTI systems.
We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework.
arXiv Detail & Related papers (2023-05-30T17:03:52Z)
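A very loose sketch of the consolidation step this suggests: given encodings of the same prompt in several languages (for example, the original plus NMT translations), combine them with ensemble weights into one conditioning vector. The fixed softmax weights and random vectors are placeholders; the real EnsAd learns its parameters inside the mTTI framework.

```python
# Weighted consolidation of multilingual text encodings (placeholder weights).
import numpy as np

rng = np.random.default_rng(0)
dim = 8
encodings = {  # hypothetical sentence encodings per language
    "de": rng.normal(size=dim),
    "fr": rng.normal(size=dim),
    "en-translated": rng.normal(size=dim),
}

logits = np.array([0.2, 0.1, 0.7])               # would be learned in practice
weights = np.exp(logits) / np.exp(logits).sum()  # softmax over the ensemble

consolidated = sum(w * e for w, e in zip(weights, encodings.values()))
print(consolidated.shape)  # a single vector to condition the image generator
```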
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on multilingual models rather than the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
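A schematic sketch of the translate-test setup: translate non-English inputs into English, then classify with an English-only model. The toy lookup table and keyword classifier below stand in for a real MT system and a trained classifier.

```python
# Translate-test pipeline with toy stand-ins for MT and classification.
def translate(text: str, src_lang: str, tgt_lang: str = "en") -> str:
    # Toy lookup standing in for a real MT system.
    toy = {"zapatos rojos": "red shoes", "auriculares rotos": "broken headphones"}
    return toy.get(text, text)

def classify(english_text: str) -> str:
    # Toy keyword rule standing in for a trained English classifier.
    return "negative" if "broken" in english_text else "positive"

def translate_test(text: str, src_lang: str) -> str:
    # The paper's point: with a stronger MT system and less train/inference
    # mismatch, this simple pipeline is more competitive than assumed.
    return classify(translate(text, src_lang))

print(translate_test("auriculares rotos", "es"))  # -> negative
```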
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
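A small sketch of the segment-level question: do per-sentence metric scores separate translations that succeed on a downstream task from those that fail? The scores and outcomes are invented, and chrF via sacrebleu stands in for any segment-level metric.

```python
# Correlate segment-level chrF with a binary downstream outcome.
import sacrebleu
from scipy.stats import pointbiserialr

hyps = ["he goes to the store", "he go store",
        "wireless headphones", "head phones wire"]
refs = ["he goes to the store", "he goes to the store",
        "wireless headphones", "wireless headphones"]
task_success = [1, 0, 1, 0]  # hypothetical downstream outcome per segment

scores = [sacrebleu.sentence_chrf(h, [r]).score for h, r in zip(hyps, refs)]

# Point-biserial correlation: binary outcome vs. continuous metric score.
print(pointbiserialr(task_success, scores))
```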
- Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation [8.554761233491236]
We analyse the impact that data translated with rule-based, phrase-based statistical and neural MT systems has on new MT systems.
We exploit different data selection strategies in order to reduce the amount of data used, while at the same time maintaining high-quality MT systems.
arXiv Detail & Related papers (2020-05-01T10:50:53Z)
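A toy sketch of the data-selection idea: pool synthetic pairs produced by rule-based, phrase-based, and neural systems, score each pair, and keep only the best portion. The length-ratio heuristic below is a naive placeholder for the selection strategies the paper studies.

```python
# Select the best-scoring synthetic pairs from multiple backtranslation sources.
synthetic = [
    # (back-translated source, original target, producing system)
    ("the red shoes", "die roten schuhe", "rule-based"),
    ("shoes red the of", "die roten schuhe", "phrase-based"),
    ("the red shoes", "die roten schuhe", "neural"),
]

def pair_score(src: str, tgt: str) -> float:
    # Naive stand-in: penalize extreme length ratios between the two sides.
    ratio = len(src.split()) / max(len(tgt.split()), 1)
    return -abs(1.0 - ratio)

keep_fraction = 0.5
ranked = sorted(synthetic, key=lambda p: pair_score(p[0], p[1]), reverse=True)
selected = ranked[: max(1, int(len(ranked) * keep_fraction))]
print(selected)
```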
- Can Your Context-Aware MT System Pass the DiP Benchmark Tests?: Evaluation Benchmarks for Discourse Phenomena in Machine Translation [7.993547048820065]
We introduce first-of-their-kind MT benchmark datasets that aim to track and hail improvements across four main discourse phenomena.
Surprisingly, we find that existing context-aware models do not improve discourse-related translations consistently across languages and phenomena.
arXiv Detail & Related papers (2020-04-30T07:15:36Z)
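A toy probe in the spirit of one such discourse phenomenon, lexical consistency: check whether a system renders the same source term identically across a document. The term pair and two-sentence document are invented; the actual benchmarks are far more systematic.

```python
# Check whether one source term is translated consistently across a document.
source_term = "Anwalt"
renderings = {"lawyer", "attorney"}  # known translations of the source term

translated_doc = [
    "the lawyer arrived late",
    "the attorney reviewed the case",  # inconsistent rendering of "Anwalt"
]

found = {r for sentence in translated_doc for r in renderings if r in sentence}
print("lexically consistent:", len(found) <= 1)
```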