IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing
Interactive Machine Translation Systems
- URL: http://arxiv.org/abs/2310.11163v1
- Date: Tue, 17 Oct 2023 11:29:04 GMT
- Title: IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing
Interactive Machine Translation Systems
- Authors: Xu Huang, Zhirui Zhang, Ruize Gao, Yichao Du, Lemao Liu, Guoping
Huang, Shuming Shi, Jiajun Chen, Shujian Huang
- Abstract summary: We present IMTLab, an open-source end-to-end interactive machine translation (IMT) system platform.
IMTLab treats the whole interactive translation process as a task-oriented dialogue with a human-in-the-loop setting.
- Score: 94.39110258587887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present IMTLab, an open-source end-to-end interactive machine
translation (IMT) platform that enables researchers to quickly build IMT
systems with state-of-the-art models, perform end-to-end evaluation, and
diagnose the weaknesses of systems. IMTLab treats the whole interactive
translation process as a task-oriented dialogue with a human-in-the-loop
setting, in which human interventions can be explicitly incorporated to
produce high-quality, error-free translations. To this end, a general
communication interface is designed to support flexible IMT architectures and
user policies. Based on this design, we construct simulated and real
interactive environments for end-to-end evaluation and use the framework to
systematically evaluate previous IMT systems. Our simulated and manual
experiments show that the prefix-constrained decoding approach still achieves
the lowest editing cost in end-to-end evaluation, while BiTIIMT attains a
comparable editing cost with a better interactive experience.
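The interactive loop described in the abstract can be pictured as a dialogue between a translation model and a (possibly simulated) user who repeatedly fixes the longest correct prefix. The sketch below is illustrative only: `translate_with_prefix` is a hypothetical stand-in for any model supporting prefix-constrained decoding, and the user policy is a simplification, not IMTLab's actual API.

```python
# Minimal sketch of an IMT session as a human-in-the-loop dialogue.
# `translate_with_prefix` is a hypothetical placeholder for any model
# that supports prefix-constrained decoding; IMTLab's real interface
# differs.

def translate_with_prefix(source: str, prefix: str) -> str:
    """Return a full translation forced to start with `prefix`."""
    # Placeholder: a real system would run constrained beam search here.
    return prefix + " ..."

def imt_session(source: str, reference: str, max_turns: int = 10) -> int:
    """Simulated user extends the correct prefix until the output
    matches the reference; returns the number of interventions, a
    rough proxy for editing cost."""
    prefix, cost = "", 0
    for _ in range(max_turns):
        hypothesis = translate_with_prefix(source, prefix)
        if hypothesis == reference:
            return cost
        # Simulated user policy: keep the longest correct prefix and
        # append the next reference token as the correction.
        ref_tokens, hyp_tokens = reference.split(), hypothesis.split()
        agree = 0
        while agree < min(len(ref_tokens), len(hyp_tokens)) and \
                ref_tokens[agree] == hyp_tokens[agree]:
            agree += 1
        prefix = " ".join(ref_tokens[: agree + 1])
        cost += 1
    return cost
```

Counting interventions this way is what makes an end-to-end comparison of different IMT architectures under one user policy possible.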
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, adapts a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
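The ZeroMMT summary above mentions a mixture of two objectives without naming them; a minimal sketch of the general pattern follows, with placeholder loss names that are assumptions rather than the paper's terms.

```python
import torch

def mixed_objective(loss_text_only: torch.Tensor,
                    loss_multimodal: torch.Tensor,
                    weight: float = 0.5) -> torch.Tensor:
    """Weighted mixture of two training objectives. The abstract does
    not name ZeroMMT's actual objectives; these argument names are
    placeholders."""
    return weight * loss_text_only + (1.0 - weight) * loss_multimodal

# Example with dummy scalar losses:
print(mixed_objective(torch.tensor(1.2), torch.tensor(0.8)))  # tensor(1.)
```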
- Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models [16.94819621353007]
SynTOD is a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) systems.
It generates diverse, structured conversations through random walks and response simulation using large language models.
In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance.
arXiv Detail & Related papers (2024-04-23T06:23:34Z)
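The SynTOD summary above describes random walks over a state transition graph followed by LLM response simulation; a minimal sketch of that two-step recipe, with an invented toy graph and a placeholder `simulate_turn` standing in for the LLM call:

```python
import random

# Sketch of graph-guided dialogue synthesis: a random walk over a state
# transition graph yields a skeleton that an LLM then fills with turns.
# The graph contents and `simulate_turn` are illustrative, not SynTOD's
# actual schema.
TRANSITIONS = {
    "greet": ["ask_intent"],
    "ask_intent": ["search", "clarify"],
    "clarify": ["search"],
    "search": ["present_result"],
    "present_result": ["confirm", "end"],
    "confirm": ["end"],
}

def random_walk(start: str = "greet", end: str = "end") -> list[str]:
    path = [start]
    while path[-1] != end:
        path.append(random.choice(TRANSITIONS[path[-1]]))
    return path

def simulate_turn(state: str) -> str:
    # Placeholder for an LLM call that writes the utterance for `state`.
    return f"<LLM-generated turn for state '{state}'>"

dialogue = [simulate_turn(s) for s in random_walk()]
```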
- Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding [73.32763904267186]
Large Language Models (LLMs) have the potential to achieve superior translation quality.
We propose Cooperative Decoding (CoDec), which treats NMT systems as a pretranslation model and MT-oriented LLMs as a supplemental solution.
arXiv Detail & Related papers (2023-11-06T03:41:57Z)
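The CoDec summary above casts the NMT system as a pre-translation model and the LLM as a supplement; a hedged sketch of one such cooperation scheme, where the confidence gate and both model calls are hypothetical placeholders rather than CoDec's actual mechanism:

```python
# Sketch of cooperative decoding: the NMT system drafts a translation
# and an LLM is consulted only as a supplement. `nmt` and `llm` are
# duck-typed placeholders, not a real API.

def codec_translate(source: str, nmt, llm, threshold: float = 0.6) -> str:
    draft, confidence = nmt.translate(source)   # pre-translation step
    if confidence >= threshold:
        return draft                            # NMT output suffices
    # Otherwise ask the LLM to repair or complete the draft.
    prompt = (f"Source: {source}\n"
              f"Draft translation: {draft}\n"
              f"Improve the draft translation:")
    return llm.generate(prompt)
```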
- Using Textual Interface to Align External Knowledge for End-to-End Task-Oriented Dialogue Systems [53.38517204698343]
We propose a novel paradigm that uses a textual interface to align external knowledge and eliminate redundant processes.
We demonstrate our paradigm in practice through MultiWOZ-Remake, including an interactive textual interface built for the MultiWOZ database.
arXiv Detail & Related papers (2023-05-23T05:48:21Z)
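The textual-interface idea above amounts to rendering structured database results as plain text that the dialogue model can read directly; a minimal sketch with loosely MultiWOZ-style fields, not the actual MultiWOZ-Remake interface:

```python
# Sketch of a textual interface over a database: structured rows are
# rendered as plain text so the dialogue model reads and writes text
# only. Field names only loosely follow MultiWOZ conventions.

def render_results(rows: list[dict]) -> str:
    if not rows:
        return "No matching entries found."
    lines = [f"{r['name']} | area: {r['area']} | price: {r['pricerange']}"
             for r in rows]
    return f"Found {len(rows)} entries:\n" + "\n".join(lines)

print(render_results([
    {"name": "Curry Garden", "area": "centre", "pricerange": "expensive"},
]))
```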
- Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models [57.80514758695275]
Using large language models (LLMs) to assess the quality of machine translation (MT) achieves state-of-the-art performance at the system level.
We propose a new prompting method called Error Analysis Prompting (EAPrompt).
This technique emulates the commonly accepted human evaluation framework, Multidimensional Quality Metrics (MQM), and produces explainable and reliable MT evaluations at both the system and segment levels.
arXiv Detail & Related papers (2023-03-24T05:05:03Z)
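The EAPrompt summary above follows the MQM convention of counting major and minor errors; a rough sketch of such a prompt and score, whose wording and weights are illustrative assumptions, not the paper's exact template:

```python
# Sketch of an MQM-style error-analysis prompt in the spirit of
# EAPrompt; the exact wording and weights in the paper may differ.

EA_PROMPT = """You are evaluating a translation.
Source: {source}
Translation: {translation}

Step 1: List the major errors (meaning-changing) and minor errors
(fluency or style) in the translation.
Step 2: Report the counts as 'major: N, minor: M'."""

def mqm_style_score(n_major: int, n_minor: int) -> int:
    # A common MQM convention weighs major errors 5 and minor errors 1.
    return -(5 * n_major + 1 * n_minor)
```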
- Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems.
We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues.
Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z)
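The interactive evaluation above pairs a goal-conditioned user simulator with the dialogue system under test; a minimal sketch of that loop, with both agents as duck-typed placeholders for the pre-trained models:

```python
# Sketch of interactive TOD evaluation: a goal-conditioned user
# simulator talks to the dialogue system until its goal is satisfied
# or a turn budget runs out.

def run_dialogue(user_sim, tod_system, goal: dict, max_turns: int = 20):
    history = []
    user_msg = user_sim.first_utterance(goal)
    for _ in range(max_turns):
        history.append(("user", user_msg))
        sys_msg = tod_system.respond(history)
        history.append(("system", sys_msg))
        if user_sim.goal_satisfied(history, goal):
            return history, True   # counts toward inform/success rates
        user_msg = user_sim.respond(history, goal)
    return history, False
```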
- Refining the state-of-the-art in Machine Translation, optimizing NMT for the JA <-> EN language pair by leveraging personal domain expertise [0.0]
This paper documents the construction of an NMT (Neural Machine Translation) system for En/Ja based on the Transformer architecture, leveraging the OpenNMT framework.
The system is evaluated using standard automatic metrics such as BLEU, together with the author's subjective judgment as a Japanese linguist.
arXiv Detail & Related papers (2022-02-23T18:20:14Z)
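The entry above evaluates with BLEU; a minimal example of computing corpus BLEU with sacrebleu, which is one common tool for this and an assumption here, since the paper's exact tooling and Japanese tokenization are not stated:

```python
import sacrebleu  # pip install sacrebleu

# Identical hypothesis and reference score 100; references are a list
# of reference streams, one stream per reference set.
hypotheses = ["The cat sat on the mat."]
references = [["The cat sat on the mat."]]
print(sacrebleu.corpus_bleu(hypotheses, references).score)  # 100.0
```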
- Evaluating MT Systems: A Theoretical Framework [0.0]
This paper outlines a theoretical framework with which different automatic metrics can be designed for evaluating Machine Translation systems.
It introduces the concept of cognitive ease, which depends on adequacy and lack of fluency.
The framework can also be used to evaluate newer types of MT systems, such as speech-to-speech translation and discourse translation.
arXiv Detail & Related papers (2022-02-11T18:05:17Z)
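The abstract only names the ingredients of cognitive ease; read purely as an illustration, one consistent formulation is a function that rises with adequacy and falls with disfluency (the paper's actual definition is not given above):

```latex
% Illustrative reading only; the paper's actual formulation is not
% given in the abstract above.
\mathrm{CognitiveEase} = f\bigl(\mathrm{Adequacy},\,\mathrm{Disfluency}\bigr),
\qquad
\frac{\partial f}{\partial\,\mathrm{Adequacy}} > 0,
\qquad
\frac{\partial f}{\partial\,\mathrm{Disfluency}} < 0
```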
- Unsupervised Quality Estimation for Neural Machine Translation [63.38918378182266]
Existing approaches require large amounts of expert-annotated data, computation, and time for training.
We devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required.
We achieve very good correlation with human judgments of quality, rivalling state-of-the-art supervised QE models.
arXiv Detail & Related papers (2020-05-21T12:38:06Z)
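The unsupervised QE entry above uses only the MT system itself; a minimal sketch in that glass-box spirit, using the length-normalized log-probability of the decoded output as a confidence proxy (the paper's exact features may differ):

```python
# Sketch of unsupervised QE from the MT system's own output
# distribution: average token log-probability as a confidence proxy,
# requiring no annotated data or external resources.

def sentence_confidence(token_logprobs: list[float]) -> float:
    """Length-normalized log-probability of the decoded translation."""
    return sum(token_logprobs) / max(len(token_logprobs), 1)

# Higher (closer to 0) means the system is more confident.
print(sentence_confidence([-0.1, -0.3, -0.05]))  # -0.15
```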