Discourse Cohesion Evaluation for Document-Level Neural Machine Translation
- URL: http://arxiv.org/abs/2208.09118v1
- Date: Fri, 19 Aug 2022 01:56:00 GMT
- Title: Discourse Cohesion Evaluation for Document-Level Neural Machine Translation
- Authors: Xin Tan and Longyin Zhang and Guodong Zhou
- Abstract summary: It is well known that translations generated by an excellent document-level neural machine translation (NMT) model are consistent and coherent.
Existing sentence-level evaluation metrics like BLEU can hardly reflect the model's performance at the document level.
We propose a new test suite that considers four cohesive manners to measure the cohesiveness of document translations.
- Score: 36.96887050831173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well known that translations generated by an excellent document-level
neural machine translation (NMT) model are consistent and coherent. However,
existing sentence-level evaluation metrics like BLEU can hardly reflect the
model's performance at the document level. To tackle this issue, we propose a
Discourse Cohesion Evaluation Method (DCoEM) in this paper and contribute a new
test suite that considers four cohesive manners (reference, conjunction,
substitution, and lexical cohesion) to measure the cohesiveness of document
translations. The evaluation results on recent document-level NMT systems show
that our method is practical and essential in estimating translations at the
document level.
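To make the idea concrete, here is a minimal sketch of counting cohesive devices in a translated document, in the spirit of DCoEM's four cohesive manners. The marker lists, the lexical-cohesion proxy, and the per-sentence normalization are illustrative assumptions, not the paper's actual test suite.

```python
# Minimal sketch of scoring a document translation by counting cohesive
# devices, in the spirit of DCoEM's four cohesive manners. The word lists
# and the normalization below are illustrative placeholders, not the
# paper's actual test suite.
from collections import Counter

# Tiny illustrative marker lists for three explicit cohesive devices.
REFERENCE_MARKERS = {"he", "she", "it", "they", "this", "that", "these", "those"}
CONJUNCTION_MARKERS = {"however", "therefore", "moreover", "but", "and", "so"}
SUBSTITUTION_MARKERS = {"one", "ones", "do", "does", "did", "so"}

def cohesion_profile(sentences):
    """Count cohesive devices per sentence in a translated document."""
    counts = Counter()
    content_words = []
    for sent in sentences:
        tokens = [t.lower().strip(".,;!?") for t in sent.split()]
        counts["reference"] += sum(t in REFERENCE_MARKERS for t in tokens)
        counts["conjunction"] += sum(t in CONJUNCTION_MARKERS for t in tokens)
        counts["substitution"] += sum(t in SUBSTITUTION_MARKERS for t in tokens)
        content_words.append({t for t in tokens if len(t) > 3})
    # Lexical cohesion: crude proxy, content words repeated across sentences.
    for prev, curr in zip(content_words, content_words[1:]):
        counts["lexical"] += len(prev & curr)
    n = max(len(sentences), 1)
    return {device: c / n for device, c in counts.items()}

doc = ["The committee approved the plan.",
       "However, it postponed the funding for the plan."]
print(cohesion_profile(doc))
```

A real test suite would pair each cohesive device with targeted test items and reference judgments rather than raw marker counts.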
Related papers
- Evaluating Optimal Reference Translations [4.956416618428049]
We propose a methodology for creating more reliable document-level human reference translations.
We evaluate the obtained document-level optimal reference translations in comparison with "standard" ones.
arXiv Detail & Related papers (2023-11-28T13:50:50Z)
- Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment [20.63045120292095]
Cross-lingual Machine Translation (MT) quality estimation plays a crucial role in evaluating translation performance.
GEMBA, the first MT quality assessment metric based on Large Language Models (LLMs), employs one-step prompting to achieve state-of-the-art (SOTA) performance in system-level MT quality estimation.
In this paper, we introduce the Knowledge-Prompted Estimator (KPE), a chain-of-thought (CoT) prompting method that combines three one-step prompting techniques: perplexity, token-level similarity, and sentence-level similarity.
arXiv Detail & Related papers (2023-06-13T01:18:32Z)
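As a rough illustration of the KPE recipe above, the sketch below computes stand-ins for the three signals and folds them into one chain-of-thought prompt; the feature definitions, the `ppl` placeholder, and the prompt template are assumptions for demonstration, not the paper's exact formulation.

```python
# Illustrative sketch of a KPE-style knowledge prompt: three signals
# (perplexity, token-level similarity, sentence-level similarity) are
# computed separately and then folded into one chain-of-thought prompt.
import math
from collections import Counter

def token_f1(hyp, ref):
    """Token-level similarity as F1 overlap between token multisets."""
    h, r = Counter(hyp.lower().split()), Counter(ref.lower().split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    p, rec = overlap / sum(h.values()), overlap / sum(r.values())
    return 2 * p * rec / (p + rec)

def cosine_bow(a, b):
    """Sentence-level similarity as cosine over bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def knowledge_prompt(src, hyp, ref, ppl):
    """Fold the three signals into a single CoT-style scoring prompt."""
    return (
        f"Source: {src}\nTranslation: {hyp}\n"
        f"Perplexity of translation: {ppl:.1f}\n"
        f"Token-level similarity to reference: {token_f1(hyp, ref):.2f}\n"
        f"Sentence-level similarity to reference: {cosine_bow(hyp, ref):.2f}\n"
        "Reason step by step, then rate the translation quality from 0-100."
    )

# `ppl` would come from a language model in practice; 12.3 is a dummy value.
print(knowledge_prompt("Das Haus ist alt.", "The house is old.",
                       "The house is old.", ppl=12.3))
```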
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words in the context may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
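A toy sketch of the selective-context idea described above: context token states are scored against the current sentence and low-scoring ones are dropped layer by layer. The dot-product scorer and the `keep_ratio` parameter are illustrative assumptions, not HanoiT's actual selection mechanism.

```python
# Minimal sketch of selective context: at each encoder layer, context
# tokens are scored against the current-sentence representation and the
# lowest-scoring ones are dropped before the next layer sees them.
import numpy as np

def select_context(sent_states, ctx_states, keep_ratio=0.5):
    """Keep the context vectors most similar to the mean sentence state."""
    query = sent_states.mean(axis=0)   # (d,) summary of the current sentence
    scores = ctx_states @ query        # (n_ctx,) relevance scores
    k = max(1, int(len(ctx_states) * keep_ratio))
    keep = np.argsort(scores)[-k:]     # indices of the top-k context tokens
    return ctx_states[np.sort(keep)]

rng = np.random.default_rng(0)
sent = rng.normal(size=(6, 16))        # current-sentence token states
ctx = rng.normal(size=(20, 16))        # document-context token states
for layer in range(3):                 # sift the context layer by layer
    ctx = select_context(sent, ctx)
    print(f"layer {layer}: {len(ctx)} context tokens kept")
```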
- A Comparison of Approaches to Document-level Machine Translation [34.2276281264886]
This paper presents a systematic comparison of selected approaches to document-level machine translation, evaluated on document-level phenomena test suites.
We find that a simple method based purely on back-translating monolingual document-level data performs as well as much more elaborate alternatives.
arXiv Detail & Related papers (2021-01-26T19:21:09Z)
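The simple baseline that this comparison found competitive can be sketched as follows; the `reverse_translate` stub is a hypothetical stand-in for a trained target-to-source model.

```python
# Sketch of document-level back-translation: monolingual target-language
# documents are translated back to the source language with a reverse
# model, yielding synthetic document-level training pairs.
def reverse_translate(sentence):
    """Placeholder for a trained target-to-source NMT model."""
    return f"<bt> {sentence}"  # a real model would return source-language text

def back_translate_documents(monolingual_docs):
    """Yield (synthetic_source_doc, real_target_doc) training pairs."""
    for doc in monolingual_docs:
        synthetic_source = [reverse_translate(s) for s in doc]
        yield synthetic_source, doc

docs = [["The cat sat.", "It purred."]]
for src_doc, tgt_doc in back_translate_documents(docs):
    print(src_doc, "->", tgt_doc)
```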
- Diving Deep into Context-Aware Neural Machine Translation [36.17847243492193]
This paper analyzes the performance of document-level NMT models on four diverse domains.
We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks.
arXiv Detail & Related papers (2020-10-19T13:23:12Z)
- Rethinking Document-level Neural Machine Translation [73.42052953710605]
We try to answer the question: Is the capacity of current models strong enough for document-level translation?
We observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even for documents as long as 2,000 words.
arXiv Detail & Related papers (2020-10-18T11:18:29Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT enhances the Transformer baseline by introducing both global and local document-level clues on the source side.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
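As a toy illustration of injecting a global document-level clue on the source side, the sketch below adds a mean-pooled document embedding to every token embedding before encoding; the pooling choice is an assumption, since the paper explores multiple forms of document embeddings.

```python
# Toy sketch of a global document-level clue on the source side: a
# document embedding (here, the mean of its sentence embeddings) is added
# to every token embedding before the encoder runs.
import numpy as np

def add_document_clue(token_embs_per_sent):
    """Add a mean-pooled document embedding to each token embedding."""
    doc_emb = np.mean([s.mean(axis=0) for s in token_embs_per_sent], axis=0)
    return [s + doc_emb for s in token_embs_per_sent]

rng = np.random.default_rng(0)
doc = [rng.normal(size=(5, 8)), rng.normal(size=(7, 8))]  # two sentences
enriched = add_document_clue(doc)
print(enriched[0].shape, enriched[1].shape)  # shapes unchanged: (5, 8) (7, 8)
```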
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework that models cross-sentence dependencies by training a neural machine translation (NMT) model to predict both the target translation and the sentences surrounding a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
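A minimal sketch of that multi-task objective: alongside the translation loss, auxiliary losses push the model to predict the surrounding sentences, so the encoder must capture cross-sentence dependencies. The `nll` stand-in and the `alpha` weight are illustrative assumptions, not the paper's exact losses.

```python
# Sketch of a multi-task training objective: translation loss plus
# auxiliary losses on predicting the sentences around the source sentence.
def nll(predicted, gold):
    """Placeholder negative log-likelihood between two token lists."""
    mismatches = sum(p != g for p, g in zip(predicted, gold))
    return float(mismatches + abs(len(predicted) - len(gold)))

def joint_loss(model_outputs, target, prev_sent, next_sent, alpha=0.5):
    """Translation loss plus weighted losses on neighboring sentences."""
    translation, prev_pred, next_pred = model_outputs
    aux = nll(prev_pred, prev_sent) + nll(next_pred, next_sent)
    return nll(translation, target) + alpha * aux

outputs = (["the", "house"], ["it", "was"], ["we", "left"])
print(joint_loss(outputs, ["the", "house"], ["it", "was"], ["we", "left"]))
```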
- Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)