Incorporating Terminology Constraints in Automatic Post-Editing
- URL: http://arxiv.org/abs/2010.09608v1
- Date: Mon, 19 Oct 2020 15:44:03 GMT
- Title: Incorporating Terminology Constraints in Automatic Post-Editing
- Authors: David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat and Kathleen McKeown
- Abstract summary: We present both autoregressive and non-autoregressive models for lexically constrained APE.
Our approach enables preservation of 95% of the terminologies and also improves translation quality on English-German benchmarks.
- Score: 23.304864678067865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Users of machine translation (MT) may want to ensure the use of specific
lexical terminologies. While techniques exist for incorporating terminology
constraints during inference for MT, current automatic post-editing (APE)
approaches cannot ensure that these terms will appear in the final
translation. In this paper, we
present both autoregressive and non-autoregressive models for lexically
constrained APE, demonstrating that our approach enables preservation of 95% of
the terminologies and also improves translation quality on English-German
benchmarks. Even when applied to lexically constrained MT output, our approach
is able to improve preservation of the terminologies. However, we show that our
models do not learn to copy constraints systematically and suggest a simple
data augmentation technique that leads to improved performance and robustness.
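To make the copying behavior concrete, the sketch below shows the kind of constraint-annotated training input such a data augmentation step might produce. This is a minimal illustration: the separator tokens, function name, and input layout are assumptions, not the paper's exact format.

```python
def augment_with_constraints(source, mt_output, constraints):
    """Append terminology constraints to the APE input so the model is
    explicitly exposed to the target terms it must preserve.

    Hypothetical sketch: the separator tokens and layout are assumptions,
    not the exact format used in the paper.
    """
    # Constraints are (source_term, target_term) pairs from a terminology lexicon.
    constraint_block = " ".join(
        f"<sep> {src} <trans> {tgt}" for src, tgt in constraints
    )
    # APE input: source sentence, draft MT output, then the constraint block.
    return f"{source} <mt> {mt_output} {constraint_block}"


print(augment_with_constraints(
    "Click the save button.",
    "Klicken Sie auf den Knopf.",
    [("save button", "Schaltfläche Speichern")],
))
```

Preservation can then be measured as the fraction of target terms that actually appear in the post-edited output.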
Related papers
- Efficient Terminology Integration for LLM-based Translation in Specialized Domains [0.0]
In specialized fields such as patent, finance, or biomedical domains, terminology is crucial for translation.
We introduce a methodology that efficiently trains models with a smaller amount of data while preserving the accuracy of terminology translation.
This methodology enhances the model's ability to handle specialized terminology and ensures high-quality translations.
arXiv Detail & Related papers (2024-10-21T07:01:25Z)
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
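As a rough illustration of fine-tuning under an attribute constraint, the sketch below adds a hinge-style penalty to the language-modeling loss whenever an external toxicity classifier's scores exceed a threshold. The penalty form, threshold, and weight are assumptions; the paper's constraint learning schema may differ.

```python
import torch

def constrained_loss(lm_loss, toxicity_scores, epsilon=0.1, lam=1.0):
    """Language-modeling loss plus an attribute-constraint penalty.

    Hypothetical sketch: penalize the model only when the batch's mean
    toxicity score (from an external classifier) exceeds epsilon.
    """
    violation = torch.relu(toxicity_scores.mean() - epsilon)
    return lm_loss + lam * violation
```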
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
- Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation [0.0]
This paper addresses the challenge of accurately translating technical terms, which are crucial for clear communication in specialized fields.
We introduce the Parenthetical Terminology Translation (PTT) task, designed to mitigate potential inaccuracies by displaying the original term in parentheses alongside its translation.
We developed a novel evaluation metric to assess both overall translation accuracy and the correct parenthetical presentation of terms.
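As a hedged sketch, a check for "correct parenthetical presentation" could look like the following; the regex and the example are illustrative assumptions, not the paper's actual metric.

```python
import re

def parenthetical_term_ok(translation, source_term, target_term):
    """Return True if the target term is followed by the original source
    term in parentheses, e.g. "Schaltfläche (button)".

    Hypothetical check; the paper's metric may be defined differently.
    """
    pattern = re.escape(target_term) + r"\s*\(\s*" + re.escape(source_term) + r"\s*\)"
    return re.search(pattern, translation) is not None

assert parenthetical_term_ok(
    "Drücken Sie die Schaltfläche (button).", "button", "Schaltfläche"
)
```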
arXiv Detail & Related papers (2024-10-01T13:40:28Z)
- An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords.
We show that vocabulary trimming fails to improve performance and is even prone to incurring heavy degradation.
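For intuition, the trimming operation itself can be sketched as below. This is illustrative only: real BPE implementations re-segment with the remaining merge operations rather than falling back to characters.

```python
def trim_vocab(tokens, vocab_counts, min_freq=10):
    """Replace rare subwords with their component subwords.

    Hypothetical sketch of vocabulary trimming: any subword seen fewer
    than min_freq times is decomposed into smaller units (here, naively,
    individual characters) that remain in the vocabulary.
    """
    trimmed = []
    for tok in tokens:
        if vocab_counts.get(tok, 0) >= min_freq:
            trimmed.append(tok)
        else:
            trimmed.extend(tok)  # fall back to character pieces
    return trimmed
```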
arXiv Detail & Related papers (2024-03-30T15:29:49Z)
- Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting [11.264272119913311]
We submit to the WMT 2023 terminology translation task.
We adopt a translate-then-refine approach which can be domain-independent and requires minimal manual effort.
Results show that our terminology-aware model learns to incorporate terminologies effectively.
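A hedged sketch of the refine step in such a translate-then-refine pipeline is shown below; the prompt wording and the callable interface are assumptions, not the actual WMT 2023 submission.

```python
def refine_with_terminology(llm, draft, terminology):
    """Second pass of a translate-then-refine pipeline: ask the model to
    rewrite a draft translation so the required terms are used.

    `llm` is any text-in/text-out callable; the prompt wording is a
    hypothetical example.
    """
    term_list = "; ".join(f"{src} -> {tgt}" for src, tgt in terminology)
    prompt = (
        f"Revise this translation so it uses the required terms "
        f"({term_list}) while preserving the meaning:\n{draft}"
    )
    return llm(prompt)
```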
arXiv Detail & Related papers (2023-10-09T16:08:23Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation [91.57514888410205]
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting.
LLMs can struggle to translate inputs with rare words, which are common in low resource or domain transfer scenarios.
We show that LLM prompting can provide an effective solution for rare words as well, by using prior knowledge from bilingual dictionaries to provide control hints in the prompts.
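A minimal sketch of dictionary-hinted prompting is given below; the prompt template is an illustrative assumption, not the paper's exact phrase-level format.

```python
def build_hinted_prompt(sentence, dictionary, src_lang="English", tgt_lang="German"):
    """Attach bilingual-dictionary hints for rare words to an MT prompt.

    Hypothetical prompt layout; the paper's template may differ.
    """
    hints = " ".join(
        f'"{word}" can be translated as "{dictionary[word]}".'
        for word in sentence.split()
        if word in dictionary
    )
    return (
        f"Translate from {src_lang} to {tgt_lang}. {hints}\n"
        f"{src_lang}: {sentence}\n{tgt_lang}:"
    )

print(build_hinted_prompt("The sluice gate jammed.", {"sluice": "Schleuse"}))
```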
arXiv Detail & Related papers (2023-02-15T18:46:42Z)
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
- Rule-based Morphological Inflection Improves Neural Terminology Translation [16.802947102163497]
We introduce a modular framework for incorporating lemma constraints in neural MT (NMT).
It is based on a novel cross-lingual inflection module that inflects the target lemma constraints based on the source context.
Results show that our rule-based inflection module helps NMT models incorporate lemma constraints more accurately than a neural module and outperforms the existing end-to-end approach with lower training costs.
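As a toy illustration of rule-based inflection driven by source context (the rule, the forms table, and the example are assumptions; the actual module uses much richer morphological rules):

```python
def inflect_lemma(lemma_forms, source_token):
    """Pick an inflected target form for a lemma constraint based on a
    feature of the aligned source word.

    Toy rule: if the English source word looks plural ('-s'), choose the
    plural target form; otherwise the singular.
    """
    key = "plural" if source_token.lower().endswith("s") else "singular"
    return lemma_forms[key]

forms = {"singular": "Schraube", "plural": "Schrauben"}
print(inflect_lemma(forms, "screws"))  # -> Schrauben
```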
arXiv Detail & Related papers (2021-09-10T02:06:48Z)
- A Probabilistic Formulation of Unsupervised Text Style Transfer [128.80213211598752]
We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques.
By hypothesizing a parallel latent sequence that generates each observed sequence, our model learns to transform sequences from one domain to another in a completely unsupervised fashion.
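In rough terms, the latent-sequence idea can be written as follows (a hedged reconstruction from the summary, not the paper's exact notation):

```latex
% For an observed sentence x in one domain, hypothesize a latent parallel
% sentence y in the other domain and maximize the marginal likelihood,
% with p(x | y) a transduction model and p(y) a language-model prior:
\log p(x) \;=\; \log \sum_{y} p(x \mid y)\, p(y)
```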
arXiv Detail & Related papers (2020-02-10T16:20:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.