A new approach to calculating BERTScore for automatic assessment of
translation quality
- URL: http://arxiv.org/abs/2203.05598v2
- Date: Mon, 14 Mar 2022 11:34:35 GMT
- Title: A new approach to calculating BERTScore for automatic assessment of
translation quality
- Authors: A.A. Vetrov and E.A. Gorn
- Abstract summary: The study focuses on the applicability of the BERTScore metric to translation quality assessment at the sentence level for the English -> Russian direction.
Experiments were performed with a pre-trained Multilingual BERT as well as with a pair of Monolingual BERT models.
It was demonstrated that aligning the monolingual embeddings with an orthogonal transformation based on anchor tokens helps to prevent the mismatching issue, and that this approach gives better results than using embeddings of the Multilingual model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A study of the applicability of the BERTScore metric to translation
quality assessment was conducted at the sentence level for the English -> Russian
direction. Experiments were performed with a pre-trained Multilingual BERT as
well as with a pair of Monolingual BERT models. To align the monolingual
embeddings, an orthogonal transformation based on anchor tokens was used. It
was demonstrated that this transformation helps to prevent the mismatching issue,
and that this approach gives better results than using embeddings of the
Multilingual model. To improve the token matching process, it is proposed to
combine all incomplete WordPiece tokens into meaningful words, use simple
averaging of the corresponding vectors, and calculate BERTScore based on anchor
tokens only. These modifications achieved a better correlation of
the model predictions with human judgments. In addition to machine
translation, several versions of human translation were evaluated as well, and the
problems of this approach are listed.
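The steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `##` continuation-prefix convention, and the use of the standard orthogonal Procrustes solution and greedy-matching F1 are assumptions made for illustration, and the restriction of scoring to anchor tokens is not shown.

```python
import numpy as np


def fit_orthogonal_map(src_anchors, tgt_anchors):
    """Orthogonal W minimizing ||src_anchors @ W - tgt_anchors||_F.

    Rows are embeddings of paired anchor tokens (assumed to be given).
    This is the classical orthogonal Procrustes solution via SVD.
    """
    u, _, vt = np.linalg.svd(src_anchors.T @ tgt_anchors)
    return u @ vt


def merge_wordpieces(tokens, vectors):
    """Join '##' continuation pieces into words and average their vectors."""
    words, groups = [], []
    for tok, vec in zip(tokens, vectors):
        if tok.startswith("##") and words:
            words[-1] += tok[2:]      # extend the current word
            groups[-1].append(vec)    # collect its piece vectors
        else:
            words.append(tok)
            groups.append([vec])
    return words, np.stack([np.mean(g, axis=0) for g in groups])


def bertscore_f1(cand, ref):
    """Greedy-matching BERTScore F1 from cosine similarities."""
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    sim = cand @ ref.T
    precision = sim.max(axis=1).mean()  # best match for each candidate word
    recall = sim.max(axis=0).mean()     # best match for each reference word
    return 2 * precision * recall / (precision + recall)
```

In use, vectors from one monolingual model would be multiplied by the fitted `W` before being compared with vectors from the other model, for example `bertscore_f1(src_vectors @ W, tgt_vectors)`, where both arrays are hypothetical word-level embeddings produced by `merge_wordpieces`.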
Related papers
- A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross-Lingual Sentence Representations [0.4499833362998489]
This study focuses on the case of English-Marathi language pairs, where existing datasets are notably noisy.
To mitigate the impact of data quality issues, we propose a data filtering approach based on cross-lingual sentence representations.
Results demonstrate a significant improvement in translation quality over the baseline post-filtering with IndicSBERT.
arXiv Detail & Related papers (2024-09-04T13:49:45Z)
- BiVert: Bidirectional Vocabulary Evaluation using Relations for Machine Translation [4.651581292181871]
We propose a bidirectional semantic-based evaluation method designed to assess the sense distance of the translation from the source text.
This approach employs the comprehensive multilingual encyclopedic dictionary BabelNet.
Factual analysis shows a strong correlation between the average evaluation scores generated by our method and the human assessments across various machine translation systems for the English-German language pair.
arXiv Detail & Related papers (2024-03-06T08:02:21Z)
- Comparison of Pre-trained Language Models for Turkish Address Parsing [0.0]
We focus on Turkish maps data and thoroughly evaluate both multilingual and Turkish based BERT, DistilBERT, ELECTRA and RoBERTa.
We also propose a MultiLayer Perceptron (MLP) for fine-tuning BERT in addition to the standard approach of one-layer fine-tuning.
arXiv Detail & Related papers (2023-06-24T12:09:43Z)
- T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification [50.675552118811]
Cross-lingual text classification is typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
arXiv Detail & Related papers (2023-06-08T07:33:22Z)
- Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models [55.35106713257871]
We introduce DecoMT, a novel approach of few-shot prompting that decomposes the translation process into a sequence of word chunk translations.
We show that DecoMT outperforms the strong few-shot prompting BLOOM model with an average improvement of 8 chrF++ scores across the examined languages.
arXiv Detail & Related papers (2023-05-22T14:52:47Z)
- VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, the sequence-to-sequence alignment is induced to maximize the similarity of the parallel pairs and minimize that of the non-parallel pairs.
Token-to-token alignment is integrated to bridge the gap between synonymous tokens excavated via the thesaurus dictionary from the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages [6.049660810617423]
XLMRScore is a cross-lingual counterpart of BERTScore computed via the XLM-RoBERTa (XLMR) model.
We evaluate the proposed method on four low-resource language pairs of the WMT21 QE shared task.
arXiv Detail & Related papers (2022-07-31T16:23:23Z)
- NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures [42.46681912294797]
We analyze translation-based similarity measures in the common framework of multilingual NMT.
Compared to baselines such as sentence embeddings, translation-based measures prove competitive in paraphrase identification.
Measures show a relatively high correlation to human judgments.
arXiv Detail & Related papers (2022-04-28T17:57:17Z)
- Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond [58.80417796087894]
Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly consists of two lines of work: the zero-shot approach and the translation-based approach.
We propose a novel framework to consolidate the zero-shot approach and the translation-based approach for better adaptation performance.
arXiv Detail & Related papers (2020-10-23T13:47:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.