Incorporate Semantic Structures into Machine Translation Evaluation via UCCA
- URL: http://arxiv.org/abs/2010.08728v2
- Date: Thu, 22 Oct 2020 03:38:19 GMT
- Title: Incorporate Semantic Structures into Machine Translation Evaluation via UCCA
- Authors: Jin Xu, Yinuo Guo, Junfeng Hu
- Abstract summary: We define words carrying important semantic meanings in sentences as semantic core words.
We propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS)
- Score: 9.064153799336536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The copying mechanism has been commonly used in neural paraphrasing
networks and other text generation tasks, in which some important words in the
input sequence are preserved in the output sequence. Similarly, in machine
translation, we notice that there are certain words or phrases appearing in all
good translations of one source text, and these words tend to convey important
semantic information. Therefore, in this work, we define words carrying
important semantic meanings in sentences as semantic core words. Moreover, we
propose an MT evaluation approach named Semantically Weighted Sentence
Similarity (SWSS). It leverages the power of UCCA to identify semantic core
words, and then calculates sentence similarity scores on the overlap of
semantic core words. Experimental results show that SWSS can consistently
improve the performance of popular MT evaluation metrics which are based on
lexical similarity.
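As a rough illustration of the scoring step, the sketch below computes an F1-style overlap restricted to semantic core words and interpolates it with a base lexical metric. This is a hypothetical reconstruction, not the authors' implementation: the UCCA-based identification of core words is assumed to happen upstream (e.g. via a UCCA parser), and the function names and the linear interpolation weight `alpha` are illustrative assumptions.

```python
def core_word_overlap(hyp_tokens, ref_tokens, core_words):
    """F1-style overlap between hypothesis and reference, restricted to
    the given set of semantic core words (illustrative)."""
    hyp_core = {t for t in hyp_tokens if t in core_words}
    ref_core = {t for t in ref_tokens if t in core_words}
    if not hyp_core or not ref_core:
        return 0.0
    matched = len(hyp_core & ref_core)
    precision = matched / len(hyp_core)
    recall = matched / len(ref_core)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def swss(hyp_tokens, ref_tokens, core_words, base_score, alpha=0.5):
    """Interpolate a base lexical-similarity score with the core-word
    overlap; the interpolation scheme here is a simplifying assumption."""
    overlap = core_word_overlap(hyp_tokens, ref_tokens, core_words)
    return alpha * base_score + (1 - alpha) * overlap
```

For example, two translations that disagree only on function words but share all core words would receive a core-word overlap of 1.0, pulling the combined score above the purely lexical one.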
Related papers
- Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective [50.261681681643076]
We propose a novel metric called SemVarEffect and a benchmark named SemVarBench to evaluate the causality between semantic variations in inputs and outputs in text-to-image synthesis.
Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.
arXiv Detail & Related papers (2024-10-14T08:45:35Z)
- Predicting Word Similarity in Context with Referential Translation Machines [0.0]
We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP).
We use referential translation machines (RTMs), which allow a common representation of training and test sets.
RTMs can achieve the top results in Graded Word Similarity in Context (GWSC) task.
arXiv Detail & Related papers (2024-07-07T09:36:41Z)
- Evaluation of Machine Translation Based on Semantic Dependencies and Keywords [7.240399904675839]
This paper proposes a computational method for evaluating the semantic correctness of machine translations based on reference translations.
It uses the language technology platform developed by the Social Computing and Information Retrieval Research Center of Harbin Institute of Technology.
arXiv Detail & Related papers (2024-04-20T04:14:28Z)
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multiple concepts for multilingual semantic matching, liberating the model from its reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to better state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
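The element-wise feature described above can be sketched as follows. This is a minimal sketch under assumptions: the sentence embeddings for the text and hypothesis are taken as given, the downstream classifier is omitted, and the function name is hypothetical.

```python
import numpy as np

def manhattan_feature(u, v):
    """Element-wise absolute-difference vector between two sentence
    embeddings; each component |u_i - v_i| serves as one feature for a
    downstream entailment classifier."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.abs(u - v)
```

Summing the components of this feature vector recovers the scalar Manhattan (L1) distance, but keeping them separate lets a classifier weight each embedding dimension independently.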
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the proposed neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
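Comparing the predicted distributions could look roughly like the sketch below, which averages a KL divergence over paired per-position distributions. This is a toy stand-in under assumptions: the masked-LM predictions are taken as given inputs, KL divergence is assumed as the divergence measure, and the function name is illustrative.

```python
import numpy as np

def distribution_divergence(dists_a, dists_b):
    """Average KL divergence between paired per-position probability
    distributions (e.g. masked-LM predictions at aligned positions).
    Larger values indicate larger contextual/semantic difference."""
    eps = 1e-12  # avoid log(0) for zero-probability entries
    total = 0.0
    for p, q in zip(dists_a, dists_b):
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        total += float(np.sum(p * np.log(p / q)))
    return total / len(dists_a)
```

Identical predictions at every shared position yield a divergence near zero, while a meaning-changing edit shifts the predicted distributions and drives the score up.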
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for M-SE from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- SST-BERT at SemEval-2020 Task 1: Semantic Shift Tracing by Clustering in BERT-based Embedding Spaces [63.17308641484404]
We propose to identify clusters among different occurrences of each target word, considering these as representatives of different word meanings.
Disagreements in the obtained clusters naturally allow us to quantify the level of semantic shift for each target word in four target languages.
Our approach performs well both measured separately (per language) and overall, where we surpass all provided SemEval baselines.
arXiv Detail & Related papers (2020-10-02T08:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.