Predicting Word Similarity in Context with Referential Translation Machines
- URL: http://arxiv.org/abs/2407.06230v1
- Date: Sun, 7 Jul 2024 09:36:41 GMT
- Title: Predicting Word Similarity in Context with Referential Translation Machines
- Authors: Ergun Biçici
- Abstract summary: We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP).
We use referential translation machines (RTMs), which allow a common representation of training and test sets.
RTMs can achieve the top results in the Graded Word Similarity in Context (GWSC) task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP) between the words given the context and the distance between their similarities. We use referential translation machines (RTMs), which allow a common representation for training and test sets and stacked machine learning models. RTMs can achieve the top results in the Graded Word Similarity in Context (GWSC) task.
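A minimal sketch of the idea behind casting word similarity as translation performance prediction: extract simple features measuring how well one word's context "translates" into the other's, then feed them to a learned regressor. The feature set and function names here are hypothetical illustrations, not the paper's actual RTM features.

```python
from collections import Counter
import math

def bow_cosine(a, b):
    """Cosine similarity between bag-of-words vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def mtpp_features(ctx1, ctx2):
    """Toy MTPP-style features: treat one context as 'source' and the
    other as its 'translation', then score how well they match."""
    overlap = len(set(ctx1) & set(ctx2))
    return {
        "cosine": bow_cosine(ctx1, ctx2),
        "overlap_ratio": overlap / max(len(set(ctx1) | set(ctx2)), 1),
        "length_ratio": min(len(ctx1), len(ctx2)) / max(len(ctx1), len(ctx2), 1),
    }

feats = mtpp_features("the bank of the river".split(),
                      "the bank approved the loan".split())
print(feats)
```

In the RTM setting, features like these would be inputs to a stacked regression model trained to predict a graded similarity score.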
Related papers
- Identifying Intensity of the Structure and Content in Tweets and the Discriminative Power of Attributes in Context with Referential Translation Machines [0.0]
We use referential translation machines (RTMs) to identify the similarity between an attribute and two words in English.
RTMs are also used to predict the intensity of the structure and content in tweets in English, Arabic, and Spanish.
arXiv Detail & Related papers (2024-07-06T18:58:10Z) - Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
arXiv Detail & Related papers (2022-12-06T17:10:17Z) - Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experimental results show that retrofitting multilingual sentence embeddings with AMR leads to improved state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z) - Improving Contextual Representation with Gloss Regularized Pre-training [9.589252392388758]
We propose adding an auxiliary gloss regularizer module to BERT pre-training (GR-BERT) to enhance word semantic similarity.
By predicting masked words and aligning contextual embeddings to corresponding glosses simultaneously, the word similarity can be explicitly modeled.
Experimental results show that the gloss regularizer benefits BERT in word-level and sentence-level semantic representation.
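A hedged sketch of what a gloss-regularized objective could look like: the usual masked-LM loss plus a penalty pulling a word's contextual embedding toward the embedding of its dictionary gloss. The function name, weighting scheme, and toy vectors are illustrative assumptions, not GR-BERT's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def gloss_regularized_loss(mlm_loss, word_vec, gloss_vec, lam=0.1):
    """Hypothetical GR-BERT-style objective: masked-LM loss plus a
    cosine-distance penalty aligning a word's contextual embedding
    with its gloss embedding."""
    alignment_penalty = 1.0 - cosine(word_vec, gloss_vec)
    return mlm_loss + lam * alignment_penalty

# identical word and gloss vectors incur no penalty
loss = gloss_regularized_loss(2.0, [1.0, 0.0], [1.0, 0.0])
print(loss)
```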
arXiv Detail & Related papers (2022-05-13T12:50:32Z) - Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning [29.462788855992617]
We describe the sentence distance as the weighted sum of contextualized token distances on the basis of a transportation problem.
We then present the optimal transport-based distance measure, named RCMD; it identifies and leverages semantically-aligned token pairs.
In the end, we propose CLRCMD, a contrastive learning framework that optimizes the RCMD of sentence pairs.
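The transport-based idea can be sketched with a relaxed variant (in the spirit of relaxed word mover's distance, not the paper's exact RCMD formulation): each source token's mass flows entirely to its nearest target token, giving a cheap lower bound on the exact transport cost. The toy 2-d "embeddings" below are illustrative assumptions.

```python
import math

def euclid(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def relaxed_transport_distance(src_vecs, tgt_vecs, src_weights=None):
    """Relaxed optimal-transport sentence distance: route each source
    token's full mass to its closest target token and sum the
    weighted token distances."""
    n = len(src_vecs)
    if src_weights is None:
        src_weights = [1.0 / n] * n  # uniform token weights
    return sum(w * min(euclid(s, t) for t in tgt_vecs)
               for s, w in zip(src_vecs, src_weights))

# toy 2-d "token embeddings" for two short sentences
s1 = [(0.0, 0.0), (1.0, 0.0)]
s2 = [(0.0, 0.1), (1.0, 0.0), (5.0, 5.0)]
d = relaxed_transport_distance(s1, s2)
print(d)
```

The exact transport problem would additionally constrain how much mass each target token can receive, which is what makes the resulting token alignments interpretable.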
arXiv Detail & Related papers (2022-02-26T17:28:02Z) - Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performance comparable to that achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z) - Measuring and Increasing Context Usage in Context-Aware Machine Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
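The metric can be sketched as follows: score the same targets with the model twice, with and without the extra context, and average the gain in log-likelihood. The function name and the per-sentence log-probabilities below are hypothetical stand-ins for real model outputs.

```python
def conditional_cross_mutual_info(logp_with_ctx, logp_without_ctx):
    """CXMI-style score: average gain in target log-likelihood when the
    model is given extra context. Inputs are per-sentence log-probs
    from two runs of the same model (with and without context)."""
    assert len(logp_with_ctx) == len(logp_without_ctx)
    gains = [w - wo for w, wo in zip(logp_with_ctx, logp_without_ctx)]
    return sum(gains) / len(gains)

# hypothetical per-sentence log-probabilities from two scoring runs
with_ctx = [-10.2, -8.1, -12.0]
without_ctx = [-11.0, -8.1, -13.5]
cxmi = conditional_cross_mutual_info(with_ctx, without_ctx)
print(cxmi)
```

A score near zero would suggest the model ignores its context; context-aware word dropout raises it by masking source words during training so the model is forced to lean on the context instead.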
arXiv Detail & Related papers (2021-05-07T19:55:35Z) - SemMT: A Semantic-based Testing Approach for Machine Translation Systems [11.166336490280749]
We propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking.
SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences.
We show SemMT can achieve higher effectiveness compared with state-of-the-art works.
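The round-trip testing idea can be sketched as a simple metamorphic check: translate a sentence out and back, then flag a potential translation bug when the round-tripped text drifts semantically from the original. The stub translators, the Jaccard similarity, and the threshold below are illustrative assumptions, not SemMT's actual components.

```python
def round_trip_test(sentence, translate_fwd, translate_back,
                    similarity, threshold=0.8):
    """SemMT-style metamorphic test: round-trip translate a sentence
    and flag it as suspicious when the semantic similarity to the
    original falls below a threshold."""
    round_tripped = translate_back(translate_fwd(sentence))
    score = similarity(sentence, round_tripped)
    return {"round_tripped": round_tripped,
            "similarity": score,
            "suspicious": score < threshold}

# stub "translators" and a token-overlap similarity, for illustration only
fwd = lambda s: s.upper()   # stand-in for a real MT system
back = lambda s: s.lower()
def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

result = round_trip_test("the cat sat on the mat", fwd, back, jaccard)
print(result)
```

SemMT's contribution is in the similarity function itself, which compares sentences at the semantic level rather than by surface overlap as the toy Jaccard measure does here.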
arXiv Detail & Related papers (2020-12-03T10:42:56Z) - Incorporate Semantic Structures into Machine Translation Evaluation via UCCA [9.064153799336536]
We define words carrying important semantic meanings in sentences as semantic core words.
We propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS).
arXiv Detail & Related papers (2020-10-17T06:47:58Z) - Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns [50.245845110446496]
We investigate the effect of future sentences as context by comparing the performance of a contextual NMT model trained with the future context to the one trained with the past context.
Our experiments and evaluation, using generic and pronoun-focused automatic metrics, show that the use of future context achieves significant improvements over the context-agnostic Transformer.
arXiv Detail & Related papers (2020-04-21T10:45:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.