Retrofitting Multilingual Sentence Embeddings with Abstract Meaning
Representation
- URL: http://arxiv.org/abs/2210.09773v1
- Date: Tue, 18 Oct 2022 11:37:36 GMT
- Title: Retrofitting Multilingual Sentence Embeddings with Abstract Meaning
Representation
- Authors: Deng Cai and Xin Li and Jackie Chun-Sing Ho and Lidong Bing and Wai
Lam
- Abstract summary: We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experimental results show that retrofitting multilingual sentence embeddings with AMR leads to new state-of-the-art performance on both semantic textual similarity and transfer tasks.
- Score: 70.58243648754507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new method to improve existing multilingual sentence
embeddings with Abstract Meaning Representation (AMR). Compared with the
original textual input, AMR is a structured semantic representation that
presents the core concepts and relations in a sentence explicitly and
unambiguously. It also helps reduce surface variations across different
expressions and languages. Unlike most prior work that only evaluates the
ability to measure semantic similarity, we present a thorough evaluation of
existing multilingual sentence embeddings and our improved versions, which
include a collection of five transfer tasks in different downstream
applications. Experimental results show that retrofitting multilingual sentence
embeddings with AMR leads to new state-of-the-art performance on both
semantic textual similarity and transfer tasks. Our codebase and evaluation
scripts can be found at https://github.com/jcyk/MSE-AMR.
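The abstract leaves the exact fusion mechanism to the paper, but the general recipe can be illustrated. Below is a minimal sketch, assuming the amrlib parser and a sentence-transformers encoder are installed; the model names, the reuse of the text encoder on the linearized AMR, and the concatenation fusion are all illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: combine a sentence embedding with an embedding
# of the sentence's AMR parse, so the final vector carries both surface and
# structural semantic information.
import numpy as np
import amrlib                                    # requires a downloaded parse model
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
stog = amrlib.load_stog_model()                  # sentence-to-graph AMR parser

def amr_retrofitted_embedding(sentence: str) -> np.ndarray:
    graph = stog.parse_sents([sentence])[0]      # linearized PENMAN AMR string
    text_vec = encoder.encode(sentence)
    amr_vec = encoder.encode(graph)              # encode the AMR string as text
    text_vec = text_vec / np.linalg.norm(text_vec)
    amr_vec = amr_vec / np.linalg.norm(amr_vec)
    return np.concatenate([text_vec, amr_vec])   # surface + structural views

# Paraphrases with different surface forms should map to nearby vectors,
# since their AMR graphs largely coincide.
a = amr_retrofitted_embedding("The cat chased the mouse.")
b = amr_retrofitted_embedding("The mouse was chased by the cat.")
print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```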
Related papers
- MINERS: Multilingual Language Models as Semantic Retrievers [23.686762008696547]
This paper introduces MINERS, a benchmark designed to evaluate the ability of multilingual language models in semantic retrieval tasks.
We create a comprehensive framework to assess the robustness of LMs in retrieving samples across over 200 diverse languages.
Our results demonstrate that retrieving semantically similar embeddings alone yields performance competitive with state-of-the-art approaches.
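As a concrete picture of what such retrieval looks like, here is a minimal sketch; the model choice and the toy multilingual pool are assumptions, not the MINERS benchmark itself.

```python
# Minimal embedding-based semantic retrieval: rank a multilingual pool by
# cosine similarity to a query, with no task-specific fine-tuning.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

pool = [
    "Le chat dort sur le canapé.",   # French: the cat sleeps on the sofa
    "Der Hund bellt laut.",          # German: the dog barks loudly
    "La gata duerme en el sofá.",    # Spanish: the cat sleeps on the sofa
]
query = "The cat is sleeping on the couch."

pool_vecs = model.encode(pool, normalize_embeddings=True)
query_vec = model.encode(query, normalize_embeddings=True)

scores = pool_vecs @ query_vec       # cosine similarity on unit vectors
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {pool[i]}")
```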
arXiv Detail & Related papers (2024-06-11T16:26:18Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-part working strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation [11.358350306918027]
We argue that Abstract Meaning Representation (AMR) can be used as an interlingua to reduce the amount of translationese in translated texts.
Parsing English translations into AMR and then generating text from that AMR yields output that more closely resembles text originally written in English.
This work makes strides towards reducing translationese in text and highlights the utility of AMR as an interlingua.
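The round trip described above can be sketched with off-the-shelf AMR models; the snippet below assumes the amrlib parse (stog) and generation (gtos) models have been downloaded, and is an illustration rather than the paper's exact pipeline.

```python
# Round-trip sketch: parse an English translation into AMR, then generate
# English back from the graph, which tends to normalize translationese.
import amrlib

stog = amrlib.load_stog_model()   # sentence -> AMR graph
gtos = amrlib.load_gtos_model()   # AMR graph -> sentence

translated = "By the committee a prize was to him given."  # translationese-like input
graphs = stog.parse_sents([translated])
sentences, _ = gtos.generate(graphs)
print(sentences[0])               # regenerated, more natural English
```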
arXiv Detail & Related papers (2023-04-23T00:04:14Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable AMR Meaning Features [22.8438857884398]
We create similarity metrics that are highly effective, while also providing an interpretable rationale for their rating.
Our approach works in two steps: We first select AMR graph metrics that measure meaning similarity of sentences with respect to key semantic facets.
Second, we employ these metrics to induce Semantically Structured Sentence BERT embeddings, which are composed of different meaning aspects captured in different sub-spaces.
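A toy sketch of the sub-space idea follows; the facet names and the equal, untrained slicing are illustrative assumptions, whereas in the paper the sub-spaces are trained against AMR graph metrics.

```python
# Slice one sentence embedding into named segments, one per semantic facet,
# and score each facet separately. Untrained equal slices are a stand-in
# for the learned sub-spaces described above.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
FACETS = ["concepts", "frames", "named_entities", "negation"]  # assumed facets

def facet_similarities(s1: str, s2: str) -> dict:
    v1, v2 = model.encode([s1, s2])
    parts1 = np.array_split(v1, len(FACETS))
    parts2 = np.array_split(v2, len(FACETS))
    return {
        name: float(p1 @ p2 / (np.linalg.norm(p1) * np.linalg.norm(p2)))
        for name, (p1, p2) in zip(FACETS, zip(parts1, parts2))
    }

# Each facet gets its own interpretable similarity score.
print(facet_similarities("The cat did not eat.", "The cat ate."))
```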
arXiv Detail & Related papers (2022-06-14T17:37:18Z)
- Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings [0.0]
We study a way of combining two of the most successful routes to the meaning of language, statistical language models and symbolic semantic formalisms, in the task of semantic parsing.
We explore the utility of incorporating pretrained context-aware word embeddings, such as BERT and RoBERTa, in the parsing problem.
arXiv Detail & Related papers (2022-06-13T15:05:24Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs between paired texts in natural language processing tasks such as text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the resulting neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
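The mask-and-predict step can be sketched as follows; the MLM choice and the KL divergence on a single shared word are assumptions here, while the paper's NDD aggregates over all words in the longest common sequence.

```python
# Mask a word shared by two texts and compare the MLM's predicted
# distributions at that position; a large divergence signals that the
# shared word sits in semantically different contexts.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
mlm.eval()

def masked_distribution(text: str, word: str) -> torch.Tensor:
    masked = text.replace(word, tok.mask_token, 1)
    inputs = tok(masked, return_tensors="pt")
    pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, pos]
    return torch.softmax(logits, dim=-1)

p = masked_distribution("He deposited money at the bank.", "bank")
q = masked_distribution("He fished from the river bank.", "bank")
print(torch.sum(p * (p.log() - q.log())).item())  # KL(p || q)
```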
arXiv Detail & Related papers (2021-10-04T03:59:15Z)