Retrofitting Multilingual Sentence Embeddings with Abstract Meaning
Representation
- URL: http://arxiv.org/abs/2210.09773v1
- Date: Tue, 18 Oct 2022 11:37:36 GMT
- Title: Retrofitting Multilingual Sentence Embeddings with Abstract Meaning
Representation
- Authors: Deng Cai and Xin Li and Jackie Chun-Sing Ho and Lidong Bing and Wai
Lam
- Abstract summary: We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experimental results show that retrofitting multilingual sentence embeddings with AMR leads to new state-of-the-art performance on both semantic textual similarity and transfer tasks.
- Score: 70.58243648754507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new method to improve existing multilingual sentence
embeddings with Abstract Meaning Representation (AMR). Compared with the
original textual input, AMR is a structured semantic representation that
presents the core concepts and relations in a sentence explicitly and
unambiguously. It also helps reduce surface variations across different
expressions and languages. Unlike most prior work that only evaluates the
ability to measure semantic similarity, we present a thorough evaluation of
existing multilingual sentence embeddings and our improved versions, which
include a collection of five transfer tasks in different downstream
applications. Experimental results show that retrofitting multilingual sentence
embeddings with AMR leads to new state-of-the-art performance on both
semantic textual similarity and transfer tasks. Our codebase and evaluation
scripts can be found at https://github.com/jcyk/MSE-AMR.
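The abstract leaves the exact fusion mechanism to the paper, but the general recipe can be illustrated. Below is a minimal sketch, assuming the amrlib parser and a sentence-transformers encoder are installed; the model names, the reuse of the text encoder on the linearized AMR, and the concatenation fusion are all illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: combine a sentence embedding with an embedding
# of the sentence's AMR parse, so the final vector carries both surface and
# structural semantic information.
import numpy as np
import amrlib                                    # requires a downloaded parse model
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
stog = amrlib.load_stog_model()                  # sentence-to-graph AMR parser

def amr_retrofitted_embedding(sentence: str) -> np.ndarray:
    graph = stog.parse_sents([sentence])[0]      # linearized PENMAN AMR string
    text_vec = encoder.encode(sentence)
    amr_vec = encoder.encode(graph)              # encode the AMR string as text
    text_vec = text_vec / np.linalg.norm(text_vec)
    amr_vec = amr_vec / np.linalg.norm(amr_vec)
    return np.concatenate([text_vec, amr_vec])   # surface + structural views

# Paraphrases with different surface forms should map to nearby vectors,
# since their AMR graphs largely coincide.
a = amr_retrofitted_embedding("The cat chased the mouse.")
b = amr_retrofitted_embedding("The mouse was chased by the cat.")
print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```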
Related papers
- MINERS: Multilingual Language Models as Semantic Retrievers [23.686762008696547]
This paper introduces MINERS, a benchmark designed to evaluate the ability of multilingual language models in semantic retrieval tasks.
We create a comprehensive framework to assess the robustness of LMs in retrieving samples across over 200 diverse languages.
Our results demonstrate that retrieving semantically similar embeddings alone yields performance competitive with state-of-the-art approaches.
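As a concrete picture of what such retrieval looks like, here is a minimal sketch; the model choice and the toy multilingual pool are assumptions, not the MINERS benchmark itself.

```python
# Minimal embedding-based semantic retrieval: rank a multilingual pool by
# cosine similarity to a query, with no task-specific fine-tuning.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

pool = [
    "Le chat dort sur le canapé.",   # French: the cat sleeps on the sofa
    "Der Hund bellt laut.",          # German: the dog barks loudly
    "La gata duerme en el sofá.",    # Spanish: the cat sleeps on the sofa
]
query = "The cat is sleeping on the couch."

pool_vecs = model.encode(pool, normalize_embeddings=True)
query_vec = model.encode(query, normalize_embeddings=True)

scores = pool_vecs @ query_vec       # cosine similarity on unit vectors
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {pool[i]}")
```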
arXiv Detail & Related papers (2024-06-11T16:26:18Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-part working strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation [11.358350306918027]
We argue that Abstract Meaning Representation (AMR) can be used as an interlingua to reduce the amount of translationese in translated texts.
Parsing English translations into AMR and then generating text from that AMR yields output that more closely resembles text originally written in English.
This work makes strides towards reducing translationese in text and highlights the utility of AMR as an interlingua.
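The round trip described above can be sketched with off-the-shelf AMR models; the snippet below assumes the amrlib parse (stog) and generation (gtos) models have been downloaded, and is an illustration rather than the paper's exact pipeline.

```python
# Round-trip sketch: parse an English translation into AMR, then generate
# English back from the graph, which tends to normalize translationese.
import amrlib

stog = amrlib.load_stog_model()   # sentence -> AMR graph
gtos = amrlib.load_gtos_model()   # AMR graph -> sentence

translated = "By the committee a prize was to him given."  # translationese-like input
graphs = stog.parse_sents([translated])
sentences, _ = gtos.generate(graphs)
print(sentences[0])               # regenerated, more natural English
```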
arXiv Detail & Related papers (2023-04-23T00:04:14Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable AMR Meaning Features [22.8438857884398]
We create similarity metrics that are highly effective, while also providing an interpretable rationale for their rating.
Our approach works in two steps: We first select AMR graph metrics that measure meaning similarity of sentences with respect to key semantic facets.
Second, we employ these metrics to induce Semantically Structured Sentence BERT embeddings, which are composed of different meaning aspects captured in different sub-spaces.
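A toy sketch of the sub-space idea follows; the facet names and the equal, untrained slicing are illustrative assumptions, whereas in the paper the sub-spaces are trained against AMR graph metrics.

```python
# Slice one sentence embedding into named segments, one per semantic facet,
# and score each facet separately. Untrained equal slices are a stand-in
# for the learned sub-spaces described above.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
FACETS = ["concepts", "frames", "named_entities", "negation"]  # assumed facets

def facet_similarities(s1: str, s2: str) -> dict:
    v1, v2 = model.encode([s1, s2])
    parts1 = np.array_split(v1, len(FACETS))
    parts2 = np.array_split(v2, len(FACETS))
    return {
        name: float(p1 @ p2 / (np.linalg.norm(p1) * np.linalg.norm(p2)))
        for name, (p1, p2) in zip(FACETS, zip(parts1, parts2))
    }

# Each facet gets its own interpretable similarity score.
print(facet_similarities("The cat did not eat.", "The cat ate."))
```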
arXiv Detail & Related papers (2022-06-14T17:37:18Z)
- Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings [0.0]
We study a way of combining two of the most successful routes to the meaning of language, statistical language models and symbolic semantic formalisms, in the task of semantic parsing.
We explore the utility of incorporating pretrained context-aware word embeddings, such as BERT and RoBERTa, in the parsing problem.
arXiv Detail & Related papers (2022-06-13T15:05:24Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs between paired texts in natural language processing tasks such as text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the resulting neighboring distribution divergence (NDD) to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
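The mask-and-predict step can be sketched as follows; the MLM choice and the KL divergence on a single shared word are assumptions here, while the paper's NDD aggregates over all words in the longest common sequence.

```python
# Mask a word shared by two texts and compare the MLM's predicted
# distributions at that position; a large divergence signals that the
# shared word sits in semantically different contexts.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
mlm.eval()

def masked_distribution(text: str, word: str) -> torch.Tensor:
    masked = text.replace(word, tok.mask_token, 1)
    inputs = tok(masked, return_tensors="pt")
    pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, pos]
    return torch.softmax(logits, dim=-1)

p = masked_distribution("He deposited money at the bank.", "bank")
q = masked_distribution("He fished from the river bank.", "bank")
print(torch.sum(p * (p.log() - q.log())).item())  # KL(p || q)
```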
arXiv Detail & Related papers (2021-10-04T03:59:15Z)