Measuring Fine-Grained Semantic Equivalence with Abstract Meaning
Representation
- URL: http://arxiv.org/abs/2210.03018v1
- Date: Thu, 6 Oct 2022 16:08:27 GMT
- Title: Measuring Fine-Grained Semantic Equivalence with Abstract Meaning
Representation
- Authors: Shira Wein, Zhuxin Wang, Nathan Schneider
- Abstract summary: Identifying semantically equivalent sentences is important for many NLP tasks.
Current approaches to semantic equivalence take a loose, sentence-level approach to "equivalence."
We introduce a novel, more sensitive method of characterizing semantic equivalence that leverages Abstract Meaning Representation graph structures.
- Score: 9.666975331506812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying semantically equivalent sentences is important for many
cross-lingual and mono-lingual NLP tasks. Current approaches to semantic
equivalence take a loose, sentence-level approach to "equivalence," despite
previous evidence that fine-grained differences and implicit content have an
effect on human understanding (Roth and Anthonio, 2021) and system performance
(Briakou and Carpuat, 2021). In this work, we introduce a novel, more sensitive
method of characterizing semantic equivalence that leverages Abstract Meaning
Representation graph structures. We develop an approach, which can be used with
either gold or automatic AMR annotations, and demonstrate that our solution is
in fact finer-grained than existing corpus filtering methods and more accurate
at predicting strictly equivalent sentences than existing semantic similarity
metrics. We suggest that our finer-grained measure of semantic equivalence
could limit the workload in the task of human post-edited machine translation
and in human evaluation of sentence similarity.
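The abstract does not spell out the comparison procedure, but the underlying intuition is that an AMR graph exposes a sentence's content as discrete concept-role-concept triples, so equivalence can be checked triple by triple rather than with one sentence-level score. The sketch below is only an illustration of that idea, not the authors' method: it assumes the penman library, hand-written gold AMRs, and a naive concept-based canonicalization in place of the Smatch-style variable alignment a real system would need.

```python
# Illustrative only: compare two gold AMR graphs by their relation triples to
# surface fine-grained meaning differences. Assumes the `penman` library; a
# realistic comparison would align variables properly (as the Smatch metric
# does) instead of collapsing each variable to its concept as done here.
import penman

AMR_A = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
"""

# Same core content plus one extra piece of information (":destination").
AMR_B = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b
            :destination (s / store)))
"""

def concept_triples(amr_string):
    """Return relation triples with variables replaced by their concepts."""
    graph = penman.decode(amr_string)
    concept = {src: tgt for src, role, tgt in graph.triples if role == ":instance"}
    return {
        (concept.get(src, src), role, concept.get(tgt, tgt))
        for src, role, tgt in graph.triples
        if role != ":instance"
    }

triples_a, triples_b = concept_triples(AMR_A), concept_triples(AMR_B)
if triples_a == triples_b:
    print("strictly equivalent: identical triple sets")
elif triples_a < triples_b or triples_b < triples_a:
    print("one sentence carries extra content:",
          (triples_b - triples_a) or (triples_a - triples_b))
else:
    print("divergent content:", triples_a ^ triples_b)
```

With an automatic AMR parser in front of this comparison, the same triple-set logic could flag sentence pairs whose graphs differ only by added or dropped content, which is the kind of fine-grained distinction a single similarity score hides.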
Related papers
- DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning [59.4644086610381]
We propose a novel denoising objective that takes a complementary, intra-sentence perspective.
By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form.
Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks.
arXiv Detail & Related papers (2024-01-24T17:48:45Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning correlates robustly with gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- Semantic-aware Contrastive Learning for More Accurate Semantic Parsing [32.74456368167872]
We propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations.
Experiments on two standard datasets show that our approach achieves significant improvements over MLE baselines.
arXiv Detail & Related papers (2023-01-19T07:04:32Z)
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to new state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
- SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable AMR Meaning Features [22.8438857884398]
We create similarity metrics that are highly effective, while also providing an interpretable rationale for their rating.
Our approach works in two steps: We first select AMR graph metrics that measure meaning similarity of sentences with respect to key semantic facets.
Second, we employ these metrics to induce Semantically Structured Sentence BERT embeddings, which are composed of different meaning aspects captured in different sub-spaces.
arXiv Detail & Related papers (2022-06-14T17:37:18Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlap frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the proposed metric, NDD, to be more sensitive to various semantic differences, especially on highly overlapped paired texts (a toy sketch of this mask-and-predict scoring appears after this list).
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- Rethinking Crowd Sourcing for Semantic Similarity [0.13999481573773073]
This paper investigates the ambiguities inherent in crowd-sourced semantic labeling.
It shows that annotators who treat semantic similarity as a binary category play the most important role in the labeling.
arXiv Detail & Related papers (2021-09-24T13:57:30Z)
- On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited.
We find that BERT always induces a non-smooth, anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks.
We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
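For the mask-and-predict strategy summarized under "Contextualized Semantic Distance between Highly Overlapped Texts" above, the following toy sketch shows the general mechanism with Hugging Face transformers. It is an assumption-laden simplification, not the paper's NDD metric: shared word types stand in for the longest common sequence, the choice of bert-base-uncased is arbitrary, and a symmetric KL divergence serves as the position-wise comparison.

```python
# Toy illustration (not the paper's exact metric): mask each shared word in two
# highly overlapped sentences, compare the masked-LM distributions predicted at
# the corresponding positions, and average the divergences.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-uncased"  # arbitrary choice of masked language model
tok = AutoTokenizer.from_pretrained(MODEL)
mlm = AutoModelForMaskedLM.from_pretrained(MODEL)
mlm.eval()

def masked_distribution(words, idx):
    """Vocabulary distribution the MLM predicts when words[idx] is masked."""
    masked = list(words)
    masked[idx] = tok.mask_token
    enc = tok(" ".join(masked), return_tensors="pt")
    pos = (enc.input_ids[0] == tok.mask_token_id).nonzero(as_tuple=True)[0][0]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, pos]
    return torch.softmax(logits, dim=-1)

def divergence(p, q, eps=1e-12):
    """Symmetric KL divergence between two probability vectors."""
    kl = lambda x, y: torch.sum(x * (torch.log(x + eps) - torch.log(y + eps)))
    return 0.5 * (kl(p, q) + kl(q, p)).item()

def semantic_distance(sent_a, sent_b):
    a, b = sent_a.split(), sent_b.split()
    shared = [w for w in a if w in b]  # crude stand-in for the longest common sequence
    scores = [
        divergence(masked_distribution(a, a.index(w)),
                   masked_distribution(b, b.index(w)))
        for w in shared
    ]
    return sum(scores) / len(scores) if scores else 0.0

print(semantic_distance("the cat sat on the mat", "the cat slept on the mat"))
```

In this toy pair, the differing context ("sat" vs. "slept") shifts the distributions predicted at the shared positions, so a pair with a genuine meaning change should score higher than a trivially reworded one.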