Towards a Decomposable Metric for Explainable Evaluation of Text
Generation from AMR
- URL: http://arxiv.org/abs/2008.08896v3
- Date: Tue, 26 Jan 2021 08:55:07 GMT
- Title: Towards a Decomposable Metric for Explainable Evaluation of Text
Generation from AMR
- Authors: Juri Opitz and Anette Frank
- Score: 22.8438857884398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Systems that generate natural language text from abstract meaning
representations such as AMR are typically evaluated using automatic surface
matching metrics that compare the generated texts to reference texts from which
the input meaning representations were constructed. We show that besides
well-known issues from which such metrics suffer, an additional problem arises
when applying these metrics for AMR-to-text evaluation, since an abstract
meaning representation allows for numerous surface realizations. In this work
we aim to alleviate these issues by proposing $\mathcal{M}\mathcal{F}_\beta$, a
decomposable metric that builds on two pillars. The first is the principle of
meaning preservation $\mathcal{M}$: it measures to what extent a given AMR can
be reconstructed from the generated sentence using SOTA AMR parsers and
applying (fine-grained) AMR evaluation metrics to measure the distance between
the original and the reconstructed AMR. The second pillar builds on a principle
of (grammatical) form $\mathcal{F}$ that measures the linguistic quality of the
generated text, which we implement using SOTA language models. In two extensive
pilot studies we show that fulfillment of both principles offers benefits for
AMR-to-text evaluation, including explainability of scores. Since
$\mathcal{M}\mathcal{F}_\beta$ does not necessarily rely on gold AMRs, it may
extend to other text generation tasks.
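The two-pillar combination described in the abstract can be sketched F_β-style. This is an illustrative sketch, not the authors' reference implementation: the function name, the assumption that both scores lie in [0, 1] (e.g. M from Smatch between the input AMR and the re-parsed AMR, F from a normalized language-model score), and the exact weighting convention are assumptions made here for clarity.

```python
def mf_beta(m: float, f: float, beta: float = 1.0) -> float:
    """Combine meaning preservation M and form F into one score.

    m    -- meaning-preservation score in [0, 1], e.g. Smatch between
            the input AMR and the AMR re-parsed from the generated text.
    f    -- form (linguistic quality) score in [0, 1], e.g. derived
            from a language model.
    beta -- following the usual F_beta convention, beta > 1 weights
            the form score more heavily, beta < 1 weights meaning more.
    """
    if m == 0.0 and f == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * m * f / (b2 * m + f)
```

With beta = 1 this reduces to the harmonic mean of M and F, so a text that preserves meaning perfectly but is ungrammatical (or vice versa) is penalized more than an arithmetic mean would suggest; inspecting M and F separately is what makes the score decomposable and explainable.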
Related papers
- AMR Parsing is Far from Solved: GrAPES, the Granular AMR Parsing
Evaluation Suite [18.674172788583967]
We present the Granular AMR Parsing Evaluation Suite (GrAPES).
GrAPES reveals in depth the abilities and shortcomings of current AMR parsers.
arXiv Detail & Related papers (2023-12-06T13:19:56Z)
- Leveraging Denoised Abstract Meaning Representation for Grammatical
Error Correction [53.55440811942249]
Grammatical Error Correction (GEC) is the task of correcting erroneous sentences into grammatically correct, semantically consistent, and coherent ones.
We propose the AMR-GEC, a seq-to-seq model that incorporates denoised AMR as additional knowledge.
arXiv Detail & Related papers (2023-07-05T09:06:56Z)
- An AMR-based Link Prediction Approach for Document-level Event Argument
Extraction [51.77733454436013]
Recent works have introduced Abstract Meaning Representation (AMR) for Document-level Event Argument Extraction (Doc-level EAE).
This work reformulates EAE as a link prediction problem on AMR graphs.
We propose a novel graph structure, Tailored AMR Graph (TAG), which compresses less informative subgraphs and edge types, integrates span information, and highlights surrounding events in the same document.
arXiv Detail & Related papers (2023-05-30T16:07:48Z)
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning
Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
- A Survey: Neural Networks for AMR-to-Text [2.3924114046608627]
AMR-to-Text is one of the key techniques in the NLP community that aims at generating sentences from the Abstract Meaning Representation (AMR) graphs.
Since AMR was proposed in 2013, the study of AMR-to-Text has become increasingly prevalent as an essential branch of structured-data-to-text generation.
arXiv Detail & Related papers (2022-06-15T07:20:28Z)
- A Dynamic, Interpreted CheckList for Meaning-oriented NLG Metric
Evaluation -- through the Lens of Semantic Similarity Rating [19.33681537640272]
We develop a CheckList for NLG metrics that is organized around meaning-relevant linguistic phenomena.
Each test instance consists of a pair of sentences with their AMR graphs and a human-produced textual semantic similarity or relatedness score.
We demonstrate the usefulness of CheckList by designing a new metric GraCo that computes lexical cohesion graphs over AMR concepts.
arXiv Detail & Related papers (2022-05-24T16:19:32Z)
- Inducing and Using Alignments for Transition-based AMR Parsing [51.35194383275297]
We propose a neural aligner for AMR that learns node-to-word alignments without relying on complex pipelines.
We attain a new state of the art for gold-only trained models, matching silver-trained performance without the need for beam search on AMR3.0.
arXiv Detail & Related papers (2022-05-03T12:58:36Z)
- Smelting Gold and Silver for Improved Multilingual AMR-to-Text
Generation [55.117031558677674]
We study different techniques for automatically generating AMR annotations.
Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR.
Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.
arXiv Detail & Related papers (2021-09-08T17:55:46Z)
- Probabilistic, Structure-Aware Algorithms for Improved Variety,
Accuracy, and Coverage of AMR Alignments [9.74672460306765]
We present algorithms for aligning components of Abstract Meaning Representation (AMR) graphs to spans in English sentences.
We leverage unsupervised learning in combination with graph structure, taking the best of both worlds from previous AMR alignment approaches.
Our approach covers a wider variety of AMR substructures than previously considered, achieves higher coverage of nodes and edges, and does so with higher accuracy.
arXiv Detail & Related papers (2021-06-10T18:46:32Z)
- Making Better Use of Bilingual Information for Cross-Lingual AMR Parsing [88.08581016329398]
We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.
We introduce bilingual input, namely the translated texts as well as non-English texts, in order to enable the model to predict more accurate concepts.
arXiv Detail & Related papers (2021-06-09T05:14:54Z)
- Translate, then Parse! A strong baseline for Cross-Lingual AMR Parsing [10.495114898741205]
We develop models that project sentences from various languages onto their AMRs to capture their essential semantic structures.
In this paper, we revisit a simple two-step baseline and enhance it with a strong NMT system and a strong AMR parser.
Our experiments show that T+P outperforms a recent state-of-the-art system across all tested languages.
arXiv Detail & Related papers (2021-06-08T17:52:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.