Translate, then Parse! A strong baseline for Cross-Lingual AMR Parsing
- URL: http://arxiv.org/abs/2106.04565v1
- Date: Tue, 8 Jun 2021 17:52:48 GMT
- Title: Translate, then Parse! A strong baseline for Cross-Lingual AMR Parsing
- Authors: Sarah Uhrig, Yoalli Rezepka Garcia, Juri Opitz, Anette Frank
- Abstract summary: We develop models that project sentences from various languages onto their AMRs to capture their essential semantic structures.
In this paper, we revisit a simple two-step baseline and enhance it with a strong NMT system and a strong AMR parser.
Our experiments show that T+P outperforms a recent state-of-the-art system across all tested languages.
- Score: 10.495114898741205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In cross-lingual Abstract Meaning Representation (AMR) parsing, researchers
develop models that project sentences from various languages onto their AMRs to
capture their essential semantic structures: given a sentence in any language,
we aim to capture its core semantic content through concepts connected by
manifold types of semantic relations. Methods typically leverage large silver
training data to learn a single model that is able to project non-English
sentences to AMRs. However, we find that a simple baseline tends to be
overlooked: translating the sentences to English and projecting their AMR with
a monolingual AMR parser (translate+parse, T+P). In this paper, we revisit this
simple two-step baseline and enhance it with a strong NMT system and a strong
AMR parser. Our experiments show that T+P outperforms a recent state-of-the-art
system across all tested languages: German, Italian, Spanish and Mandarin with
+14.6, +12.6, +14.3 and +16.0 Smatch points.
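To make the two-step recipe concrete, the sketch below walks through T+P for a single German sentence: translate to English first, then run a monolingual English AMR parser on the translation. The library and model choices (Hugging Face transformers with the Helsinki-NLP/opus-mt-de-en checkpoint, and amrlib with its default pretrained parser) are illustrative assumptions, not the specific "strong NMT system" and "strong AMR parser" used in the paper.

```python
# Minimal sketch of the translate+parse (T+P) baseline described above.
# Assumed components (not the paper's exact systems): Hugging Face
# `transformers` for German->English NMT and `amrlib` for English AMR parsing.
from transformers import MarianMTModel, MarianTokenizer
import amrlib


def translate_to_english(sentences, model_name="Helsinki-NLP/opus-mt-de-en"):
    """Step 1: translate non-English sentences into English with an off-the-shelf NMT model."""
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    outputs = model.generate(**batch)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


def parse_to_amr(english_sentences):
    """Step 2: parse the English translations with a monolingual English AMR parser."""
    stog = amrlib.load_stog_model()  # loads amrlib's default pretrained sentence-to-graph model
    return stog.parse_sents(english_sentences)


if __name__ == "__main__":
    source = ["Der Junge will ins Kino gehen."]  # "The boy wants to go to the cinema."
    english = translate_to_english(source)       # T: source language -> English
    graphs = parse_to_amr(english)               # P: English -> AMR (Penman strings)
    for g in graphs:
        print(g)
```

Because both steps use off-the-shelf components, the pipeline requires no cross-lingual AMR training data; its quality is bounded by the NMT system and the English parser.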
Related papers
- Should Cross-Lingual AMR Parsing go Meta? An Empirical Assessment of Meta-Learning and Joint Learning AMR Parsing [8.04933271357397]
Cross-lingual AMR parsing is the task of predicting AMR graphs in a target language when training data is available only in a source language.
Taking inspiration from Langedijk et al. (2022), we investigate the use of meta-learning for cross-lingual AMR parsing.
We evaluate our models in $k$-shot scenarios and assess their effectiveness in Croatian, Farsi, Korean, Chinese, and French.
arXiv Detail & Related papers (2024-10-04T12:24:02Z)
- MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection [3.6811136816751513]
We introduce MASSIVE-AMR, a dataset with more than 84,000 text-to-graph annotations: AMR graphs for 1,685 information-seeking utterances mapped to 50+ typologically diverse languages.
Results shed light on persistent issues using LLMs for structured parsing.
arXiv Detail & Related papers (2024-05-29T17:17:22Z)
- "You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation [60.863629647985526]
We examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure.
We find that models can reliably reproduce the basic format of AMR, and can often capture core event, argument, and modifier structure.
Overall, our findings indicate that these models out-of-the-box can capture aspects of semantic structure, but there remain key limitations in their ability to support fully accurate semantic analyses or parses.
arXiv Detail & Related papers (2023-10-26T21:47:59Z)
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experiment results show that retrofitting multilingual sentence embeddings with AMR leads to better state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
- Multilingual AMR Parsing with Noisy Knowledge Distillation [68.01173640691094]
We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher.
We identify that noisy input and precise output are the key to successful distillation.
arXiv Detail & Related papers (2021-09-30T15:13:48Z)
- Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation [55.117031558677674]
We study different techniques for automatically generating AMR annotations.
Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR.
Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.
arXiv Detail & Related papers (2021-09-08T17:55:46Z)
- Making Better Use of Bilingual Information for Cross-Lingual AMR Parsing [88.08581016329398]
We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.
We introduce bilingual input, namely the translated texts as well as non-English texts, in order to enable the model to predict more accurate concepts.
arXiv Detail & Related papers (2021-06-09T05:14:54Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across a diverse setting, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.