Multilingual AMR Parsing with Noisy Knowledge Distillation
- URL: http://arxiv.org/abs/2109.15196v1
- Date: Thu, 30 Sep 2021 15:13:48 GMT
- Title: Multilingual AMR Parsing with Noisy Knowledge Distillation
- Authors: Deng Cai and Xin Li and Jackie Chun-Sing Ho and Lidong Bing and Wai Lam
- Abstract summary: We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher.
We identify that noisy input and precise output are the key to successful distillation.
- Score: 68.01173640691094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study multilingual AMR parsing from the perspective of knowledge
distillation, where the aim is to learn and improve a multilingual AMR parser
by using an existing English parser as its teacher. We constrain our
exploration in a strict multilingual setting: there is but one model to parse
all different languages including English. We identify that noisy input and
precise output are the key to successful distillation. Together with extensive
pre-training, we obtain an AMR parser whose performances surpass all previously
published results on four different foreign languages, including German,
Spanish, Italian, and Chinese, by large margins (up to 18.8 Smatch
points on Chinese and on average 11.3 Smatch points). Our parser also
achieves comparable performance on English to the latest state-of-the-art
English-only parser.
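As a rough sketch of the "noisy input, precise output" recipe, the following illustrates one distillation step; `teacher`, `student`, `translate`, and their methods are hypothetical stand-ins, not the authors' implementation.

```python
# A sketch of one "noisy input, precise output" distillation step;
# all components here are hypothetical placeholders.
import torch

def distillation_step(teacher, student, optimizer, translate, english_sentence, tgt_lang):
    # Noisy input: machine-translate the English sentence into the target
    # language; MT errors supply the noise the student learns to tolerate.
    noisy_src = translate(english_sentence, src="en", tgt=tgt_lang)

    # Precise output: a high-quality AMR obtained on the clean English side,
    # used as a hard target for the student.
    with torch.no_grad():
        target_amr = teacher.parse(english_sentence)

    # Train the single multilingual student to map noisy foreign text to the
    # precise AMR.
    loss = student.loss(src=noisy_src, tgt=target_amr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```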
Related papers
- Should Cross-Lingual AMR Parsing go Meta? An Empirical Assessment of Meta-Learning and Joint Learning AMR Parsing [8.04933271357397]
Cross-lingual AMR parsing is the task of predicting AMR graphs in a target language when training data is available only in a source language.
Taking inspiration from Langedijk et al. (2022), we investigate the use of meta-learning for cross-lingual AMR parsing.
We evaluate our models in $k$-shot scenarios and assess their effectiveness in Croatian, Farsi, Korean, Chinese, and French.
arXiv Detail & Related papers (2024-10-04T12:24:02Z) - Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations [59.056367787688146]
This paper pioneers exploring and training powerful Multilingual Math Reasoning (xMR) LLMs.
By utilizing translation, we construct the first multilingual math reasoning instruction dataset, MGSM8KInstruct, encompassing ten distinct languages.
arXiv Detail & Related papers (2023-10-31T08:09:20Z) - Revisiting non-English Text Simplification: A Unified Multilingual Benchmark [14.891068432456262]
This paper introduces the MultiSim benchmark, a collection of 27 resources in 12 distinct languages containing over 1.7 million complex-simple sentence pairs.
Our experiments using MultiSim with pre-trained multilingual language models reveal exciting performance improvements from multilingual training in non-English settings.
arXiv Detail & Related papers (2023-05-25T03:03:29Z) - Meta-Learning a Cross-lingual Manifold for Semantic Parsing [75.26271012018861]
Localizing a semantic parser to support new languages requires effective cross-lingual generalization.
We introduce a first-order meta-learning algorithm to train a semantic parser with maximal sample efficiency during cross-lingual transfer.
Results across six languages on ATIS demonstrate that our combination of steps yields accurate semantic parsers sampling ≤10% of source training data in each new language.
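The abstract only names the algorithm family; below is a generic first-order (Reptile-style) meta-update sketch for intuition, with `model.loss` and the task batch iterator as hypothetical placeholders rather than the paper's exact procedure.

```python
# A generic first-order (Reptile-style) meta-update, not the paper's exact
# algorithm; `model.loss` and `task_batches` are hypothetical placeholders.
import copy
import torch

def first_order_meta_step(model, task_batches, inner_lr=1e-3, meta_lr=1e-2, inner_steps=3):
    init = copy.deepcopy(model.state_dict())  # weights before task adaptation
    params = [p for p in model.parameters() if p.requires_grad]

    # Inner loop: adapt to one language/task with a few gradient steps.
    for _ in range(inner_steps):
        loss = model.loss(next(task_batches))
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                p -= inner_lr * g

    # First-order meta-update: move the initial weights toward the adapted
    # ones (no second-order gradients required).
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.copy_(init[name] + meta_lr * (p - init[name]))
```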
arXiv Detail & Related papers (2022-09-26T10:42:17Z) - Making Better Use of Bilingual Information for Cross-Lingual AMR Parsing [88.08581016329398]
We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.
We introduce bilingual input, namely the translated texts as well as non-English texts, in order to enable the model to predict more accurate concepts.
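A minimal sketch of such a bilingual input, assuming a hypothetical separator token and that the translation is produced upstream:

```python
# A sketch of pairing the original sentence with its English translation
# as one bilingual input. The "</s>" separator is an assumption.
def bilingual_input(foreign_sentence: str, english_translation: str) -> str:
    # The English side anchors AMR concepts; the foreign side preserves
    # the original wording.
    return f"{english_translation} </s> {foreign_sentence}"

# e.g. bilingual_input("Der Junge will gehen.", "The boy wants to go.")
# -> "The boy wants to go. </s> Der Junge will gehen."
```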
arXiv Detail & Related papers (2021-06-09T05:14:54Z) - Translate, then Parse! A strong baseline for Cross-Lingual AMR Parsing [10.495114898741205]
We develop models that project sentences from various languages onto their AMRs to capture their essential semantic structures.
In this paper, we revisit a simple two-step baseline and enhance it with a strong NMT system and a strong AMR parser.
Our experiments show that T+P outperforms a recent state-of-the-art system across all tested languages.
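The two-step T+P baseline is simple enough to sketch; `nmt` and `amr_parser` below are hypothetical placeholders for the strong NMT system and the strong English AMR parser:

```python
# A sketch of the two-step translate-then-parse (T+P) baseline;
# `nmt` and `amr_parser` are hypothetical placeholders.
def translate_then_parse(sentence: str, src_lang: str, nmt, amr_parser):
    english = nmt.translate(sentence, src=src_lang, tgt="en")  # step 1: translate
    return amr_parser.parse(english)                           # step 2: parse
```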
arXiv Detail & Related papers (2021-06-08T17:52:48Z) - Bootstrapping Multilingual AMR with Contextual Word Alignments [15.588190959488538]
We develop a novel technique for foreign-text-to-English AMR alignment, using the contextual word alignment between English and foreign language tokens.
This word alignment is weakly supervised and relies on the contextualized XLM-R word embeddings.
We achieve a highly competitive performance that surpasses the best published results for German, Italian, Spanish and Chinese.
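For intuition, here is a minimal sketch of deriving word alignments from contextualized XLM-R embeddings via cosine similarity; the greedy argmax matching is an illustrative assumption, not the paper's exact alignment procedure.

```python
# A sketch of weakly supervised word alignment from contextualized XLM-R
# embeddings: embed both sentences, then greedily match each English
# subword to its most similar foreign subword by cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

def embed(sentence: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[1:-1]  # drop the <s> and </s> special tokens

def align(english: str, foreign: str) -> list[int]:
    e, f = embed(english), embed(foreign)
    sim = torch.nn.functional.cosine_similarity(
        e.unsqueeze(1), f.unsqueeze(0), dim=-1
    )  # (len_en, len_fr) similarity matrix
    return sim.argmax(dim=1).tolist()  # best foreign match per English subword
```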
arXiv Detail & Related papers (2021-02-03T18:35:55Z) - Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
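A minimal sketch of such multi-teacher distillation, assuming a standard temperature-scaled KL objective (the exact loss used in the paper may differ):

```python
# A sketch of amalgamating several language-branch teachers into one
# student with a temperature-scaled KL objective (standard distillation;
# the paper's exact loss may differ).
import torch
import torch.nn.functional as F

def multi_teacher_distill_loss(student_logits, teacher_logits_list, T=2.0):
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # Average the KL divergence against each teacher's softened distribution.
    losses = [
        F.kl_div(log_p_student, F.softmax(t / T, dim=-1), reduction="batchmean")
        for t in teacher_logits_list
    ]
    return (T * T) * torch.stack(losses).mean()
```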
arXiv Detail & Related papers (2020-10-27T13:12:17Z) - Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation [61.88012735215636]
Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs.
However, UNMT can only translate between a single language pair and cannot produce translation results for multiple language pairs at the same time.
In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder.
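One common way to realize a single encoder and decoder for many language pairs is a target-language tag on the source side; whether the paper uses exactly this scheme is an assumption:

```python
# A sketch of steering one shared encoder-decoder across many language
# pairs with a target-language tag; the "<2xx>" tag format is an assumption.
def tag_source(sentence: str, tgt_lang: str) -> str:
    return f"<2{tgt_lang}> {sentence}"

# e.g. tag_source("Guten Morgen", "fr") -> "<2fr> Guten Morgen"
# The same model then decodes French regardless of the source language.
```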
arXiv Detail & Related papers (2020-04-21T17:26:16Z)