Improving AMR Parsing with Sequence-to-Sequence Pre-training
- URL: http://arxiv.org/abs/2010.01771v1
- Date: Mon, 5 Oct 2020 04:32:47 GMT
- Title: Improving AMR Parsing with Sequence-to-Sequence Pre-training
- Authors: Dongqin Xu, Junhui Li, Muhua Zhu, Min Zhang, Guodong Zhou
- Abstract summary: In this paper, we focus on sequence-to-sequence (seq2seq) AMR parsing.
We propose a seq2seq pre-training approach to build pre-trained models in both single and joint settings.
Experiments show that both the single and joint pre-trained models significantly improve the performance.
- Score: 39.33133978535497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the literature, research on abstract meaning representation (AMR)
parsing is much restricted by the size of the human-curated datasets that are
critical to building an AMR parser with good performance. To alleviate this
data size restriction, pre-trained models have been drawing more and more
attention in AMR parsing. However, previous pre-trained models, like BERT, are
implemented for general purposes and may not work as expected for the specific
task of AMR parsing. In this paper, we focus on sequence-to-sequence (seq2seq)
AMR parsing and propose a seq2seq pre-training approach to build pre-trained
models in both single and joint settings on three relevant tasks, i.e., machine
translation, syntactic parsing, and AMR parsing itself. Moreover, we extend the
vanilla fine-tuning method to a multi-task learning fine-tuning method that
optimizes for the performance of AMR parsing while endeavoring to preserve the
responses of the pre-trained models. Extensive experimental results on two
English benchmark datasets show that both the single and joint pre-trained
models significantly improve the performance (e.g., Smatch from 71.5 to 80.2 on
AMR 2.0), which reaches the state of the art. The result is very encouraging
since we achieve this with seq2seq models rather than more complex models. We
make our code and model available at https://github.com/xdqkid/S2S-AMR-Parser.
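Below is a minimal sketch of the multi-task fine-tuning idea described in the abstract, assuming that the term which "preserves the responses of the pre-trained models" is realized as a token-level KL divergence against a frozen copy of the pre-trained seq2seq model. The model interface, the padding id, and the `preserve_weight` coefficient are illustrative assumptions, not details taken from the paper or its released code.

```python
# Sketch of a fine-tuning step that combines the AMR parsing loss with a
# "response preservation" term, as described in the abstract above.
# Assumptions (not taken from the paper's code): the model is a PyTorch
# seq2seq module called as model(src_ids, tgt_ids) -> logits, the padding id
# is 0, and the preservation term is a KL divergence to a frozen reference.
import copy

import torch
import torch.nn.functional as F


def make_frozen_reference(model):
    """Snapshot the pre-trained model before fine-tuning starts."""
    frozen = copy.deepcopy(model).eval()
    for p in frozen.parameters():
        p.requires_grad_(False)
    return frozen


def multitask_finetune_step(model, frozen_model, batch, optimizer,
                            preserve_weight=0.5):
    """One step: cross-entropy on gold AMR tokens + KL to the frozen model."""
    logits = model(batch["src_ids"], batch["tgt_ids"])  # (B, T, V)

    # Main objective: standard seq2seq loss on linearized AMR graphs.
    parse_loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        batch["gold_ids"].view(-1),
        ignore_index=0,
    )

    # Preservation objective: stay close to the pre-trained model's output
    # distribution on the same batch.
    with torch.no_grad():
        ref_logits = frozen_model(batch["src_ids"], batch["tgt_ids"])
    preserve_loss = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.softmax(ref_logits, dim=-1),
        reduction="batchmean",
    )

    loss = parse_loss + preserve_weight * preserve_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, `make_frozen_reference` would be called once before fine-tuning begins, so that the preservation term always compares against the original pre-trained parameters rather than a moving target.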
Related papers
- AMR Parsing with Instruction Fine-tuned Pre-trained Language Models [21.767812442354387]
In this paper, we take one such instruction fine-tuned language model, i.e., FLAN-T5, and fine-tune it for AMR parsing.
Our experiments on various AMR parsing tasks, including AMR 2.0, AMR 3.0, and BioAMR, indicate that FLAN-T5 fine-tuned models outperform previous state-of-the-art models.
arXiv Detail & Related papers (2023-04-24T17:12:17Z) - Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction [15.312021665242154]
We propose Uni-QSAR, a powerful Auto-ML tool for molecule property prediction tasks.
Uni-QSAR combines molecular representation learning (MRL) of 1D sequential tokens, 2D topology graphs, and 3D conformers with pretraining models to leverage rich representation from large-scale unlabeled data.
arXiv Detail & Related papers (2023-04-24T16:29:08Z) - Generating Query Focused Summaries without Fine-tuning the
Transformer-based Pre-trained Models [0.6124773188525718]
Fine-tuning Natural Language Processing (NLP) models for each new dataset requires additional computational time, with an increased carbon footprint and cost.
In this paper, we try to omit the fine-tuning step and investigate whether a Maximal Marginal Relevance (MMR)-based approach can help pre-trained models obtain query-focused summaries directly from a new dataset that was not used to pre-train the models.
As indicated by the experimental results, our MMR-based approach successfully ranked and selected the most relevant sentences as summaries and performed better than the individual pre-trained models (see the MMR sketch after this list).
arXiv Detail & Related papers (2023-03-10T22:40:15Z) - From Cloze to Comprehension: Retrofitting Pre-trained Masked Language
Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.
To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data.
PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
arXiv Detail & Related papers (2022-12-09T10:21:56Z) - MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided
Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z) - An EM Approach to Non-autoregressive Conditional Sequence Generation [49.11858479436565]
Autoregressive (AR) models have been the dominating approach to conditional sequence generation.
Non-autoregressive (NAR) models have been recently proposed to reduce the latency by generating all output tokens in parallel.
This paper proposes a new approach that jointly optimizes both AR and NAR models in a unified Expectation-Maximization framework.
arXiv Detail & Related papers (2020-06-29T20:58:57Z) - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z) - GPT-too: A language-model-first approach for AMR-to-text generation [22.65728041544785]
We propose an approach that combines a strong pre-trained language model with cycle consistency-based re-scoring.
Despite the simplicity of the approach, our experimental results show these models outperform all previous techniques.
arXiv Detail & Related papers (2020-05-18T22:50:26Z) - AMR Parsing via Graph-Sequence Iterative Inference [62.85003739964878]
We propose a new end-to-end model that treats AMR parsing as a series of dual decisions on the input sequence and the incrementally constructed graph.
We show that these sequence-side and graph-side decisions are mutually causal.
We design a model based on iterative inference that helps achieve better answers from both perspectives, leading to greatly improved parsing accuracy.
arXiv Detail & Related papers (2020-04-12T09:15:21Z)
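The MMR item above relies on a concrete greedy ranking criterion, so here is a minimal sketch of Maximal Marginal Relevance sentence selection. It assumes sentence and query embeddings come from some pre-trained encoder; the function names and the `lambda_param` trade-off value are illustrative, not taken from that paper.

```python
# Sketch of Maximal Marginal Relevance (MMR) sentence selection for
# query-focused summarization without fine-tuning. Assumptions: sentences and
# the query are already encoded as dense vectors by a pre-trained encoder;
# lambda_param (relevance vs. redundancy trade-off) is illustrative.
import numpy as np


def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def mmr_select(query_vec, sent_vecs, k=3, lambda_param=0.7):
    """Return indices of k sentences chosen greedily by the MMR criterion."""
    selected = []
    candidates = list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = cosine(sent_vecs[i], query_vec)
            redundancy = max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_param * relevance - (1.0 - lambda_param) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```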
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.