KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
- URL: http://arxiv.org/abs/2010.02307v2
- Date: Sun, 11 Oct 2020 18:09:49 GMT
- Title: KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
- Authors: Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Abstract summary: We propose knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
- Score: 100.79870384880333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-to-text generation has recently attracted substantial interest due to
its wide applications. Existing methods have shown impressive performance on an
array of tasks. However, they rely on a significant amount of labeled data for
each task, which is costly to acquire and thus limits their application to new
tasks and domains. In this paper, we propose to leverage pre-training and
transfer learning to address this issue. We propose knowledge-grounded
pre-training (KGPT), which consists of two parts: 1) a general
knowledge-grounded generation model that generates knowledge-enriched text, and 2) a
pre-training paradigm on a massive knowledge-grounded text corpus crawled from
the web. The pre-trained model can be fine-tuned on various data-to-text
generation tasks to generate task-specific text. We adopt three settings,
namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the fully-supervised setting, our model achieves remarkable gains over
the known baselines. Under the zero-shot setting, our model, without seeing any
examples, achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
Under the few-shot setting, our model needs only about one-fifteenth as many
labeled examples to achieve the same level of performance as baseline models.
These experiments consistently demonstrate the strong generalization ability of our
proposed framework. Code is available at https://github.com/wenhuchen/KGPT.
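
The snippet below is a minimal sketch of the workflow the abstract describes: a knowledge input (here, WebNLG-style triples) is linearized into a sequence, a pretrained sequence-to-sequence model is fine-tuned to verbalize it, and the output is scored with ROUGE-L. The checkpoint (facebook/bart-base), the <S>/<P>/<O> markers, the single example pair, and all hyperparameters are illustrative assumptions, not the authors' KGPT implementation; the official code is at the repository linked above.

```python
# Illustrative sketch only: linearize knowledge triples and fine-tune a
# generic pretrained seq2seq model to verbalize them. Checkpoint, markers,
# and hyperparameters are assumptions, not the actual KGPT setup
# (see https://github.com/wenhuchen/KGPT for the authors' code).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from rouge_score import rouge_scorer  # pip install rouge-score

MODEL_NAME = "facebook/bart-base"  # placeholder pretrained seq2seq backbone
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)


def linearize(triples):
    """Flatten (subject, predicate, object) triples into one input string."""
    return " ".join(f"<S> {s} <P> {p} <O> {o}" for s, p, o in triples)


# One WebNLG-style training pair: a small knowledge graph and a reference text.
triples = [("Alan Bean", "occupation", "test pilot"),
           ("Alan Bean", "birthPlace", "Wheeler, Texas")]
reference = "Alan Bean, who was born in Wheeler, Texas, worked as a test pilot."

inputs = tokenizer(linearize(triples), return_tensors="pt", truncation=True)
labels = tokenizer(reference, return_tensors="pt", truncation=True).input_ids

# A single supervised fine-tuning step (cross-entropy against the reference).
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Generation: verbalize the linearized graph with the (fine-tuned) model.
model.eval()
with torch.no_grad():
    output_ids = model.generate(**inputs, num_beams=4, max_length=64)
prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(prediction)

# Surface overlap with the reference can be scored with ROUGE-L,
# the metric reported for the zero-shot WebNLG experiments.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
print(scorer.score(reference, prediction)["rougeL"].fmeasure)
```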
Related papers
- Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and the scarcity of annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Using Large Language Models for Zero-Shot Natural Language Generation from Knowledge Graphs [4.56877715768796]
We show that ChatGPT achieves near state-of-the-art performance on some measures of the WebNLG 2020 challenge.
We also show that there is a significant connection between what the LLM already knows about the data it is parsing and the quality of the output text.
arXiv Detail & Related papers (2023-07-14T12:45:03Z)
- ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval [22.882301169283323]
We propose a retrieval-enhanced framework to create training data from a general-domain unlabeled corpus.
Experiments on nine datasets demonstrate that REGEN achieves a 4.3% gain over the strongest baselines and saves around 70% of the time compared to baselines using large NLG models.
arXiv Detail & Related papers (2023-05-18T04:30:09Z)
- Knowledge Graph Generation From Text [18.989264255589806]
We propose a novel end-to-end Knowledge Graph (KG) generation system from textual inputs.
The graph nodes are generated first using a pretrained language model, followed by a simple edge construction head.
We evaluated the model on the recent WebNLG 2020 Challenge dataset, matching state-of-the-art performance on the text-to-RDF generation task.
arXiv Detail & Related papers (2022-11-18T21:27:13Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Generate, Annotate, and Learn: Generative Models Advance Self-Training and Knowledge Distillation [58.64720318755764]
Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often hinges on the availability of task-specific unlabeled data.
Knowledge distillation (KD) has enabled compressing deep networks and ensembles, achieving the best results when distilling knowledge on fresh task-specific unlabeled examples.
We present a general framework called "generate, annotate, and learn (GAL)" that uses unconditional generative models to synthesize in-domain unlabeled data.
arXiv Detail & Related papers (2021-06-11T05:01:24Z)
- Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models [42.38563175680914]
This paper studies how to automatically generate natural language text that describes the facts in a knowledge graph (KG).
Considering the few-shot setting, we leverage the excellent capacities of pretrained language models (PLMs) in language understanding and generation.
arXiv Detail & Related papers (2021-06-03T06:48:00Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq-based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)