Stage-wise Fine-tuning for Graph-to-Text Generation
- URL: http://arxiv.org/abs/2105.08021v1
- Date: Mon, 17 May 2021 17:15:29 GMT
- Title: Stage-wise Fine-tuning for Graph-to-Text Generation
- Authors: Qingyun Wang, Semih Yavuz, Victoria Lin, Heng Ji, Nazneen Rajani
- Abstract summary: Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders.
We propose a structured graph-to-text model with a two-step fine-tuning mechanism that first fine-tunes the model on Wikipedia before adapting it to graph-to-text generation.
- Score: 25.379346921398326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-to-text generation has benefited from pre-trained language models
(PLMs) in achieving better performance than structured graph encoders. However,
PLMs fail to fully utilize the structure information of the input graph. In
this paper, we aim to further improve the performance of the pre-trained
language model by proposing a structured graph-to-text model with a two-step
fine-tuning mechanism which first fine-tunes the model on Wikipedia before
adapting it to graph-to-text generation. In addition to using the traditional
token and position embeddings to encode the knowledge graph (KG), we propose a
novel tree-level embedding method to capture the inter-dependency structures of
the input graph. This new approach significantly improves performance on
all text generation metrics for the English WebNLG 2017 dataset.
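As a rough illustration of the tree-level idea in the abstract, the Python sketch below assigns each token of a linearized KG a level id equal to its entity's depth from a chosen root entity. Everything here is an assumption made for illustration, not the authors' implementation: the [S]/[P]/[O] markers, the whitespace tokenization, the choice to give a predicate its subject's level, and the names tree_levels and linearize_with_levels. In a model, such level ids would presumably index a learned embedding table whose vectors are added to the token and position embeddings.

```python
from collections import deque


def tree_levels(triples, root):
    """Return the breadth-first depth of each entity reachable from `root`.

    `triples` is a list of (subject, predicate, object) strings.
    Hypothetical helper, not taken from the paper.
    """
    children = {}
    for subj, _, obj in triples:
        children.setdefault(subj, []).append(obj)

    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in depth:  # keep the shallowest depth found
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def linearize_with_levels(triples, root):
    """Linearize triples into tokens and a parallel list of level ids.

    Assumption: a predicate shares its subject's level, and entities
    not reachable from `root` fall back to level 0.
    """
    depth = tree_levels(triples, root)
    tokens, levels = [], []
    for subj, pred, obj in triples:
        parts = (
            ("[S]", subj, depth.get(subj, 0)),
            ("[P]", pred, depth.get(subj, 0)),
            ("[O]", obj, depth.get(obj, 0)),
        )
        for marker, text, level in parts:
            for token in [marker] + text.split():
                tokens.append(token)
                levels.append(level)
    return tokens, levels


# Made-up example graph, used only to exercise the functions.
triples = [
    ("Alan Bean", "occupation", "Test pilot"),
    ("Alan Bean", "mission", "Apollo 12"),
    ("Apollo 12", "operator", "NASA"),
]
tokens, levels = linearize_with_levels(triples, root="Alan Bean")
for token, level in zip(tokens, levels):
    print(f"{token:12s} level={level}")
```

In this example the root entity's tokens get level 0, its direct objects get level 1, and NASA, which is two hops away, gets level 2.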
Related papers
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- Knowledge Graph Generation From Text [18.989264255589806]
We propose a novel end-to-end Knowledge Graph (KG) generation system from textual inputs.
The graph nodes are generated first using a pretrained language model, followed by a simple edge construction head.
We evaluated the model on the recent WebNLG 2020 Challenge dataset, matching the state-of-the-art performance on the text-to-RDF generation task.
arXiv Detail & Related papers (2022-11-18T21:27:13Z)
- Improving Graph-Based Text Representations with Character and Word Level N-grams [30.699644290131044]
We propose a new word-character text graph that combines word and character n-gram nodes together with document nodes.
We also propose two new graph-based neural models, WCTextGCN and WCTextGAT, for modeling our proposed text graph.
arXiv Detail & Related papers (2022-10-12T08:07:54Z)
- GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation [3.593955557310285]
Recent improvements in KG-to-text generation are due to auxiliary pre-training tasks designed to give the fine-tune task a boost in performance.
Here, we demonstrate that by fusing graph-aware elements into existing pre-trained language models, we are able to outperform state-of-the-art models and close the gap imposed by additional pre-training tasks.
arXiv Detail & Related papers (2022-04-13T23:53:37Z)
- JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs [44.06715423776722]
We propose a graph-text joint representation learning model called JointGT.
During encoding, we devise a structure-aware semantic aggregation module which is plugged into each Transformer layer.
We show that JointGT obtains new state-of-the-art performance on various KG-to-text datasets.
arXiv Detail & Related papers (2021-06-19T14:10:10Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information on the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
- Structural Adapters in Pretrained Language Models for AMR-to-text Generation [59.50420985074769]
Previous work on text generation from graph-structured data relies on pretrained language models (PLMs).
We propose StructAdapt, an adapter method to encode graph structure into PLMs.
arXiv Detail & Related papers (2021-03-16T15:06:50Z)
- Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z)
- Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures.
Our findings motivate solutions to enrich the quality of models' implicit graph encodings.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z)
- Modeling Graph Structure via Relative Position for Text Generation from Knowledge Graphs [54.176285420428776]
We present Graformer, a novel Transformer-based encoder-decoder architecture for graph-to-text generation.
With our novel graph self-attention, the encoding of a node relies on all nodes in the input graph - not only direct neighbors - facilitating the detection of global patterns.
Graformer learns to weight these node-node relations differently for different attention heads, thus virtually learning differently connected views of the input graph.
arXiv Detail & Related papers (2020-06-16T15:20:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.