A multi-task semi-supervised framework for Text2Graph & Graph2Text
- URL: http://arxiv.org/abs/2202.06041v1
- Date: Sat, 12 Feb 2022 11:02:17 GMT
- Title: A multi-task semi-supervised framework for Text2Graph & Graph2Text
- Authors: Oriol Domingo, Marta R. Costa-jussà and Carlos Escolano
- Abstract summary: We jointly learn graph extraction from text and text generation from graphs.
Our approach surpasses unsupervised state-of-the-art results in text-to-graph and graph-to-text.
The resulting model can be easily trained in any new domain with non-parallel data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Artificial Intelligence industry regularly develops applications that mostly rely on Knowledge Bases, data repositories about specific or general domains, usually represented as graphs. Like other databases, they face two main challenges: information ingestion and information retrieval. We approach these challenges by jointly learning graph extraction from text and text generation from graphs. The proposed solution, a T5 architecture, is trained in a multi-task semi-supervised environment on our collected non-parallel data, following a cycle training regime. Experiments on the WebNLG dataset show that our approach surpasses unsupervised state-of-the-art results in text-to-graph and graph-to-text. More importantly, our framework is more consistent across seen and unseen domains than supervised models. The resulting model can easily be trained in any new domain with non-parallel data, simply by adding text and graphs about it to our cycle framework.
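The cycle training regime described in the abstract can be pictured as two back-translation loops sharing one seq2seq model: text is mapped to a pseudo graph and the model is trained to reconstruct the text, and vice versa. Below is a minimal sketch assuming a Hugging Face T5 checkpoint, illustrative task prefixes ("text to graph: ", "graph to text: "), and a simple triple linearization with "|" and "&&" delimiters; the authors' actual prompts, linearization, and training schedule are not given here and may differ.

```python
# Minimal sketch of multi-task cycle training with T5 (assumptions: prefixes,
# linearization format, and alternation schedule are illustrative only).
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5TokenizerFast.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

T2G, G2T = "text to graph: ", "graph to text: "  # assumed task prefixes

def cycle_step(src_prefix, tgt_prefix, batch):
    """Pseudo-label the batch with the forward task, then train the backward
    task to reconstruct the original inputs (one back-translation step)."""
    model.eval()
    with torch.no_grad():
        enc = tokenizer([src_prefix + s for s in batch],
                        return_tensors="pt", padding=True, truncation=True)
        pseudo = model.generate(**enc, max_new_tokens=128)
        pseudo = tokenizer.batch_decode(pseudo, skip_special_tokens=True)

    model.train()
    inputs = tokenizer([tgt_prefix + p for p in pseudo],
                       return_tensors="pt", padding=True, truncation=True)
    labels = tokenizer(batch, return_tensors="pt",
                       padding=True, truncation=True).input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Non-parallel data: alternate the two cycle directions.
texts = ["Alan Bean was born in Wheeler, Texas."]
graphs = ["Alan_Bean | birthPlace | Wheeler_Texas"]
cycle_step(T2G, G2T, texts)   # reconstruct the text from its predicted graph
cycle_step(G2T, T2G, graphs)  # reconstruct the graph from its generated text
```

In a semi-supervised setup such as the one described, these cycle steps would be interleaved with ordinary supervised steps on whatever parallel text-graph pairs are available.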
Related papers
- Bridging Local Details and Global Context in Text-Attributed Graphs [62.522550655068336]
GraphBridge is a framework that bridges local and global perspectives by leveraging contextual textual information.
Our method achieves state-of-the-art performance, while our graph-aware token reduction module significantly enhances efficiency and solves scalability issues.
arXiv Detail & Related papers (2024-06-18T13:35:25Z) - TAGA: Text-Attributed Graph Self-Supervised Learning by Synergizing Graph and Text Mutual Transformations [15.873944819608434]
Text-Attributed Graphs (TAGs) enhance graph structures with natural language descriptions.
This paper introduces a new self-supervised learning framework, Text-And-Graph Multi-View Alignment (TAGA), which integrates TAGs' structural and semantic dimensions.
Our framework demonstrates strong performance in zero-shot and few-shot scenarios across eight real-world datasets.
arXiv Detail & Related papers (2024-05-27T03:40:16Z) - UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs [30.635472655668078]
UniGraph learns a foundation model for Text-Attributed Graphs (TAGs) that generalizes to unseen graphs and tasks across diverse domains.
We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks.
We demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer.
arXiv Detail & Related papers (2024-02-21T09:06:31Z) - Pretraining Language Models with Text-Attributed Heterogeneous Graphs [28.579509154284448]
We present a new pretraining framework for Language Models (LMs) that explicitly considers the topological and heterogeneous information in Text-Attributed Heterogeneous Graphs (TAHGs).
We propose a topology-aware pretraining task to predict nodes involved in the context graph by jointly optimizing an LM and an auxiliary heterogeneous graph neural network.
We conduct link prediction and node classification tasks on three datasets from various domains.
arXiv Detail & Related papers (2023-10-19T08:41:21Z) - One for All: Towards Training One Graph Model for All Classification Tasks [61.656962278497225]
A unified model for various graph tasks remains underexplored, primarily due to the challenges unique to the graph learning domain.
We propose One for All (OFA), the first general framework that can use a single graph model to address the above challenges.
OFA performs well across different tasks, making it the first general-purpose, cross-domain classification model on graphs.
arXiv Detail & Related papers (2023-09-29T21:15:26Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - INFINITY: A Simple Yet Effective Unsupervised Framework for Graph-Text
Mutual Conversion [43.70416280548082]
Graph-to-text (G2T) generation and text-to-graph (T2G) triple extraction are essential tasks for constructing and applying knowledge graphs.
Existing unsupervised approaches are suitable candidates for jointly learning the two tasks because they avoid using graph-text parallel data.
We propose INFINITY, a simple yet effective unsupervised approach that does not require external annotation tools or additional parallel information.
arXiv Detail & Related papers (2022-09-22T03:12:43Z) - A Robust Stacking Framework for Training Deep Graph Models with
Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z) - EventNarrative: A large-scale Event-centric Dataset for Knowledge
Graph-to-Text Generation [8.216976747904726]
EventNarrative consists of approximately 230,000 graphs and their corresponding natural language text, 6 times larger than the current largest parallel dataset.
Our aim is two-fold: to help break new ground in event-centric research where data is lacking, and to give researchers a well-defined, large-scale dataset.
arXiv Detail & Related papers (2021-10-30T15:39:20Z) - GraphFormers: GNN-nested Transformers for Representation Learning on
Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information on the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z) - CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via
Cycle Training [63.11444020743543]
Deep learning models for graph-to-text (G2T) and text-to-graph (T2G) conversion suffer from scarce training data.
We present CycleGT, an unsupervised training method that can bootstrap from non-parallel graph and text data and iteratively back-translate between the two forms.
arXiv Detail & Related papers (2020-06-08T15:59:00Z)