GNN-LM: Language Modeling based on Global Contexts via GNN
- URL: http://arxiv.org/abs/2110.08743v1
- Date: Sun, 17 Oct 2021 07:18:21 GMT
- Title: GNN-LM: Language Modeling based on Global Contexts via GNN
- Authors: Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Fei Wu,
Jiwei Li
- Abstract summary: We introduce GNN-LM, which extends the vanilla neural language model (LM) by allowing it to reference similar contexts in the entire training corpus.
GNN-LM achieves a new state-of-the-art perplexity of 14.8 on WikiText-103.
- Score: 32.52117529283929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by the notion that "to copy is easier than to memorize",
in this work we introduce GNN-LM, which extends the vanilla neural language
model (LM) by allowing it to reference similar contexts in the entire training
corpus.
We build a directed heterogeneous graph between an input context and its
semantically related neighbors selected from the training corpus, where nodes
are tokens in the input context and retrieved neighbor contexts, and edges
represent connections between nodes. Graph neural networks (GNNs) are
constructed upon the graph to aggregate information from similar contexts to
decode the token. This learning paradigm provides direct access to the
reference contexts and helps improve a model's generalization ability. We
conduct comprehensive experiments to validate the effectiveness of the GNN-LM:
GNN-LM achieves a new state-of-the-art perplexity of 14.8 on WikiText-103 (a
4.5-point improvement over its vanilla LM counterpart) and shows substantial
improvements on the One Billion Word and enwik8 datasets against strong
baselines. In-depth ablation studies are performed to understand the mechanics
of GNN-LM.
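To make the mechanism concrete, below is a minimal PyTorch sketch of the retrieve-link-aggregate skeleton described in the abstract. It is an illustration, not the authors' implementation: the random stand-in features, the hand-built edge list, and the single toy attention layer are all assumptions.

```python
# Sketch of the GNN-LM idea: link tokens of retrieved neighbor contexts
# to tokens of the input context in a small directed graph, then run one
# attention-weighted aggregation round before decoding the next token.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 64                       # hidden size (assumption)
n_ctx, k, n_nb = 8, 2, 8     # input tokens, retrieved contexts, tokens each

# Stand-ins for hidden states from a base LM: the input context and
# the tokens of k retrieved neighbor contexts.
h_input = torch.randn(n_ctx, d)
h_neighbors = torch.randn(k * n_nb, d)
h = torch.cat([h_input, h_neighbors], dim=0)   # first n_ctx rows = input

# Directed edges: every neighbor token -> every input token
# (inter-context), plus left-to-right edges inside the input
# (intra-context). A real build would use token positions and kNN ids.
src, dst = [], []
for j in range(n_ctx, n_ctx + k * n_nb):       # inter-context edges
    for i in range(n_ctx):
        src.append(j); dst.append(i)
for i in range(1, n_ctx):                      # intra-context edges
    src.append(i - 1); dst.append(i)
src, dst = torch.tensor(src), torch.tensor(dst)

# One attention-style aggregation step (a toy stand-in for the
# heterogeneous GNN layers in the paper).
W_q, W_k, W_v = (torch.randn(d, d) * d ** -0.5 for _ in range(3))
scores = ((h[dst] @ W_q) * (h[src] @ W_k)).sum(-1) / d ** 0.5
alpha = torch.zeros_like(scores)
for node in range(h.size(0)):                  # softmax per destination node
    mask = dst == node
    if mask.any():
        alpha[mask] = F.softmax(scores[mask], dim=0)
msg = alpha.unsqueeze(-1) * (h[src] @ W_v)
agg = torch.zeros_like(h).index_add_(0, dst, msg)
h_out = h + agg                                # residual update of every node

# The updated state of the last input token would feed the LM's output
# softmax to decode the next token.
print(h_out[n_ctx - 1].shape)                  # torch.Size([64])
```

In the paper the graph is heterogeneous (distinct intra- and inter-context node and edge types) and several GNN layers sit on top of a pretrained Transformer; the sketch keeps only the skeleton: retrieve, link, aggregate, decode.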
Related papers
- Can Large Language Models Act as Ensembler for Multi-GNNs? [6.387816922598151]
Graph Neural Networks (GNNs) have emerged as powerful models for learning from graph-structured data.
GNNs lack the inherent ability to semantically understand rich textual node attributes, limiting their effectiveness in applications.
This research advances text-attributed graph ensemble learning by providing a robust, superior solution for integrating semantic and structural information.
arXiv Detail & Related papers (2024-10-22T08:48:52Z)
- Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
- PROXI: Challenging the GNNs for Link Prediction [3.8233569758620063]
We introduce PROXI, which leverages proximity information of node pairs in both graph and attribute spaces.
Standard machine learning (ML) models perform competitively, even outperforming cutting-edge GNN models.
We show that augmenting traditional GNNs with PROXI significantly boosts their link prediction performance.
arXiv Detail & Related papers (2024-10-02T17:57:38Z)
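As a toy illustration of the PROXI recipe above, one can score node pairs with simple graph-space and attribute-space proximity features and fit a standard classifier. The exact features below, the toy graph, and the logistic-regression head are assumptions, not the paper's feature set.

```python
# Proximity features for node pairs + a standard ML link predictor.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
G = nx.erdos_renyi_graph(60, 0.1, seed=0)          # toy graph
X = rng.normal(size=(60, 16))                      # toy node attributes

def pair_features(u, v):
    common = len(list(nx.common_neighbors(G, u, v)))   # graph proximity
    cos = X[u] @ X[v] / (np.linalg.norm(X[u]) * np.linalg.norm(X[v]) + 1e-9)
    return [common, cos]                               # attribute proximity

pos = list(G.edges())                               # existing links
neg = [tuple(rng.choice(60, 2, replace=False)) for _ in range(len(pos))]
pairs = pos + [p for p in neg if not G.has_edge(*p)]
y = [1] * len(pos) + [0] * (len(pairs) - len(pos))
feats = np.array([pair_features(u, v) for u, v in pairs])

clf = LogisticRegression().fit(feats, y)            # standard ML model
print("train accuracy:", clf.score(feats, y))
```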
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
- Dr.E Bridges Graphs with Large Language Models through Words [12.22063024099311]
We introduce an end-to-end modality-aligning framework for LLM-graph alignment: the Dual-Residual Vector Quantized-Variational AutoEncoder (Dr.E).
Our approach is purposefully designed to facilitate token-level alignment with LLMs, enabling an effective translation of the intrinsic 'language' of graphs into comprehensible natural language.
arXiv Detail & Related papers (2024-06-19T16:43:56Z)
- LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework [30.54068909225463]
We aim to streamline the GNN design process and leverage the advantages of Large Language Models (LLMs) to improve the performance of GNNs on downstream tasks.
We formulate a new paradigm, coined "LLMs-as-Consultants," which integrates LLMs with GNNs in an interactive manner.
We empirically evaluate the effectiveness of LOGIN on node classification tasks across both homophilic and heterophilic graphs.
arXiv Detail & Related papers (2024-05-22T18:17:20Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is the use of explanations as features to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
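A loose sketch of the explanations-as-features idea above: embed an explanation string for each node and concatenate it with the node's original features before any downstream GNN. The placeholder text encoder and explanation strings are assumptions; the paper uses an LLM plus a fine-tuned LM interpreter.

```python
# Explanations as extra node features (toy placeholder pipeline).
import numpy as np

rng = np.random.default_rng(0)
n, d_feat, d_text = 4, 8, 5
node_feats = rng.normal(size=(n, d_feat))
explanations = [f"why node {i} belongs to its class ..." for i in range(n)]

def embed(text: str) -> np.ndarray:
    # Placeholder text encoder; a real pipeline would use a fine-tuned LM.
    rs = np.random.default_rng(abs(hash(text)) % (2**32))
    return rs.normal(size=d_text)

text_feats = np.stack([embed(e) for e in explanations])
enriched = np.concatenate([node_feats, text_feats], axis=1)
print(enriched.shape)   # (4, 13): inputs for any downstream GNN
```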
- Graph Neural Networks for Natural Language Processing: A Survey [64.36633422999905]
We present a comprehensive overview of Graph Neural Networks (GNNs) for Natural Language Processing.
We propose a new taxonomy of GNNs for NLP, which organizes existing research of GNNs for NLP along three axes: graph construction, graph representation learning, and graph-based encoder-decoder models.
arXiv Detail & Related papers (2021-06-10T23:59:26Z)
- InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem? [66.70154236519186]
Sentence insertion is a delicate but fundamental NLP problem.
Current approaches in sentence ordering, text coherence, and question answering (QA) are neither suitable for nor effective at solving it.
We propose InsertGNN, a model that represents the problem as a graph and adopts a Graph Neural Network (GNN) to learn the connections between sentences.
arXiv Detail & Related papers (2021-03-28T06:50:31Z)
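A loose sketch of the InsertGNN formulation above: paragraph sentences and the candidate sentence become nodes of a small graph, messages are passed, and each insertion slot is scored. The random embeddings, hand-built graph, and simplified one-score-per-sentence slot head are illustrative assumptions, not the paper's architecture.

```python
# Sentence insertion as graph scoring (toy version).
import torch
import torch.nn as nn

torch.manual_seed(0)
d, n_sent = 16, 5
sents = torch.randn(n_sent, d)       # paragraph sentence embeddings
cand = torch.randn(1, d)             # the sentence to insert

# Chain-connect the paragraph in reading order; fully connect the
# candidate to every sentence.
X = torch.cat([sents, cand])         # node features, candidate is last
A = torch.zeros(n_sent + 1, n_sent + 1)
for i in range(n_sent - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0  # chain edges
A[:-1, -1] = A[-1, :-1] = 1.0        # candidate <-> every sentence
A.fill_diagonal_(1.0)
A = A / A.sum(1, keepdim=True)       # row-normalized for mean aggregation

gnn = nn.Linear(d, d)
H = torch.relu(gnn(A @ X))           # one round of message passing

# Score each slot by pairing a sentence with the updated candidate
# (one score per existing sentence; a fuller model scores all gaps).
score_head = nn.Linear(2 * d, 1)
slots = torch.cat([H[:-1], H[-1].expand(n_sent, d)], dim=-1)
print(score_head(slots).squeeze(-1).softmax(0))  # slot probabilities
```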
- Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors.
It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features.
We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs into a combined learning process.
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
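The Policy-GNN entry above lends itself to a toy sketch: a small policy network samples, per node, how many rounds of neighbor aggregation that node receives. The module sizes and mean-aggregation rule are assumptions, and the joint reinforcement-learning training from the paper is omitted.

```python
# Per-node aggregation depth chosen by a policy network (toy version).
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, max_hops = 10, 8, 3
X = torch.randn(n, d)                         # node features
A = (torch.rand(n, n) < 0.3).float()          # toy adjacency
A = ((A + A.T) > 0).float()
A.fill_diagonal_(1.0)
A = A / A.sum(1, keepdim=True)                # row-normalized for mean agg

policy = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, max_hops))
hops = torch.distributions.Categorical(logits=policy(X)).sample() + 1

# Precompute 1..max_hops rounds of mean aggregation, then give each
# node the output of the depth its policy chose.
H, outs = X, []
for _ in range(max_hops):
    H = A @ H
    outs.append(H)
H_stacked = torch.stack(outs)                 # (max_hops, n, d)
H_out = H_stacked[hops - 1, torch.arange(n)]  # per-node chosen depth

print(hops.tolist())                          # each node's sampled depth
print(H_out.shape)                            # torch.Size([10, 8])
```

In the paper the policy and the GNN parameters are optimized together, so the sampled depths adapt as node representations change; this sketch shows only the per-node depth routing.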