GNN-LM: Language Modeling based on Global Contexts via GNN
- URL: http://arxiv.org/abs/2110.08743v1
- Date: Sun, 17 Oct 2021 07:18:21 GMT
- Title: GNN-LM: Language Modeling based on Global Contexts via GNN
- Authors: Yuxian Meng, Shi Zong, Xiaoya Li, Xiaofei Sun, Tianwei Zhang, Fei Wu,
Jiwei Li
- Abstract summary: We introduce GNN-LM, which extends the vanilla neural language model (LM) by allowing it to reference similar contexts in the entire training corpus.
GNN-LM achieves a new state-of-the-art perplexity of 14.8 on WikiText-103.
- Score: 32.52117529283929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by the notion that "to copy is easier than to memorize", in
this work we introduce GNN-LM, which extends the vanilla neural language model
(LM) by allowing it to reference similar contexts in the entire training corpus.
We build a directed heterogeneous graph between an input context and its
semantically related neighbors selected from the training corpus, where nodes
are tokens in the input context and retrieved neighbor contexts, and edges
represent connections between nodes. Graph neural networks (GNNs) are
constructed upon the graph to aggregate information from similar contexts to
decode the token. This learning paradigm provides direct access to the
reference contexts and helps improve a model's generalization ability. We
conduct comprehensive experiments to validate the effectiveness of the GNN-LM:
GNN-LM achieves a new state-of-the-art perplexity of 14.8 on WikiText-103 (a
4.5 point improvement over its vanilla LM counterpart) and shows
substantial improvement on One Billion Word and Enwiki8 datasets against strong
baselines. In-depth ablation studies are performed to understand the mechanics
of GNN-LM.
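The retrieve-then-aggregate idea described in the abstract can be sketched in a few lines: embed the current input context, fetch its most similar cached training contexts, and let the context attend over the retrieved neighbors before decoding. The sketch below is illustrative only, assuming NumPy and made-up names (`retrieve_neighbors`, `aggregate`); it is not the paper's implementation.

```python
import numpy as np

def retrieve_neighbors(query, datastore, k=3):
    """Return indices of the k datastore contexts most similar to the
    query (cosine similarity) -- GNN-LM's neighbor retrieval step."""
    q = query / np.linalg.norm(query)
    d = datastore / np.linalg.norm(datastore, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(-sims)[:k]

def aggregate(query, neighbor_feats):
    """One attention-weighted aggregation: the input-context node attends
    over its retrieved neighbors (a stand-in for the paper's GNN layer)."""
    scores = neighbor_feats @ query / np.sqrt(len(query))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ neighbor_feats

rng = np.random.default_rng(0)
datastore = rng.normal(size=(100, 16))  # cached training-context embeddings
query = rng.normal(size=16)             # current input-context embedding

idx = retrieve_neighbors(query, datastore, k=3)
fused = aggregate(query, datastore[idx])
print(idx.shape, fused.shape)  # (3,) (16,)
```

In the paper, the retrieved neighbors and the input tokens form the nodes of a directed heterogeneous graph, and several GNN layers replace the single softmax-weighted average shown here.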
Related papers
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes in the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
- LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework [30.54068909225463]
We aim to streamline the GNN design process and leverage the advantages of Large Language Models (LLMs) to improve the performance of GNNs on downstream tasks.
We formulate a new paradigm, coined "LLMs-as-Consultants," which integrates LLMs with GNNs in an interactive manner.
We empirically evaluate the effectiveness of LOGIN on node classification tasks across both homophilic and heterophilic graphs.
arXiv Detail & Related papers (2024-05-22T18:17:20Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
- Efficient and effective training of language and graph neural network models [36.00479096375565]
We put forth an efficient and effective framework termed language model GNN (LM-GNN) to jointly train large-scale language models and graph neural networks.
The effectiveness of our framework is achieved by stage-wise fine-tuning of the BERT model, first with heterogeneous graph information and then with a GNN model.
We evaluate the LM-GNN framework on different datasets and showcase the effectiveness of the proposed approach.
arXiv Detail & Related papers (2022-06-22T00:23:37Z)
- Graph Neural Networks for Natural Language Processing: A Survey [64.36633422999905]
We present a comprehensive overview of Graph Neural Networks (GNNs) for Natural Language Processing.
We propose a new taxonomy of GNNs for NLP, which organizes existing research of GNNs for NLP along three axes: graph construction, graph representation learning, and graph-based encoder-decoder models.
arXiv Detail & Related papers (2021-06-10T23:59:26Z)
- InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem? [66.70154236519186]
Sentence insertion is a delicate but fundamental NLP problem.
Current approaches in sentence ordering, text coherence, and question answering (QA) are neither suitable for nor effective at solving it.
We propose InsertGNN, a model that represents the problem as a graph and adopts a graph neural network (GNN) to learn the connections between sentences.
arXiv Detail & Related papers (2021-03-28T06:50:31Z)
- Enhance Information Propagation for Graph Neural Network by Heterogeneous Aggregations [7.3136594018091134]
Graph neural networks are emerging as a continuation of deep learning's success on graph data.
We propose to enhance information propagation among GNN layers by combining heterogeneous aggregations.
We empirically validate the effectiveness of HAG-Net on a number of graph classification benchmarks.
arXiv Detail & Related papers (2021-02-08T08:57:56Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Policy-GNN: Aggregation Optimization for Graph Neural Networks [60.50932472042379]
Graph neural networks (GNNs) aim to model the local graph structures and capture the hierarchical patterns by aggregating the information from neighbors.
It is a challenging task to develop an effective aggregation strategy for each node, given complex graphs and sparse features.
We propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs as a combined learning process.
arXiv Detail & Related papers (2020-06-26T17:03:06Z)
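Several entries above (Policy-GNN, HAG-Net, LM-GNN) refine the same primitive: each node aggregates its neighbors' features via message passing. A minimal mean-aggregation step, purely illustrative and not any one paper's exact formulation:

```python
import numpy as np

def message_pass(feats, adj):
    """One mean-aggregation message-passing step: each node averages its
    neighbors' features. Illustrative sketch of the basic GNN operation."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1  # avoid division by zero for isolated nodes
    return (adj @ feats) / deg

# toy undirected path graph 0-1-2, with one-hot node features
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.eye(3)
print(message_pass(feats, adj))
```

HAG-Net's heterogeneous aggregations combine several such operators (e.g. mean, max, sum), while Policy-GNN learns per-node how many of these steps to apply.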
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.