Leveraging Large Language Models for Node Generation in Few-Shot Learning on Text-Attributed Graphs
- URL: http://arxiv.org/abs/2310.09872v2
- Date: Tue, 10 Dec 2024 16:06:29 GMT
- Title: Leveraging Large Language Models for Node Generation in Few-Shot Learning on Text-Attributed Graphs
- Authors: Jianxiang Yu, Yuxiang Ren, Chenghua Gong, Jiaqi Tan, Xiang Li, Xuecang Zhang
- Abstract summary: We propose a plug-and-play approach to empower text-attributed graphs through node generation using Large Language Models (LLMs). LLMs extract semantic information from labels and generate samples that belong to categories as exemplars. We employ an edge predictor to capture structural information inherent in the raw dataset and integrate the newly generated samples into the original graph.
- Score: 5.587264586806575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-attributed graphs have recently garnered significant attention due to their wide range of applications in web domains. Existing methodologies employ word embedding models for acquiring text representations as node features, which are subsequently fed into Graph Neural Networks (GNNs) for training. Recently, the advent of Large Language Models (LLMs) has introduced their powerful capabilities in information retrieval and text generation, which can greatly enhance the text attributes of graph data. Furthermore, the acquisition and labeling of extensive datasets are both costly and time-consuming endeavors. Consequently, few-shot learning has emerged as a crucial problem in the context of graph learning tasks. In order to tackle this challenge, we propose a lightweight paradigm called LLM4NG, which adopts a plug-and-play approach to empower text-attributed graphs through node generation using LLMs. Specifically, we utilize LLMs to extract semantic information from the labels and generate samples that belong to these categories as exemplars. Subsequently, we employ an edge predictor to capture the structural information inherent in the raw dataset and integrate the newly generated samples into the original graph. This approach harnesses LLMs for enhancing class-level information and seamlessly introduces labeled nodes and edges without modifying the raw dataset, thereby facilitating the node classification task in few-shot scenarios. Extensive experiments demonstrate the outstanding performance of our proposed paradigm, particularly in low-shot scenarios. For instance, in the 1-shot setting of the ogbn-arxiv dataset, LLM4NG achieves a 76% improvement over the baseline model.
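The integration step described in the abstract (attach generated, labeled nodes to the original graph without modifying the raw data) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes an edge predictor that captures structure from the raw dataset, whereas the sketch below swaps in plain cosine similarity as a stand-in scorer. The function names, the `top_k` parameter, and the toy labels are all hypothetical, and the LLM call that would produce the exemplar texts and their embeddings is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def add_generated_nodes(features, edges, labels, generated, top_k=2):
    """Return extended copies of (features, edges, labels); inputs are not mutated.

    features:  list of feature vectors for the original nodes
    edges:     set of (u, v) pairs with u < v
    labels:    dict node_id -> class label for the few labeled nodes
    generated: list of (feature_vector, class_label) pairs, assumed to be
               embeddings of LLM-written exemplar texts (hypothetical input)
    """
    new_features = list(features)
    new_edges = set(edges)
    new_labels = dict(labels)
    n_original = len(features)
    for vec, label in generated:
        node_id = len(new_features)
        new_features.append(vec)
        new_labels[node_id] = label  # generated nodes arrive already labeled
        # Stand-in for a learned edge predictor: rank original nodes by similarity.
        ranked = sorted(
            range(n_original),
            key=lambda j: cosine(vec, features[j]),
            reverse=True,
        )
        for j in ranked[:top_k]:
            new_edges.add((j, node_id))  # j < node_id always holds here
    return new_features, new_edges, new_labels


# Toy graph: two clusters of two nodes each, one labeled node (1-shot for one class).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edges = {(0, 1), (2, 3)}
labels = {0: "cs.LG"}
generated = [([0.0, 1.0], "cs.CV")]  # one generated exemplar for an unseen class

feats, new_edges, new_labels = add_generated_nodes(features, edges, labels, generated)
print(sorted(new_edges))
print(new_labels)
```

In this toy run the generated node receives id 4 and is linked to nodes 2 and 3, its two most similar original nodes, while the original `edges` set and `labels` dict stay untouched. A GNN would then be trained on the extended graph, with the generated node serving as an extra labeled example.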
Related papers
- Training Large Recommendation Models via Graph-Language Token Alignment [53.3142545812349]
We propose a novel framework to train Large Recommendation models via Graph-Language Token Alignment.
By aligning item and user nodes from the interaction graph with pretrained LLM tokens, GLTA effectively leverages the reasoning abilities of LLMs.
Furthermore, we introduce Graph-Language Logits Matching (GLLM) to optimize token alignment for end-to-end item prediction.
arXiv Detail & Related papers (2025-02-26T02:19:10Z)
- Deep Semantic Graph Learning via LLM based Node Enhancement [5.312946761836463]
Large Language Models (LLMs) have demonstrated superior capabilities in understanding text semantics.
This paper proposes a novel framework that combines Graph Transformer architecture with LLM-enhanced node features.
arXiv Detail & Related papers (2025-02-11T21:55:46Z)
- Large Language Model-based Augmentation for Imbalanced Node Classification on Text-Attributed Graphs [13.42259312243504]
We propose a novel approach called LA-TAG (LLM-based Augmentation on Text-Attributed Graphs).
It prompts Large Language Models to generate synthetic texts based on existing node texts in the graph.
To integrate these synthetic text-attributed nodes into the graph, we introduce a text-based link predictor.
arXiv Detail & Related papers (2024-10-22T10:36:15Z)
- Let's Ask GNN: Empowering Large Language Model for Graph In-Context Learning [28.660326096652437]
We introduce AskGNN, a novel approach that bridges the gap between sequential text processing and graph-structured data.
AskGNN employs a Graph Neural Network (GNN)-powered structure-enhanced retriever to select labeled nodes across graphs.
Experiments across three tasks and seven LLMs demonstrate AskGNN's superior effectiveness in graph task performance.
arXiv Detail & Related papers (2024-10-09T17:19:12Z)
- Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
- STAGE: Simplified Text-Attributed Graph Embeddings Using Pre-trained LLMs [1.4624458429745086]
We present a method for enhancing node features in Graph Neural Network (GNN) models that encode Text-Attributed Graphs (TAGs).
Our approach leverages Large-Language Models (LLMs) to generate embeddings for textual attributes.
We show that utilizing pre-trained LLMs as embedding generators provides robust features for ensemble GNN training.
arXiv Detail & Related papers (2024-07-10T08:50:25Z)
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware Parameter-Efficient Fine-Tuning (GPEFT), a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
- Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z)
- Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs [59.74814230246034]
Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities.
We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.