Few-Shot Graph Out-of-Distribution Detection with LLMs
- URL: http://arxiv.org/abs/2503.22097v1
- Date: Fri, 28 Mar 2025 02:37:18 GMT
- Title: Few-Shot Graph Out-of-Distribution Detection with LLMs
- Authors: Haoyan Xu, Zhengtao Yao, Yushun Dong, Ziyi Wang, Ryan A. Rossi, Mengyuan Li, Yue Zhao
- Abstract summary: We propose a framework that combines the strengths of large language models (LLMs) and graph neural networks (GNNs) to enhance data efficiency in graph out-of-distribution (OOD) detection. We show that LLM-GOOD significantly reduces human annotation costs and outperforms state-of-the-art baselines in terms of both ID classification accuracy and OOD detection performance.
- Score: 34.42512005781724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing methods for graph out-of-distribution (OOD) detection typically depend on training graph neural network (GNN) classifiers using a substantial amount of labeled in-distribution (ID) data. However, acquiring high-quality labeled nodes in text-attributed graphs (TAGs) is challenging and costly due to their complex textual and structural characteristics. Large language models (LLMs), known for their powerful zero-shot capabilities in textual tasks, show promise but struggle to naturally capture the critical structural information inherent to TAGs, limiting their direct effectiveness. To address these challenges, we propose LLM-GOOD, a general framework that effectively combines the strengths of LLMs and GNNs to enhance data efficiency in graph OOD detection. Specifically, we first leverage LLMs' strong zero-shot capabilities to filter out likely OOD nodes, significantly reducing the human annotation burden. To minimize the usage and cost of the LLM, we employ it only to annotate a small subset of unlabeled nodes. We then train a lightweight GNN filter using these noisy labels, enabling efficient predictions of ID status for all other unlabeled nodes by leveraging both textual and structural information. After obtaining node embeddings from the GNN filter, we can apply informativeness-based methods to select the most valuable nodes for precise human annotation. Finally, we train the target ID classifier using these accurately annotated ID nodes. Extensive experiments on four real-world TAG datasets demonstrate that LLM-GOOD significantly reduces human annotation costs and outperforms state-of-the-art baselines in terms of both ID classification accuracy and OOD detection performance.
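Read as a recipe, the abstract describes four stages. The following is a minimal sketch of that control flow, not the paper's implementation: it assumes a PyTorch Geometric-style `data` object carrying a hypothetical `data.text` list of raw node texts, a caller-supplied `llm_is_id` zero-shot judgment function, a random choice of which nodes to query (the paper's selection may differ), and predictive entropy as one possible informativeness criterion.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


class GNNFilter(torch.nn.Module):
    """Lightweight binary GNN filter: predicts whether a node is ID or OOD."""

    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, 2)  # two classes: OOD (0) vs. ID (1)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index), h  # logits and node embeddings


def llm_good_sketch(data, llm_is_id, query_budget, annotation_budget):
    # Stage 1: spend a small LLM budget on zero-shot ID/OOD pre-filtering.
    queried = torch.randperm(data.num_nodes)[:query_budget]
    noisy_y = torch.tensor([int(llm_is_id(data.text[i])) for i in queried])

    # Stage 2: train the lightweight GNN filter on these noisy labels so it
    # can predict ID status for all other nodes from text and structure.
    model = GNNFilter(data.x.size(1))
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    for _ in range(100):
        opt.zero_grad()
        logits, _ = model(data.x, data.edge_index)
        F.cross_entropy(logits[queried], noisy_y).backward()
        opt.step()

    # Stage 3: among nodes the filter deems ID, pick the most informative
    # ones (predictive entropy here, as one possible criterion) for humans.
    logits, emb = model(data.x, data.edge_index)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=1)
    entropy[probs[:, 1] <= 0.5] = -1.0  # exclude likely-OOD nodes
    to_annotate = entropy.topk(annotation_budget).indices

    # Stage 4: the returned indices go to human annotators; the target ID
    # classifier is then trained on those accurately labeled ID nodes.
    return to_annotate, emb
```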
Related papers
- Refining Interactions: Enhancing Anisotropy in Graph Neural Networks with Language Semantics [6.273224130511677]
We introduce LanSAGNN (Language Semantic Anisotropic Graph Neural Network), a framework that extends the concept of anisotropic GNNs to the natural language level.
We propose an efficient dual-layer LLM fine-tuning architecture to better align LLM outputs with graph tasks.
arXiv Detail & Related papers (2025-04-02T07:32:45Z)
- Leveraging Large Language Models for Effective Label-free Node Classification in Text-Attributed Graphs [10.538099379851198]
Locle is an active self-training framework that performs label-free node classification with LLMs cost-effectively.
It iteratively identifies small sets of "critical" samples using GNNs and extracts informative pseudo-labels for them with both LLMs and GNNs.
It significantly outperforms state-of-the-art methods on label-free node classification under the same LLM query budget; one round of this loop is sketched after the entry.
arXiv Detail & Related papers (2024-12-16T17:04:40Z)
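One round of the loop described above might look as follows; `gnn` (with a `fit_predict` method), `llm_label`, and the low-confidence notion of "critical" are illustrative placeholders rather than Locle's actual design:

```python
def locle_round(data, train_idx, pseudo_y, gnn, llm_label, k, llm_budget):
    """One illustrative round of cost-effective active self-training."""
    # Fit the GNN on current pseudo-labels and get its class posteriors.
    probs = gnn.fit_predict(data, train_idx, pseudo_y)
    conf = probs.max(dim=1).values

    # "Critical" samples: unlabeled nodes the GNN is least confident about.
    labeled = set(train_idx)
    unlabeled = [i for i in range(data.num_nodes) if i not in labeled]
    critical = sorted(unlabeled, key=lambda i: conf[i].item())[:k]

    # Spend the LLM query budget only on critical nodes; elsewhere the
    # GNN's own pseudo-labels are kept for the next round.
    for i in critical[:llm_budget]:
        pseudo_y[i] = llm_label(data.text[i])
        train_idx.append(i)
    return train_idx, pseudo_y
```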
- LEGO-Learn: Label-Efficient Graph Open-Set Learning [46.62885412695813]
Graph open-set learning (GOL) and out-of-distribution (OOD) detection train models that can accurately classify known, in-distribution (ID) classes while recognizing unseen, OOD inputs.
This is critical for high-stakes, real-world applications such as finance, security, and healthcare, where models frequently encounter unexpected data.
We propose LEGO-Learn, a novel framework that tackles open-set node classification on graphs within a given label budget by selecting the most informative ID nodes.
arXiv Detail & Related papers (2024-10-21T18:01:11Z)
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs), with their pretrained knowledge and powerful semantic comprehension, have recently shown remarkable benefits for applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of the graph's nodes; a sketch of this budgeted routing follows the entry.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
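The "limited fraction" routing could be approximated as below; the degree-based choice of which nodes deserve LLM enhancement, and the `llm_enhance`/`encoder` callables, are assumptions for illustration only:

```python
import torch

def enhance_subset(data, llm_enhance, encoder, frac=0.1):
    """Route only a budgeted fraction of nodes through the LLM service."""
    # Heuristic assumption: prioritize high-degree nodes, whose features
    # are reused most often during message passing.
    deg = torch.bincount(data.edge_index[0], minlength=data.num_nodes)
    budget = max(1, int(frac * data.num_nodes))
    chosen = deg.topk(budget).indices

    x = data.x.clone()
    for i in chosen.tolist():
        # Replace the node feature with an embedding of LLM-rewritten text.
        x[i] = encoder(llm_enhance(data.text[i]))
    return x  # enriched features feed an ordinary GNN's message passing
```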
- Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware Parameter-Efficient Fine-Tuning (GPEFT), a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations; a sketch of the graph-prompt module follows the entry.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
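A rough sketch of a graph-prompt module in this spirit: a GNN encodes the neighborhood, and a projection maps that encoding to a few soft tokens in the LLM's embedding space. The token count and module shapes are assumptions, not GPEFT's published configuration:

```python
import torch
from torch_geometric.nn import GCNConv

class GraphPrompt(torch.nn.Module):
    """Encode a node's neighborhood into soft prompt tokens for a frozen LLM."""

    def __init__(self, feat_dim, llm_dim, n_tokens=4):
        super().__init__()
        self.gnn = GCNConv(feat_dim, llm_dim)          # structural encoder
        self.proj = torch.nn.Linear(llm_dim, n_tokens * llm_dim)
        self.n_tokens, self.llm_dim = n_tokens, llm_dim

    def forward(self, x, edge_index, node_id):
        h = self.gnn(x, edge_index)[node_id]           # neighborhood summary
        return self.proj(h).view(self.n_tokens, self.llm_dim)

# These soft tokens are prepended to the node text's token embeddings;
# only the prompt module (and any PEFT adapters) receives gradients.
```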
- Similarity-based Neighbor Selection for Graph LLMs [43.176381523196426]
We introduce Similarity-based Neighbor Selection (SNS).
SNS improves the quality of selected neighbors, thereby improving graph representation and alleviating issues like over-squashing and heterophily.
As an inductive and training-free approach, SNS demonstrates superior generalization and scalability over traditional GNN methods; a minimal sketch follows the entry.
arXiv Detail & Related papers (2024-02-06T05:29:05Z)
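A minimal, training-free version of similarity-based neighbor selection; cosine similarity over raw node features and a per-node top-k are one plausible instantiation, not necessarily the paper's scoring:

```python
import torch
import torch.nn.functional as F

def select_neighbors(x, edge_index, k=5):
    """Keep, for each node, its k most feature-similar incoming neighbors."""
    x = F.normalize(x, dim=1)              # unit vectors -> dot = cosine sim
    src, dst = edge_index
    sim = (x[src] * x[dst]).sum(dim=1)     # one similarity score per edge

    keep = []
    for v in range(x.size(0)):
        edges = (dst == v).nonzero(as_tuple=True)[0]
        top = sim[edges].topk(min(k, edges.numel())).indices
        keep.append(edges[top])
    return edge_index[:, torch.cat(keep)]  # pruned edge list
```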
- Label-free Node Classification on Graphs with Large Language Models (LLMs) [46.937442239949256]
This work introduces LLM-GNN, a pipeline for label-free node classification on graphs with Large Language Models.
It leverages the strengths of both GNNs and LLMs while mitigating their limitations.
In particular, LLM-GNN can achieve an accuracy of 74.9% on a vast-scale dataset at a cost of less than one dollar.
arXiv Detail & Related papers (2023-10-07T03:14:11Z)
- Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs [59.74814230246034]
Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities.
We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is the use of LLM-generated explanations as features, which can boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method also significantly speeds up training, achieving a 2.88x improvement over the closest baseline on ogbn-arxiv; a sketch of the feature construction follows the entry.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
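The explanations-as-features idea reduces to concatenating two text embeddings per node; `llm_explain` and the `encoder` are stand-ins for whichever LLM and LM interpreter one actually deploys:

```python
import torch

def build_features(texts, llm_explain, encoder):
    """Concatenate embeddings of the raw text and its LLM explanation."""
    feats = []
    for t in texts:
        expl = llm_explain(t)                       # LLM-generated rationale
        feats.append(torch.cat([encoder(t), encoder(expl)]))
    return torch.stack(feats)  # node-feature matrix for any standard GNN
```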