Related papers: IGDA: Interactive Graph Discovery through Large Language Model Agents

URL: http://arxiv.org/abs/2502.17189v2
Date: Sun, 13 Apr 2025 16:26:06 GMT
Title: IGDA: Interactive Graph Discovery through Large Language Model Agents
Authors: Alex Havrilla, David Alvarez-Melis, Nicolo Fusi
Abstract summary: Large language models ($\textbf{LLMs}$) have emerged as a powerful method for discovery. We propose $\textbf{IGDA}$, a powerful method for graph discovery complementary to existing numerically driven approaches.
Score: 6.704529554100875
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models ($\textbf{LLMs}$) have emerged as a powerful method for discovery. Instead of utilizing numerical data, LLMs utilize associated variable $\textit{semantic metadata}$ to predict variable relationships. Simultaneously, LLMs demonstrate impressive abilities to act as black-box optimizers when given an objective $f$ and sequence of trials. We study LLMs at the intersection of these two capabilities by applying LLMs to the task of $\textit{interactive graph discovery}$: given a ground truth graph $G^*$ capturing variable relationships and a budget of $I$ edge experiments over $R$ rounds, minimize the distance between the predicted graph $\hat{G}_R$ and $G^*$ at the end of the $R$-th round. To solve this task we propose $\textbf{IGDA}$, an LLM-based pipeline incorporating two key components: 1) an LLM uncertainty-driven method for edge experiment selection, and 2) a local graph update strategy utilizing binary feedback from experiments to improve predictions for unselected neighboring edges. Experiments on eight different real-world graphs show our approach often outperforms all baselines including a state-of-the-art numerical method for interactive graph discovery. Further, we conduct a rigorous series of ablations dissecting the impact of each pipeline component. Finally, to assess the impact of memorization, we apply our interactive graph discovery strategy to a complex, new (as of July 2024) causal graph on protein transcription factors, finding strong performance in a setting where memorization is impossible. Overall, our results show IGDA to be a powerful method for graph discovery complementary to existing numerically driven approaches.
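The interactive graph discovery task described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `prior` dictionary stands in for LLM-derived edge beliefs, uncertainty is assumed to be closeness of a belief to 0.5, `per_round` is a hypothetical per-round experiment budget, and the local graph update for unselected neighboring edges is omitted. All function and parameter names are illustrative.

```python
def hamming_distance(pred, truth):
    """Number of directed edges on which two edge sets disagree."""
    return len(pred ^ truth)


def interactive_discovery(truth, prior, rounds, per_round):
    """Toy loop: test the most uncertain edges, then fix them from feedback.

    prior maps each ordered pair (u, v) to an assumed probability that the
    edge u -> v exists. In the paper this role is played by an LLM; here it
    is just a hand-written dictionary (an assumption of this sketch).
    """
    belief = dict(prior)
    tested = set()
    for _ in range(rounds):
        # Rank untested edges by uncertainty, taken here to be the
        # distance of the belief from 0.5 (smaller means more uncertain).
        candidates = sorted(
            (edge for edge in belief if edge not in tested),
            key=lambda edge: abs(belief[edge] - 0.5),
        )
        for edge in candidates[:per_round]:
            # Binary feedback from the experiment: does the edge exist?
            belief[edge] = 1.0 if edge in truth else 0.0
            tested.add(edge)
    # Predicted graph: every edge currently believed more likely than not.
    return {edge for edge, p in belief.items() if p >= 0.5}


truth = {("A", "B"), ("B", "C")}
prior = {
    ("A", "B"): 0.9, ("B", "C"): 0.4, ("A", "C"): 0.6,
    ("B", "A"): 0.1, ("C", "B"): 0.2, ("C", "A"): 0.1,
}
predicted = interactive_discovery(truth, prior, rounds=1, per_round=2)
print(hamming_distance(predicted, truth))
```

In this toy run the two edges closest to 0.5 are tested first, which is enough to correct both initial mistakes. A real system would also need the neighboring-edge update and a distance measure matching the paper's evaluation.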

Related papers

Beyond Chunks and Graphs: Retrieval-Augmented Generation through Triplet-Driven Thinking [31.73448933991891]
Retrieval-augmented generation (RAG) is critical for reducing hallucinations and incorporating external knowledge into Large Language Models (LLMs). We propose T$^2$RAG, a novel framework that operates on a simple, graph-free knowledge base of atomic triplets. Empirical results show that T$^2$RAG significantly outperforms state-of-the-art multi-round and Graph RAG methods.
arXiv Detail & Related papers (2025-08-04T13:50:44Z)
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via $\mathbf{\texttt{D}}$ual-$\mathbf{\texttt{H}}$ead $\mathbf{\texttt{O}}$ptimization [49.2338910653152]
Vision-language models (VLMs) have achieved remarkable success across diverse tasks by leveraging rich textual information with minimal labeled data. Knowledge distillation (KD) offers a well-established solution to this problem; however, recent KD approaches from VLMs often involve multi-stage training or additional tuning. We propose $\mathbf{\texttt{DHO}}$ -- a simple yet effective KD framework that transfers knowledge from VLMs to compact, task-specific models in semi-supervised settings.
arXiv Detail & Related papers (2025-05-12T15:39:51Z)
What Do LLMs Need to Understand Graphs: A Survey of Parametric Representation of Graphs [69.48708136448694]
Large language models (LLMs) are gaining recognition in the AI community for their expected reasoning and inference abilities. We believe this kind of parametric representation of graphs, which we call graph laws, can be a solution for making LLMs understand graph data as input.
arXiv Detail & Related papers (2024-10-16T00:01:31Z)
Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs [12.878608250420832]
Retrieval-augmented generation (RAG) has revitalized Large Language Models (LLMs). We propose $\textit{graph of records}$ ($\textbf{GoR}$) to enhance RAG for long-context global summarization. GoR features a $\textit{graph neural network}$ and an elaborately designed $\textit{BERTScore}$-based objective for self-supervised model training.
arXiv Detail & Related papers (2024-10-14T18:34:29Z)
Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models [88.4320775961431]
We introduce ProGraph, a benchmark for large language models (LLMs) to process graphs. Our findings reveal that the performance of current LLMs is unsatisfactory, with the best model achieving only 36% accuracy. We propose LLM4Graph datasets, which include crawled documents and auto-generated codes based on 6 widely used graph libraries.
arXiv Detail & Related papers (2024-09-29T11:38:45Z)
GLBench: A Comprehensive Benchmark for Graph with Large Language Models [41.89444363336435]
We introduce GLBench, the first comprehensive benchmark for evaluating GraphLLM methods in both supervised and zero-shot scenarios. GLBench provides a fair and thorough evaluation of different categories of GraphLLM methods, along with traditional baselines such as graph neural networks.
arXiv Detail & Related papers (2024-07-10T08:20:47Z)
Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware Parameter-Efficient Fine-Tuning (GPEFT), a novel approach for efficient graph representation learning. We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt. We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
GSINA: Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention [52.67633391931959]
Graph invariant learning (GIL) has been an effective approach to discovering the invariant relationships between graph data and its labels. We propose a novel graph attention mechanism called Graph Sinkhorn Attention (GSINA). GSINA is able to obtain meaningful, differentiable invariant subgraphs with controllable sparsity and softness.
arXiv Detail & Related papers (2024-02-11T12:57:16Z)
Integrating Graphs with Large Language Models: Methods and Prospects [68.37584693537555]
Large language models (LLMs) have emerged as frontrunners, showcasing unparalleled prowess in diverse applications. Merging the capabilities of LLMs with graph-structured data has been a topic of keen interest. This paper bifurcates such integrations into two predominant categories.
arXiv Detail & Related papers (2023-10-09T07:59:34Z)
Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs [59.74814230246034]
Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities. We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT [10.879701971582502]
We aim to develop a large language model (LLM) with the reasoning ability on complex graph data. Inspired by the latest ChatGPT and Toolformer models, we propose the Graph-ToolFormer framework to teach LLMs themselves with prompts augmented by ChatGPT to use external graph reasoning API tools.
arXiv Detail & Related papers (2023-04-10T05:25:54Z)
Collaborative likelihood-ratio estimation over graphs [55.98760097296213]
We develop this idea in a concrete non-parametric method that we call Graph-based Relative Unconstrained Least-squares Importance Fitting (GRULSIF). We derive convergence rates for our collaborative approach that highlight the role played by variables such as the number of available observations per node, the size of the graph, and how accurately the graph structure encodes the similarity between tasks.
arXiv Detail & Related papers (2022-05-28T15:37:03Z)
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT). GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information. We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
Intervention Efficient Algorithms for Approximate Learning of Causal Graphs [22.401163479802094]
We study the problem of learning the causal relationships between a set of observed variables in the presence of latents. Our goal is to recover the directions of all causal or ancestral relations in $G$, via a minimum cost set of interventions. Our algorithms combine work on efficient intervention design and the design of low-cost separating set systems.
arXiv Detail & Related papers (2020-12-27T17:08:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.