Can Language Models Solve Graph Problems in Natural Language?
- URL: http://arxiv.org/abs/2305.10037v3
- Date: Sat, 6 Jan 2024 01:01:38 GMT
- Title: Can Language Models Solve Graph Problems in Natural Language?
- Authors: Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han,
Yulia Tsvetkov
- Abstract summary: Large language models (LLMs) are increasingly adopted for a variety of tasks with implicit graphical structures.
We propose NLGraph, a benchmark of graph-based problem solving designed in natural language.
- Score: 51.28850846990929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are increasingly adopted for a variety of tasks
with implicit graphical structures, such as planning in robotics, multi-hop
question answering or knowledge probing, structured commonsense reasoning, and
more. While LLMs have advanced the state-of-the-art on these tasks with
structure implications, whether LLMs could explicitly process textual
descriptions of graphs and structures, map them to grounded conceptual spaces,
and perform structured operations remains underexplored. To this end, we
propose NLGraph (Natural Language Graph), a comprehensive benchmark of
graph-based problem solving designed in natural language. NLGraph contains
29,370 problems, covering eight graph reasoning tasks with varying complexity
from simple tasks such as connectivity and shortest path up to complex problems
such as maximum flow and simulating graph neural networks. We evaluate LLMs
(GPT-3/4) with various prompting approaches on the NLGraph benchmark and find
that 1) language models do demonstrate preliminary graph reasoning abilities,
2) the benefit of advanced prompting and in-context learning diminishes on more
complex graph problems, while 3) LLMs are also (un)surprisingly brittle in the
face of spurious correlations in graph and problem settings. We then propose
Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based
approaches to enhance LLMs in solving natural language graph problems.
Build-a-Graph and Algorithmic prompting improve the performance of LLMs on
NLGraph by 3.07% to 16.85% across multiple tasks and settings, while how to
solve the most complicated graph reasoning tasks in our setup with language
models remains an open research question. The NLGraph benchmark and evaluation
code are available at https://github.com/Arthur-Heng/NLGraph.
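For intuition, here is a minimal sketch of what an NLGraph-style connectivity question paired with a Build-a-Graph-style instruction might look like. The prompt wording, instruction placement, and graph encoding are illustrative assumptions, not the benchmark's actual format; the released data and evaluation code are in the repository above.

```python
# A minimal, hypothetical sketch of an NLGraph-style connectivity question
# with a Build-a-Graph-style instruction. The prompt wording, instruction
# placement, and graph encoding below are illustrative assumptions, not the
# benchmark's actual format.
import random


def make_connectivity_prompt(num_nodes: int = 6, edge_prob: float = 0.4,
                             seed: int = 0) -> str:
    """Sample a random undirected graph and phrase a yes/no connectivity
    query about it in natural language."""
    rng = random.Random(seed)
    edges = [(u, v)
             for u in range(num_nodes)
             for v in range(u + 1, num_nodes)
             if rng.random() < edge_prob]
    edge_text = ", ".join(f"node {u} and node {v}" for u, v in edges)
    src, dst = rng.sample(range(num_nodes), 2)

    question = (f"In an undirected graph, there is an edge between {edge_text}. "
                f"Is there a path between node {src} and node {dst}?")
    # Build-a-Graph-style instruction: ask the model to lay out the graph
    # structure before answering (exact wording here is our assumption).
    build_a_graph = ("Let's first construct the graph from the edge list, "
                     "then answer the question.")
    return question + "\n" + build_a_graph


if __name__ == "__main__":
    print(make_connectivity_prompt())
```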
Related papers
- Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models [90.98855064914379]
We introduce ProGraph, a benchmark for large language models (LLMs) to process graphs.
Our findings reveal that the performance of current LLMs is unsatisfactory, with the best model achieving only 36% accuracy.
We propose LLM4Graph datasets, which include crawled documents and auto-generated codes based on 6 widely used graph libraries.
arXiv Detail & Related papers (2024-09-29T11:38:45Z)
- Graph Reasoning with Large Language Models via Pseudo-code Prompting [25.469214467011362]
This paper investigates whether prompting via pseudo-code instructions can improve the performance of large language models (LLMs) in solving graph problems.
Our experiments demonstrate that using pseudo-code instructions generally improves the performance of all considered LLMs.
arXiv Detail & Related papers (2024-09-26T14:52:40Z)
- CodeGraph: Enhancing Graph Reasoning of LLMs with Code [43.79249671845997]
We introduce CodeGraph, a method that encodes graph problem solutions as code.
We show that CodeGraph can boost the performance of large language models on graph reasoning tasks by 1.3% to 58.6% (a hedged sketch of this code-as-solution idea appears after this list).
arXiv Detail & Related papers (2024-08-25T15:27:21Z) - Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path [53.71787069694794]
We focus on the graph reasoning ability of Large Language Models (LLMs)
We revisit the ability of LLMs on three fundamental graph tasks: graph description translation, graph connectivity, and the shortest-path problem.
Our findings suggest that LLMs can fail to understand graph structures through text descriptions and exhibit varying performance for all these fundamental tasks.
arXiv Detail & Related papers (2024-08-18T16:26:39Z) - GraphArena: Benchmarking Large Language Models on Graph Computational Problems [25.72820021030033]
"arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to examine their progresses.
We introduce GraphArena, a benchmarking tool to evaluate models on graph computational problems using million-scale real-world graphs.
arXiv Detail & Related papers (2024-06-29T09:19:23Z) - Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs)
In this paper, we explore graph learning-based methods for task planning, a direction that is to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z) - Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [60.71360240206726]
Large language models (LLMs) suffer from hallucinations, especially on knowledge-intensive tasks.
Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora.
We propose a framework called Graph Chain-of-thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively.
arXiv Detail & Related papers (2024-04-10T15:41:53Z) - G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [61.93058781222079]
We develop a flexible question-answering framework targeting real-world textual graphs.
We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs.
G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
arXiv Detail & Related papers (2024-02-12T13:13:04Z)
- Talk like a Graph: Encoding Graphs for Large Language Models [15.652881653332194]
We present the first comprehensive study of encoding graph-structured data as text for consumption by large language models (LLMs).
We show that LLM performance on graph reasoning tasks varies on three fundamental levels: (1) the graph encoding method, (2) the nature of the graph task itself, and (3) interestingly, the very structure of the graph considered.
arXiv Detail & Related papers (2023-10-06T19:55:21Z)
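The CodeGraph entry above describes encoding graph problem solutions as executable code. Below is a minimal, hypothetical sketch of that idea, not the paper's actual pipeline: the model would emit a small program like this instead of free-text reasoning, and the final answer comes from running it. The graph, endpoints, and helper name are invented for illustration.

```python
# Hypothetical illustration of the code-as-solution idea: the graph,
# endpoints, and helper name are invented for this sketch and are not
# taken from the CodeGraph paper.
from collections import deque


def shortest_path_length(edges, src, dst):
    """Breadth-first search over an unweighted, undirected edge list;
    returns the hop count from src to dst, or None if unreachable."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


# The edge list would be parsed from the natural-language problem statement.
print(shortest_path_length([(0, 1), (1, 2), (2, 3), (0, 4)], 0, 3))  # -> 3
```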