Related papers: A Graph Talks, But Who's Listening? Rethinking Evaluations for Graph-Language Models

A Graph Talks, But Who's Listening? Rethinking Evaluations for Graph-Language Models

URL: http://arxiv.org/abs/2508.20583v1
Date: Thu, 28 Aug 2025 09:20:47 GMT
Title: A Graph Talks, But Who's Listening? Rethinking Evaluations for Graph-Language Models
Authors: Soham Petkar, Hari Aakash K, Anirudh Vempati, Akshit Sinha, Ponnurangam Kumarauguru, Chirag Agarwal,
Abstract summary: Developments in Graph-Language Models (GLMs) aim to integrate the structural reasoning capabilities of Graph Neural Networks (GNNs) with the semantic understanding of Large Language Models (LLMs)<n>We demonstrate that current evaluation benchmarks for GLMs are insufficient to assess multimodal reasoning.<n>We introduce the CLEGR(Compositional Language-Graph Reasoning) benchmark, designed to evaluate multimodal reasoning at various complexity levels.
Score: 11.808687414968388
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developments in Graph-Language Models (GLMs) aim to integrate the structural reasoning capabilities of Graph Neural Networks (GNNs) with the semantic understanding of Large Language Models (LLMs). However, we demonstrate that current evaluation benchmarks for GLMs, which are primarily repurposed node-level classification datasets, are insufficient to assess multimodal reasoning. Our analysis reveals that strong performance on these benchmarks is achievable using unimodal information alone, suggesting that they do not necessitate graph-language integration. To address this evaluation gap, we introduce the CLEGR(Compositional Language-Graph Reasoning) benchmark, designed to evaluate multimodal reasoning at various complexity levels. Our benchmark employs a synthetic graph generation pipeline paired with questions that require joint reasoning over structure and textual semantics. We perform a thorough evaluation of representative GLM architectures and find that soft-prompted LLM baselines perform on par with GLMs that incorporate a full GNN backbone. This result calls into question the architectural necessity of incorporating graph structure into LLMs. We further show that GLMs exhibit significant performance degradation in tasks that require structural reasoning. These findings highlight limitations in the graph reasoning capabilities of current GLMs and provide a foundation for advancing the community toward explicit multimodal reasoning involving graph structure and language.

Related papers

GILT: An LLM-Free, Tuning-Free Graph Foundational Model for In-Context Learning [50.40400074353263]
Graph Neural Networks (GNNs) are powerful tools for precessing relational data but often struggle to generalize to unseen graphs.<n>We introduce textbfGraph textbfIn-context textbfL textbfTransformer (GILT), a framework built on an LLM-free and tuning-free architecture.
arXiv Detail & Related papers (2025-10-06T08:09:15Z)
G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge [88.82814893945077]
Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge.<n>Recent graph-enhanced RAG (GraphRAG) attempts to bridge this gap by constructing tailored graphs and enabling LLMs to reason on them.<n>G-reasoner is a unified framework that integrates graph and language foundation models for reasoning over diverse graph-structured knowledge.
arXiv Detail & Related papers (2025-09-29T04:38:12Z)
Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching [61.824094419641575]
Large Language Models (LLMs) struggle with hallucinations and factual errors in knowledge-intensive scenarios like knowledge graph question answering (KGQA)<n>We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures.<n>Existing methods usually employ resource-intensive, non-scalable reasoning on vanilla KGs, but overlook this gap.<n>We propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs, bridge the semantic gap between graphs and queries.
arXiv Detail & Related papers (2025-09-25T06:48:52Z)
GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs)<n>Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, which are infused into the LLM together.<n>Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z)
Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
Scalability Matters: Overcoming Challenges in InstructGLM with Similarity-Degree-Based Sampling [1.2805157669888096]
We propose SDM-InstructGLM, a novel instruction-tuned Graph Language Model (InstructGLM) framework that enhances scalability and efficiency without relying on GNNs.<n>Our method introduces a similarity-degree-based biased random walk mechanism, which selectively samples and encodes graph information based on node-feature similarity and degree centrality.<n>Our results demonstrate the feasibility of LLM-only graph processing, enabling scalable and interpretable Graph Language Models (GLMs) optimized through instruction-based fine-tuning.
arXiv Detail & Related papers (2025-05-02T06:08:21Z)
Are Large Language Models In-Context Graph Learners? [31.172657860606297]
Large language models (LLMs) have remarkable in-context reasoning capabilities across a wide range of tasks.<n>However, they struggle to handle structured data, such as graphs, due to their lack of understanding of non-Euclidean structures.<n>We show that learning on graph data can be conceptualized as a retrieval-augmented generation (RAG) process.<n>We propose a series of RAG frameworks to enhance the in-context learning capabilities of LLMs for graph learning tasks.
arXiv Detail & Related papers (2025-02-19T09:14:19Z)
How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension [53.6373473053431]
This work introduces a benchmark to assess large language models' capabilities in graph pattern tasks.<n>We have developed a benchmark that evaluates whether LLMs can understand graph patterns based on either terminological or topological descriptions.<n>Our benchmark encompasses both synthetic and real datasets, and a variety of models, with a total of 11 tasks and 7 models.
arXiv Detail & Related papers (2024-10-04T04:48:33Z)
All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data. E-LLaGNN is a framework with an on-demand LLM service that enriches message passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
Multi-View Empowered Structural Graph Wordification for Language Models [12.22063024099311]
We introduce an end-to-end modality-aligning framework for LLM-graph alignment: Dual-Residual Vector Quantized-Variational AutoEncoder, namely Dr.E.<n>Our approach is purposefully designed to facilitate token-level alignment with LLMs, enabling an effective translation of the intrinsic'of graphs into comprehensible natural language.<n>Our framework ensures certain visual interpretability, efficiency, and robustness, marking the promising successful endeavor to achieve token-level alignment between LLMs and GNNs.
arXiv Detail & Related papers (2024-06-19T16:43:56Z)
Graph Language Models [18.75364157933661]
We introduce a novel LM type, the Graph Language Model (GLM), that integrates the strengths of both approaches and mitigates their weaknesses. We design the GLM's architecture to incorporate graph biases, thereby promoting effective knowledge distribution within the graph. Empirical evaluations on relation classification tasks show that GLM embeddings surpass both LM- and GNN-based baselines in supervised and zero-shot setting.
arXiv Detail & Related papers (2024-01-13T16:09:49Z)
Disentangled Representation Learning with Large Language Models for Text-Attributed Graphs [57.052160123387104]
We present the Disentangled Graph-Text Learner (DGTL) model, which is able to enhance the reasoning and predicting capabilities of LLMs for TAGs. Our proposed DGTL model incorporates graph structure information through tailored disentangled graph neural network (GNN) layers. Experimental evaluations demonstrate the effectiveness of the proposed DGTL model on achieving superior or comparable performance over state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-27T14:00:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.