Related papers: VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models

VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models

URL: http://arxiv.org/abs/2411.14832v1
Date: Fri, 22 Nov 2024 10:10:53 GMT
Title: VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models
Authors: Camilo Chacón Sartori, Christian Blum, Filippo Bistaffa,
Abstract summary: Large Vision-Language Models (LVLMs) are increasingly capable of tackling abstract visual tasks. We introduce VisGraphVar, a customizable benchmark generator able to produce graph images for seven task categories. We show that variations in visual attributes of images (e.g., node labeling and layout) and the deliberate inclusion of visual imperfections significantly affect model performance.
Score: 1.597617022056624
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The fast advancement of Large Vision-Language Models (LVLMs) has shown immense potential. These models are increasingly capable of tackling abstract visual tasks. Geometric structures, particularly graphs with their inherent flexibility and complexity, serve as an excellent benchmark for evaluating these models' predictive capabilities. While human observers can readily identify subtle visual details and perform accurate analyses, our investigation reveals that state-of-the-art LVLMs exhibit consistent limitations in specific visual graph scenarios, especially when confronted with stylistic variations. In response to these challenges, we introduce VisGraphVar (Visual Graph Variability), a customizable benchmark generator able to produce graph images for seven distinct task categories (detection, classification, segmentation, pattern recognition, link prediction, reasoning, matching), designed to systematically evaluate the strengths and limitations of individual LVLMs. We use VisGraphVar to produce 990 graph images and evaluate six LVLMs, employing two distinct prompting strategies, namely zero-shot and chain-of-thought. The findings demonstrate that variations in visual attributes of images (e.g., node labeling and layout) and the deliberate inclusion of visual imperfections, such as overlapping nodes, significantly affect model performance. This research emphasizes the importance of a comprehensive evaluation across graph-related tasks, extending beyond reasoning alone. VisGraphVar offers valuable insights to guide the development of more reliable and robust systems capable of performing advanced visual graph analysis.

Related papers

A Comparative Study of Scanpath Models in Graph-Based Visualization [7.592272924252313]
Eye-tracking (ET) data presents challenges related to cost, privacy, and scalability. In our study, we conducted an ET experiment with 40 participants who analyzed graphs. We compared human scanpaths with synthetic ones generated by models such as DeepGaze, UMSS, and Gazeformer.
arXiv Detail & Related papers (2025-03-31T14:43:42Z)
Towards Understanding Graphical Perception in Large Multimodal Models [80.44471730672801]
We leverage the theory of graphical perception to develop an evaluation framework for analyzing gaps in LMMs' perception abilities in charts. We apply our framework to evaluate and diagnose the perception capabilities of state-of-the-art LMMs at three levels (chart, visual element, and pixel)
arXiv Detail & Related papers (2025-03-13T20:13:39Z)
Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs) This framework provides a standardized setting to evaluate GNNs across diverse datasets. We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees [50.78679002846741]
We introduce a novel approach for learning cross-task generalities in graphs. We propose task-trees as basic learning instances to align task spaces on graphs. Our findings indicate that when a graph neural network is pretrained on diverse task-trees, it acquires transferable knowledge.
arXiv Detail & Related papers (2024-12-21T02:07:43Z)
Scalable Weibull Graph Attention Autoencoder for Modeling Document Networks [50.42343781348247]
We develop a graph Poisson factor analysis (GPFA) which provides analytic conditional posteriors to improve the inference accuracy. We also extend GPFA to a multi-stochastic-layer version named graph Poisson gamma belief network (GPGBN) to capture the hierarchical document relationships at multiple semantic levels. Our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.
arXiv Detail & Related papers (2024-10-13T02:22:14Z)
How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension [53.6373473053431]
This work introduces a benchmark to assess large language models' capabilities in graph pattern tasks. We have developed a benchmark that evaluates whether LLMs can understand graph patterns based on either terminological or topological descriptions. Our benchmark encompasses both synthetic and real datasets, and a variety of models, with a total of 11 tasks and 7 models.
arXiv Detail & Related papers (2024-10-04T04:48:33Z)
Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies [7.067145619709089]
This study investigates the impact of graph visualisations on Large Language Models (LLMs) performance. Our experiments compare the effectiveness of multimodal approaches against purely textual graph representations.
arXiv Detail & Related papers (2024-09-13T14:26:58Z)
GraphInsight: Unlocking Insights in Large Language Models for Graph Structure Understanding [17.724492441325165]
Large Language Models (LLMs) struggle with comprehending graphical structure information through prompts of graph description sequences. We propose GraphInsight, a novel framework aimed at improving LLMs' comprehension of both macro- and micro-level graphical information.
arXiv Detail & Related papers (2024-09-05T05:34:16Z)
Disentangled Generative Graph Representation Learning [51.59824683232925]
This paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework. It aims to learn latent disentangled factors and utilize them to guide graph mask modeling. Experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods.
arXiv Detail & Related papers (2024-08-24T05:13:02Z)
MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining [41.19687587548107]
Graph Neural Networks (GNNs) need to be re-trained every time when applied to different graph tasks and datasets. We propose a novel framework MuseGraph, which seamlessly integrates the strengths of GNNs and Large Language Models (LLMs) Our experimental results demonstrate significant improvements in different graph tasks.
arXiv Detail & Related papers (2024-03-02T09:27:32Z)
Variational Graph Generator for Multi-View Graph Clustering [13.721803208437755]
We propose Variational Graph Generator for Multi-View Graph Clustering (VGMGC) A novel variational graph generator is proposed to infer a reliable variational consensus graph based on a priori assumption over multiple graphs. A simple yet effective graph encoder in conjunction with the multi-view clustering objective is presented to learn the desired graph embeddings for clustering.
arXiv Detail & Related papers (2022-10-13T13:19:51Z)
Towards Graph Self-Supervised Learning with Contrastive Adjusted Zooming [48.99614465020678]
We introduce a novel self-supervised graph representation learning algorithm via Graph Contrastive Adjusted Zooming. This mechanism enables G-Zoom to explore and extract self-supervision signals from a graph from multiple scales. We have conducted extensive experiments on real-world datasets, and the results demonstrate that our proposed model outperforms state-of-the-art methods consistently.
arXiv Detail & Related papers (2021-11-20T22:45:53Z)
Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation. We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data. Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z)
Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning. We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels. Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.