Beyond Pixels: Vector-to-Graph Transformation for Reliable Schematic Auditing
- URL: http://arxiv.org/abs/2602.11678v1
- Date: Thu, 12 Feb 2026 07:50:49 GMT
- Title: Beyond Pixels: Vector-to-Graph Transformation for Reliable Schematic Auditing
- Authors: Chengwei Ma, Zhen Tian, Zhou Zhou, Zhixian Xu, Xiaowei Zhu, Xia Hua, Si Shi, F. Richard Yu
- Abstract summary: We propose a Vector-to-Graph (V2G) pipeline that converts CAD diagrams into property graphs where nodes represent components and edges encode connectivity. On a diagnostic benchmark of electrical compliance checks, V2G yields large accuracy gains across all error categories, while leading MLLMs remain near chance level.
- Score: 34.54168175788343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal Large Language Models (MLLMs) have shown remarkable progress in visual understanding, yet they suffer from a critical limitation: structural blindness. Even state-of-the-art models fail to capture topology and symbolic logic in engineering schematics, as their pixel-driven paradigm discards the explicit vector-defined relations needed for reasoning. To overcome this, we propose a Vector-to-Graph (V2G) pipeline that converts CAD diagrams into property graphs where nodes represent components and edges encode connectivity, making structural dependencies explicit and machine-auditable. On a diagnostic benchmark of electrical compliance checks, V2G yields large accuracy gains across all error categories, while leading MLLMs remain near chance level. These results highlight the systemic inadequacy of pixel-based methods and demonstrate that structure-aware representations provide a reliable path toward practical deployment of multimodal AI in engineering domains. To facilitate further research, we release our benchmark and implementation at https://github.com/gm-embodied/V2G-Audit.
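To make the property-graph idea in the abstract concrete, here is a minimal sketch of how a schematic could be stored as nodes (components) and edges (connectivity) and then audited by a rule. It is an illustration only, not the V2G implementation: the `PropertyGraph` class, the `kind` property, the component ids, and the "every load must connect to a breaker" rule are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """Hypothetical minimal property graph: nodes carry properties, edges are undirected."""
    nodes: dict = field(default_factory=dict)  # node id -> property dict
    edges: dict = field(default_factory=dict)  # node id -> set of neighbor ids

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.edges.setdefault(node_id, set())

    def add_edge(self, a, b):
        # Connectivity is stored symmetrically because wires have no direction here.
        self.edges[a].add(b)
        self.edges[b].add(a)


def unprotected_loads(graph):
    """Assumed audit rule: a load must be directly connected to at least one breaker."""
    violations = []
    for node_id, props in graph.nodes.items():
        if props.get("kind") != "load":
            continue
        neighbor_kinds = {graph.nodes[n]["kind"] for n in graph.edges[node_id]}
        if "breaker" not in neighbor_kinds:
            violations.append(node_id)
    return sorted(violations)


def build_example():
    """Toy schematic: M1 is fed through breaker QF1, M2 is wired straight to the source."""
    g = PropertyGraph()
    g.add_node("SRC", kind="source")
    g.add_node("QF1", kind="breaker")
    g.add_node("M1", kind="load")
    g.add_node("M2", kind="load")
    g.add_edge("SRC", "QF1")
    g.add_edge("QF1", "M1")
    g.add_edge("SRC", "M2")
    return g


if __name__ == "__main__":
    print(unprotected_loads(build_example()))
```

Because connectivity is explicit in the graph, the check reduces to a neighbor lookup rather than an inference from rendered pixels, which is the kind of machine-auditable structure the abstract argues for.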
Related papers
- VecFormer: Towards Efficient and Generalizable Graph Transformer with Graph Token Attention [61.96837866507746]
VecFormer is an efficient and highly generalizable model for node classification. VecFormer outperforms the existing Graph Transformer in both performance and speed.
arXiv Detail & Related papers (2026-02-23T09:10:39Z)
- Plain Transformers are Surprisingly Powerful Link Predictors [57.01966734467712]
Link prediction is a core challenge in graph machine learning, demanding models that capture rich and complex topological dependencies. While Graph Neural Networks (GNNs) are the standard solution, state-of-the-art pipelines often rely on explicit structurals or memory-intensive node embeddings. We present PENCIL, an encoder-only plain Transformer that replaces hand-crafted priors with attention over sampled local subgraphs.
arXiv Detail & Related papers (2026-02-02T02:45:52Z)
- Integrating Artificial Intelligence and Mixed Integer Linear Programming: Explainable Graph-Based Instance Space Analysis in Air Transportation [0.0]
This paper analyzes the integration of artificial intelligence (AI) with mixed integer linear programming (MILP) to address complex optimization challenges in air transportation with explainability. The study aims to validate the use of Graph Neural Networks (GNNs) for extracting structural feature embeddings from MILP instances.
arXiv Detail & Related papers (2025-12-01T14:03:29Z)
- FVDebug: An LLM-Driven Debugging Assistant for Automated Root Cause Analysis of Formal Verification Failures [8.530369312832084]
We present FVDebug, an intelligent system that transforms failure traces into actionable insights. Our approach features a novel pipeline: (1) Causal Graph Synthesis that structures failure traces into directed acyclic graphs, (2) Graph Scanner using batched Large Language Model (LLM) analysis with for-and-against prompting to identify suspicious nodes, and (3) Insight Rover leveraging agentic narrative exploration to generate high-level causal explanations.
arXiv Detail & Related papers (2025-09-16T20:22:10Z)
- Explainable Attention-Guided Stacked Graph Neural Networks for Malware Detection [2.6436521007616114]
We propose a novel stacking ensemble framework for graph-based malware detection and explanation. Our framework improves classification performance while providing insightful interpretations of malware behavior.
arXiv Detail & Related papers (2025-08-13T13:33:02Z)
- Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains. Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering. We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z)
- Scalable Graph Generative Modeling via Substructure Sequences [50.32639806800683]
We introduce Generative Graph Pattern Machine (G$^2$PM), a generative Transformer pre-training framework for graphs. G$^2$PM represents graph instances (nodes, edges, or entire graphs) as sequences of substructures. It employs generative pre-training over the sequences to learn generalizable and transferable representations.
arXiv Detail & Related papers (2025-05-22T02:16:34Z)
- Efficient Environmental Claim Detection with Hyperbolic Graph Neural Networks [1.7259898169307608]
We explore Graph Neural Networks (GNNs) and Hyperbolic Graph Neural Networks (HGNNs) as lightweight yet effective alternatives to transformer-based models. Our results show that our graph-based models, particularly HGNNs in the Poincaré space (P-HGNNs), achieve performance superior to the state-of-the-art on environmental claim detection.
arXiv Detail & Related papers (2025-02-19T11:04:59Z)
- LASE: Learned Adjacency Spectral Embeddings [9.227991604045416]
We learn nodal Adjacency Spectral Embeddings (ASE) from graph inputs. LASE is interpretable, parameter efficient, and robust to inputs with unobserved edges. LASE layers combine Graph Convolutional Network (GCN) and fully-connected Graph Attention Network (GAT) modules.
arXiv Detail & Related papers (2024-12-23T17:35:19Z)
- Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method that jointly combines generative and discriminative training to learn graph structure and representation. We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR. ECL-GSR achieves faster training with fewer samples and less memory than the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z)
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)