Enginuity: Building an Open Multi-Domain Dataset of Complex Engineering Diagrams
- URL: http://arxiv.org/abs/2601.13299v1
- Date: Mon, 19 Jan 2026 18:54:50 GMT
- Title: Enginuity: Building an Open Multi-Domain Dataset of Complex Engineering Diagrams
- Authors: Ethan Seefried, Prahitha Movva, Naga Harshita Marupaka, Tilak Kasturi, Tirthankar Ghosal,
- Abstract summary: We propose Enginuity - the first open, large-scale, multi-domain engineering diagram dataset with comprehensive structural annotations designed for automated diagram parsing.<n>By capturing hierarchical component relationships, connections, and semantic elements across diverse engineering domains, our proposed dataset would enable multimodal large language models to address critical downstream tasks including structured diagram parsing, cross-modal information retrieval, and AI-assisted engineering simulation.
- Score: 2.8809775760443657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Enginuity - the first open, large-scale, multi-domain engineering diagram dataset with comprehensive structural annotations designed for automated diagram parsing. By capturing hierarchical component relationships, connections, and semantic elements across diverse engineering domains, our proposed dataset would enable multimodal large language models to address critical downstream tasks including structured diagram parsing, cross-modal information retrieval, and AI-assisted engineering simulation. Enginuity would be transformative for AI for Scientific Discovery by enabling artificial intelligence systems to comprehend and manipulate the visual-structural knowledge embedded in engineering diagrams, breaking down a fundamental barrier that currently prevents AI from fully participating in scientific workflows where diagram interpretation, technical drawing analysis, and visual reasoning are essential for hypothesis generation, experimental design, and discovery.
Related papers
- Towards Agentic Intelligence for Materials Science [73.4576385477731]
This survey advances a unique pipeline-centric view that spans from corpus curation and pretraining to goal-conditioned agents interfacing with simulation and experimental platforms.<n>To bridge communities and establish a shared frame of reference, we first present an integrated lens that aligns terminology, evaluation, and workflow stages across AI and materials science.
arXiv Detail & Related papers (2026-01-29T23:48:43Z) - Artificial Intelligence in Materials Science and Engineering: Current Landscape, Key Challenges, and Future Trajectorie [0.28279056210896714]
AI is becoming an essential competency for materials researchers.<n>We survey the spectrum of machine learning approaches, including CNNs, GNNs, and Transformers, alongside emerging generative AI and probabilistic models.<n>The review also examines the pivotal role of data in this field, emphasizing how effective representation and featurization strategies underpin the performance of machine learning models.
arXiv Detail & Related papers (2026-01-18T19:36:10Z) - Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development.<n> benchmarks like SWE-bench revealed this task as profoundly difficult for large language models.<n>This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z) - Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems [75.78934957242403]
Self-driving vehicles and drones require true Spatial Intelligence from multi-modal onboard sensor data.<n>This paper presents a framework for multi-modal pre-training, identifying the core set of techniques driving progress toward this goal.
arXiv Detail & Related papers (2025-12-30T17:58:01Z) - AI4X Roadmap: Artificial Intelligence for the advancement of scientific pursuit and its future directions [65.44445343399126]
We look at AI-enabled science across biology, chemistry, climate science, mathematics, materials science, physics, self-driving laboratories and unconventional computing.<n>Several shared themes emerge: the need for diverse and trustworthy data, transferable electronic-structure and interatomic models, AI systems integrated into end-to-end scientific synthesis.<n>Across domains, we highlight how large foundation models, active learning and self-driving laboratories can close loops between prediction and validation.
arXiv Detail & Related papers (2025-11-26T02:10:28Z) - Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models [10.1080193179562]
Current understanding models excel at recognizing "what" but fall short in high-level cognitive tasks like causal reasoning and future prediction.<n>We propose a novel framework that fuses a powerful Vision Foundation Model for deep visual perception with a Large Language Model (LLM) serving as a knowledge-driven reasoning core.
arXiv Detail & Related papers (2025-07-08T09:43:17Z) - Spatial Understanding from Videos: Structured Prompts Meet Simulation Data [89.77871049500546]
We present a unified framework for enhancing 3D spatial reasoning in pre-trained vision-language models without modifying their architecture.<n>This framework combines SpatialMind, a structured prompting strategy that decomposes complex scenes and questions into interpretable reasoning steps, with ScanForgeQA, a scalable question-answering dataset built from diverse 3D simulation scenes.
arXiv Detail & Related papers (2025-06-04T07:36:33Z) - Toward Knowledge-Guided AI for Inverse Design in Manufacturing: A Perspective on Domain, Physics, and Human-AI Synergy [0.8399688944263844]
AI is reshaping inverse design in manufacturing, enabling high-performance discovery in materials, products, and processes.<n>However, purely data-driven approaches often struggle in realistic manufacturing settings characterized by sparse data, high-dimensional design spaces, and complex constraints.<n>This perspective proposes an integrated framework built on three complementary pillars: domain knowledge to establish physically meaningful objectives and constraints while removing variables with limited relevance, physics-informed machine learning to enhance generalization under limited or biased data, and large language model-based interfaces to support intuitive, human-centered interaction.
arXiv Detail & Related papers (2025-05-29T08:15:27Z) - Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications [17.571124565519263]
In-depth exploration of object detection and semantic segmentation is provided.<n>State-of-the-art advancements in machine learning and deep learning are reviewed.<n>Analysis of big data processing is presented.
arXiv Detail & Related papers (2024-10-21T02:10:49Z) - Customized Information and Domain-centric Knowledge Graph Construction with Large Language Models [0.0]
We propose a novel approach based on knowledge graphs to provide timely access to structured information.
Our framework encompasses a text mining process, which includes information retrieval, keyphrase extraction, semantic network creation, and topic map visualization.
We apply our methodology to the domain of automotive electrical systems to demonstrate the approach, which is scalable.
arXiv Detail & Related papers (2024-09-30T07:08:28Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - Understanding Generative AI Content with Embedding Models [4.662332573448995]
We show that deep neural networks (DNNs) implicitly engineer features by transforming their input data into hidden feature vectors called embeddings.<n>We find empirical evidence that there is intrinsic separability between real samples and those generated by artificial intelligence (AI)
arXiv Detail & Related papers (2024-08-19T22:07:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.