Towards Building General Purpose Embedding Models for Industry 4.0 Agents
- URL: http://arxiv.org/abs/2506.12607v1
- Date: Sat, 14 Jun 2025 19:02:07 GMT
- Title: Towards Building General Purpose Embedding Models for Industry 4.0 Agents
- Authors: Christodoulos Constantinides, Shuxin Lin, Dhaval Patel,
- Abstract summary: We focus on improving language models' understanding for asset maintenance to guide the engineer's decisions and minimize asset downtime. Given a set of tasks expressed in natural language for Industry 4.0 domain, each associated with queries related to a specific asset, we want to recommend relevant items and generalize to queries of similar assets. Our approach begins with gathering a qualitative, expert-vetted knowledge base to construct nine asset-specific task datasets.
- Score: 5.212780106286918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we focus on improving language models' understanding for asset maintenance to guide the engineer's decisions and minimize asset downtime. Given a set of tasks expressed in natural language for Industry 4.0 domain, each associated with queries related to a specific asset, we want to recommend relevant items and generalize to queries of similar assets. A task may involve identifying relevant sensors given a query about an asset's failure mode. Our approach begins with gathering a qualitative, expert-vetted knowledge base to construct nine asset-specific task datasets. To create more contextually informed embeddings, we augment the input tasks using Large Language Models (LLMs), providing concise descriptions of the entities involved in the queries. This embedding model is then integrated with a Reasoning and Acting agent (ReAct), which serves as a powerful tool for answering complex user queries that require multi-step reasoning, planning, and knowledge inference. Through ablation studies, we demonstrate that: (a) LLM query augmentation improves the quality of embeddings, (b) Contrastive loss and other methods that avoid in-batch negatives are superior for datasets with queries related to many items, and (c) It is crucial to balance positive and negative in-batch samples. After training and testing on our dataset, we observe a substantial improvement: HIT@1 increases by +54.2%, MAP@100 by +50.1%, and NDCG@10 by +54.7%, averaged across all tasks and models. Additionally, we empirically demonstrate the model's planning and tool invocation capabilities when answering complex questions related to industrial asset maintenance, showcasing its effectiveness in supporting Subject Matter Experts (SMEs) in their day-to-day operations.
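The abstract reports gains in HIT@1, MAP@100, and NDCG@10 but does not spell out how these are computed. The following is a minimal sketch of the standard binary-relevance definitions, not the authors' evaluation code. The function names, the sensor names in the demo, and the choice to normalize average precision by min(|relevant|, k) are assumptions; other normalizations are in common use.

```python
import math
from typing import Sequence, Set


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if any relevant item appears in the top k results, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average precision over the top k results for one query.

    Assumes `ranked` has no duplicates. Normalizing by min(len(relevant), k)
    is an assumption; some libraries divide by len(relevant) instead.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / min(len(relevant), k)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical query: which sensors are relevant to a given failure mode?
    ranked_sensors = ["vibration", "temperature", "pressure", "current"]
    relevant_sensors = {"temperature", "current"}
    print("HIT@1  :", hit_at_k(ranked_sensors, relevant_sensors, 1))
    print("AP@100 :", average_precision_at_k(ranked_sensors, relevant_sensors, 100))
    print("NDCG@10:", ndcg_at_k(ranked_sensors, relevant_sensors, 10))
```

MAP@100 is the mean of the per-query average precision over all queries in a task. Because queries in this setting can have many relevant items, the normalization choice in average precision can noticeably shift the reported number.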
Related papers
- OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks [52.87238755666243]
We present OmniEAR, a framework for evaluating how language models reason about physical interactions, tool usage, and multi-agent coordination in embodied tasks. We model continuous physical properties and complex spatial relationships across 1,500 scenarios spanning household and industrial domains. Our systematic evaluation reveals severe performance degradation when models must reason from constraints.
arXiv Detail & Related papers (2025-08-07T17:54:15Z) - Leveraging Knowledge Graphs and LLM Reasoning to Identify Operational Bottlenecks for Warehouse Planning Assistance [1.2749527861829046]
Our framework integrates Knowledge Graphs (KGs) and Large Language Model (LLM)-based agents. It transforms raw DES data into a semantically rich KG, capturing relationships between simulation events and entities. An LLM-based agent uses iterative reasoning, generating interdependent sub-questions. For each sub-question, it creates Cypher queries for KG interaction, extracts information, and self-reflects to correct errors.
arXiv Detail & Related papers (2025-07-23T07:18:55Z) - Enhancing Talent Employment Insights Through Feature Extraction with LLM Finetuning [0.0]
We develop a robust pipeline to identify variables such as remote work availability, remuneration structures, educational requirements, and work experience preferences. Our methodology combines semantic chunking, retrieval-augmented generation (RAG), and fine-tuning DistilBERT models to overcome the limitations of traditional parsing tools. We present a comprehensive evaluation of our fine-tuned models and analyze their strengths, limitations, and potential for scaling.
arXiv Detail & Related papers (2025-01-13T19:49:49Z) - MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale [66.73529246309033]
Multimodal large language models (MLLMs) have shown significant potential in a broad range of multimodal tasks. Existing instruction-tuning datasets only provide phrase-level answers without any intermediate rationales. We introduce a scalable and cost-effective method to construct a large-scale multimodal instruction-tuning dataset with rich intermediate rationales.
arXiv Detail & Related papers (2024-12-06T18:14:24Z) - Leverage Task Context for Object Affordance Ranking [57.59106517732223]
We build the first large-scale task-oriented affordance ranking dataset with 25 common tasks, over 50k images and more than 661k objects.
Results demonstrate the feasibility of the task context based affordance learning paradigm and the superiority of our model over state-of-the-art models in the fields of saliency ranking and multimodal object detection.
arXiv Detail & Related papers (2024-11-25T04:22:33Z) - AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capabilities in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
Here, we introduce AvaTaR, a novel and automated framework that optimizes an LLM agent to effectively leverage provided tools, improving performance on a given task.
arXiv Detail & Related papers (2024-06-17T04:20:02Z) - TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools [51.576974932743596]
Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts.
TACT contains challenging instructions that demand stitching information scattered across one or more texts.
We construct this dataset by leveraging an existing dataset of texts and their associated tables.
We demonstrate that all contemporary LLMs perform poorly on this dataset, achieving an accuracy below 38%.
arXiv Detail & Related papers (2024-06-05T20:32:56Z) - Unified machine learning tasks and datasets for enhancing renewable energy [0.8356833388425764]
We introduce the ETT-17 (Energy Transition Tasks-17), a collection of 17 datasets related to enhancing renewable energy.
We unify all tasks and datasets, such that they can be solved using a single multi-tasking ML model.
arXiv Detail & Related papers (2023-11-12T15:30:44Z) - Evaluating the Capabilities of Multi-modal Reasoning Models with Synthetic Task Data [0.0]
We leverage advances in high resolution text-to-image generation to develop a framework for generating evaluation data for multi-modal reasoning tasks.
We apply this framework to generate context-dependent anomaly data, creating a synthetic dataset on a challenging task.
We demonstrate that while the task is tractable, the model performs significantly worse on the context-dependent anomaly detection task than on standard VQA tasks.
arXiv Detail & Related papers (2023-06-01T20:56:34Z) - LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities [66.36633042421387]
Large Language Models (LLMs) are evaluated for Knowledge Graph (KG) construction and reasoning. We propose AutoKG, a multi-agent-based approach employing LLMs and external sources for KG construction and reasoning.
arXiv Detail & Related papers (2023-05-22T15:56:44Z) - Active Feature Acquisition with Generative Surrogate Models [11.655069211977464]
In this work, we consider models that perform active feature acquisition (AFA) and query the environment for unobserved features.
Our work reformulates the Markov decision process (MDP) that underlies the AFA problem as a generative modeling task.
We propose learning a generative surrogate model (GSM) that captures the dependencies among input features to assess potential information gain from acquisitions.
arXiv Detail & Related papers (2020-10-06T02:10:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.