Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction
- URL: http://arxiv.org/abs/2601.05266v1
- Date: Sat, 08 Nov 2025 14:43:20 GMT
- Title: Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction
- Authors: Muzakkiruddin Ahmed Mohammed, John R. Talburt, Leon Claassens, Adriaan Marais
- Abstract summary: This paper introduces a retrieval-augmented multi-LLM ensemble framework that orchestrates nine state-of-the-art Large Language Models (LLMs). RAGsemble addresses key limitations of single-model systems by combining the complementary strengths of model families including Gemini (2.0, 2.5, 1.5), OpenAI (GPT-4o, o4-mini), Mistral Large, and Gemma (1B, 4B, 3n-e4b), while grounding outputs in factual data using FAISS-based semantic retrieval.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Industrial part specification extraction from unstructured text remains a persistent challenge in manufacturing, procurement, and maintenance, where manual processing is both time-consuming and error-prone. This paper introduces a retrieval-augmented multi-LLM ensemble framework that orchestrates nine state-of-the-art Large Language Models (LLMs) within a structured three-phase pipeline. RAGsemble addresses key limitations of single-model systems by combining the complementary strengths of model families including Gemini (2.0, 2.5, 1.5), OpenAI (GPT-4o, o4-mini), Mistral Large, and Gemma (1B, 4B, 3n-e4b), while grounding outputs in factual data using FAISS-based semantic retrieval. The system architecture consists of three stages: (1) parallel extraction by diverse LLMs, (2) targeted research augmentation leveraging high-performing models, and (3) intelligent synthesis with conflict resolution and confidence-aware scoring. RAG integration provides real-time access to structured part databases, enabling the system to validate, refine, and enrich outputs through similarity-based reference retrieval. Experimental results using real industrial datasets demonstrate significant gains in extraction accuracy, technical completeness, and structured output quality compared to leading single-LLM baselines. Key contributions include a scalable ensemble architecture for industrial domains, seamless RAG integration throughout the pipeline, comprehensive quality assessment mechanisms, and a production-ready solution suitable for deployment in knowledge-intensive manufacturing environments.
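The three stages described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the stub extractors `model_a`, `model_b`, and `model_c` stand in for real LLM calls, the in-memory `REFERENCE_PARTS` list with brute-force cosine similarity stands in for a FAISS index over a part database, and the confidence-weighted vote with a `ref_bonus` is only one plausible way to resolve conflicts.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

# Hypothetical reference database: (embedding, structured record) pairs.
# A real system would store embeddings in a vector index such as FAISS.
REFERENCE_PARTS = [
    ([1.0, 0.0, 0.2], {"part": "hex bolt", "thread": "M8"}),
    ([0.0, 1.0, 0.1], {"part": "ball bearing", "bore_mm": "25"}),
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, k=1):
    """Return the k reference records most similar to the query embedding."""
    ranked = sorted(REFERENCE_PARTS, key=lambda p: cosine(query_vec, p[0]), reverse=True)
    return [record for _, record in ranked[:k]]

# Stub extractors (assumed): each returns {field: (value, confidence)}.
def model_a(text):
    return {"thread": ("M8", 0.9), "material": ("steel", 0.6)}

def model_b(text):
    return {"thread": ("M8", 0.7), "material": ("brass", 0.5)}

def model_c(text):
    return {"thread": ("M10", 0.4), "material": ("steel", 0.7)}

def extract_parallel(text, models):
    """Stage 1: run every extractor on the same text concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda model: model(text), models))

def synthesize(extractions, references, ref_bonus=0.5):
    """Stage 3: confidence-weighted vote per field; values that agree
    with a retrieved reference record receive an extra bonus."""
    scores = defaultdict(lambda: defaultdict(float))
    for extraction in extractions:
        for field, (value, conf) in extraction.items():
            scores[field][value] += conf
    for record in references:
        for field, value in record.items():
            if field in scores and value in scores[field]:
                scores[field][value] += ref_bonus
    result = {}
    for field, candidates in scores.items():
        value, score = max(candidates.items(), key=lambda kv: kv[1])
        # Normalized share of the total score serves as a rough confidence.
        result[field] = (value, round(score / sum(candidates.values()), 2))
    return result

def run_pipeline(text, query_vec):
    extractions = extract_parallel(text, [model_a, model_b, model_c])
    references = retrieve(query_vec, k=1)  # Stage 2 stand-in: reference lookup
    return synthesize(extractions, references)

if __name__ == "__main__":
    print(run_pipeline("M8 hex bolt, steel", [0.9, 0.1, 0.2]))
```

In this sketch the disagreement on `thread` (two models say M8, one says M10) is settled by summed confidence plus agreement with the retrieved record, which mirrors the idea of grounding ensemble output in reference data without claiming to reproduce the paper's actual scoring.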
Related papers
- Layout-Aware Parsing Meets Efficient LLMs: A Unified, Scalable Framework for Resume Information Extraction and Evaluation [31.356673356827432]
We present a layout-aware and efficiency-optimized framework for automated extraction and evaluation. Our system is fully deployed in Alibaba's intelligent HR platform, supporting real-time applications across its business units.
arXiv Detail & Related papers (2025-10-10T07:01:35Z)
- Automatic Building Code Review: A Case Study [6.530899637501737]
Building officials face labor-intensive, error-prone, and costly manual reviews of design documents as projects increase in size and complexity. This study introduces a novel agent-driven framework that integrates BIM-based data extraction with automated verification.
arXiv Detail & Related papers (2025-10-03T00:30:14Z)
- A Tale of Two Experts: Cooperative Learning for Source-Free Unsupervised Domain Adaptation [59.88864205383671]
Source-Free Unsupervised Domain Adaptation (SFUDA) addresses the realistic challenge of adapting a source-trained model to a target domain without access to the source data. Existing SFUDA methods either exploit only the source model's predictions or fine-tune large multimodal models. We propose Experts Cooperative Learning (EXCL) to exploit complementary insights and the latent structure of target data.
arXiv Detail & Related papers (2025-09-26T11:39:50Z)
- OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System [61.12400636463362]
OnePiece is a unified framework that seamlessly integrates LLM-style context engineering and reasoning into both retrieval and ranking models. OnePiece has been deployed in the main personalized search scenario of Shopee and achieves consistent online gains across different key business metrics.
arXiv Detail & Related papers (2025-09-22T17:59:07Z)
- Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers [103.4410890572479]
We introduce the Loong Project: an open-source framework for scalable synthetic data generation and verification. LoongBench is a curated seed dataset containing 8,729 human-vetted examples across 12 domains. LoongEnv is a modular synthetic data generation environment that supports multiple prompting strategies to produce new question-answer-code triples.
arXiv Detail & Related papers (2025-09-03T06:42:40Z)
- VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use [78.29315418819074]
We introduce VerlTool, a unified and modular framework that addresses limitations through systematic design principles. Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms. The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions.
arXiv Detail & Related papers (2025-09-01T01:45:18Z)
- Leveraging Generative Models for Real-Time Query-Driven Text Summarization in Large-Scale Web Search [54.987957691350665]
Query-Driven Text Summarization (QDTS) aims to generate concise and informative summaries from textual documents based on a given query. Traditional extractive summarization models, based primarily on ranking candidate summary segments, have been the dominant approach in industrial applications. We propose a novel framework to pioneer the application of generative models to address real-time QDTS in industrial web search.
arXiv Detail & Related papers (2025-08-28T08:51:51Z)
- Patchwork: A Unified Framework for RAG Serving [6.430565435912026]
Retrieval Augmented Generation (RAG) has emerged as a new paradigm for enhancing Large Language Model reliability through integration with external knowledge sources. We introduce Patchwork, a comprehensive end-to-end RAG serving framework designed to address these efficiency bottlenecks.
arXiv Detail & Related papers (2025-05-01T18:58:26Z)
- HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation [11.53083922927901]
HM-RAG is a novel Hierarchical Multi-agent Multimodal RAG framework. It pioneers collaborative intelligence for dynamic knowledge synthesis across structured, unstructured, and graph-based data.
arXiv Detail & Related papers (2025-04-13T06:55:33Z)
- CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models [9.419063976761175]
We propose a novel framework leveraging LLMs to optimize construction schedules in complex projects like semiconductor fabrication. ConSTRUCTA addresses key challenges by: (1) integrating construction-specific knowledge through static RAG; (2) employing context-sampling techniques inspired by architectural expertise to provide relevant input; and (3) deploying Construction DPO to align schedules with expert preferences.
arXiv Detail & Related papers (2025-02-17T17:35:42Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations. The model takes two bimodal pairs as input due to a known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.