Related papers: Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction

Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction

URL: http://arxiv.org/abs/2601.05266v1
Date: Sat, 08 Nov 2025 14:43:20 GMT
Title: Retrieval-Augmented Multi-LLM Ensemble for Industrial Part Specification Extraction
Authors: Muzakkiruddin Ahmed Mohammed, John R. Talburt, Leon Claasssens, Adriaan Marais,
Abstract summary: This paper introduces a retrieval-augmented multi-LLM ensemble framework that orchestrates nine state-of-the-art Large Language Models (LLMs)<n>RAGsemble addresses key limitations of single-model systems by combining the complementary strengths of model families including Gemini (2.0, 2.5, 1.5), OpenAI (GPT-4o, o4-mini), Mistral Large, and Gemma (1B, 4B, 3n-e4b), while grounding outputs in factual data using FAISS-based semantic retrieval.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Industrial part specification extraction from unstructured text remains a persistent challenge in manufacturing, procurement, and maintenance, where manual processing is both time-consuming and error-prone. This paper introduces a retrieval-augmented multi-LLM ensemble framework that orchestrates nine state-of-the-art Large Language Models (LLMs) within a structured three-phase pipeline. RAGsemble addresses key limitations of single-model systems by combining the complementary strengths of model families including Gemini (2.0, 2.5, 1.5), OpenAI (GPT-4o, o4-mini), Mistral Large, and Gemma (1B, 4B, 3n-e4b), while grounding outputs in factual data using FAISS-based semantic retrieval. The system architecture consists of three stages: (1) parallel extraction by diverse LLMs, (2) targeted research augmentation leveraging high-performing models, and (3) intelligent synthesis with conflict resolution and confidence-aware scoring. RAG integration provides real-time access to structured part databases, enabling the system to validate, refine, and enrich outputs through similarity-based reference retrieval. Experimental results using real industrial datasets demonstrate significant gains in extraction accuracy, technical completeness, and structured output quality compared to leading single-LLM baselines. Key contributions include a scalable ensemble architecture for industrial domains, seamless RAG integration throughout the pipeline, comprehensive quality assessment mechanisms, and a production-ready solution suitable for deployment in knowledge-intensive manufacturing environments.

Related papers

Layout-Aware Parsing Meets Efficient LLMs: A Unified, Scalable Framework for Resume Information Extraction and Evaluation [31.356673356827432]
We present a layout-aware and efficiency-optimized framework for automated extraction and evaluation.<n>Our system is fully deployed in Alibaba's intelligent HR platform, supporting real-time applications across its business units.
arXiv Detail & Related papers (2025-10-10T07:01:35Z)
Automatic Building Code Review: A Case Study [6.530899637501737]
Building officials face labor-intensive, error-prone, and costly manual reviews of design documents as projects increase in size and complexity.<n>This study introduces a novel agent-driven framework that integrates BIM-based data extraction with automated verification.
arXiv Detail & Related papers (2025-10-03T00:30:14Z)
A Tale of Two Experts: Cooperative Learning for Source-Free Unsupervised Domain Adaptation [59.88864205383671]
Source-Free Unsupervised Domain Adaptation (SFUDA) addresses the realistic challenge of adapting a source-trained model to a target domain without access to the source data.<n>Existing SFUDA methods either exploit only the source model's predictions or fine-tune large multimodal models.<n>We propose the Experts Cooperative Learning (EXCL) to exploit complementary insights and the latent structure of target data.
arXiv Detail & Related papers (2025-09-26T11:39:50Z)
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System [61.12400636463362]
OnePiece is a unified framework that seamlessly integrates LLM-style context engineering and reasoning into both retrieval and ranking models.<n>OnePiece has been deployed in the main personalized search scenario of Shopee and achieves consistent online gains across different key business metrics.
arXiv Detail & Related papers (2025-09-22T17:59:07Z)
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers [103.4410890572479]
We introduce the Loong Project: an open-source framework for scalable synthetic data generation and verification.<n>LoongBench is a curated seed dataset containing 8,729 human-vetted examples across 12 domains.<n>LoongEnv is a modular synthetic data generation environment that supports multiple prompting strategies to produce new question-answer-code triples.
arXiv Detail & Related papers (2025-09-03T06:42:40Z)
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use [78.29315418819074]
We introduce VerlTool, a unified and modular framework that addresses limitations through systematic design principles.<n>Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms.<n>The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions.
arXiv Detail & Related papers (2025-09-01T01:45:18Z)
Leveraging Generative Models for Real-Time Query-Driven Text Summarization in Large-Scale Web Search [54.987957691350665]
Query-Driven Text Summarization (QDTS) aims to generate concise and informative summaries from textual documents based on a given query.<n>Traditional extractive summarization models, based primarily on ranking candidate summary segments, have been the dominant approach in industrial applications.<n>We propose a novel framework to pioneer the application of generative models to address real-time QDTS in industrial web search.
arXiv Detail & Related papers (2025-08-28T08:51:51Z)
Patchwork: A Unified Framework for RAG Serving [6.430565435912026]
Retrieval Augmented Generation (RAG) has emerged as a new paradigm for enhancing Large Language Model reliability through integration with external knowledge sources.<n>We introduce Patchwork, a comprehensive end-to-end RAG serving framework designed to address these efficiency bottlenecks.
arXiv Detail & Related papers (2025-05-01T18:58:26Z)
HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation [11.53083922927901]
HM-RAG is a novel Hierarchical Multi-agent Multimodal RAG framework.<n>It pioneers collaborative intelligence for dynamic knowledge synthesis across structured, unstructured, and graph-based data.
arXiv Detail & Related papers (2025-04-13T06:55:33Z)
CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models [9.419063976761175]
We propose a novel framework leveraging LLMs to optimize construction schedules in complex projects like semiconductor fabrication.<n>ConSTRUCTA addresses key challenges by: (1) integrating construction-specific knowledge through static RAG; (2) employing context-sampling techniques inspired by architectural expertise to provide relevant input; and (3) deploying Construction DPO to align schedules with expert preferences.
arXiv Detail & Related papers (2025-02-17T17:35:42Z)
Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations. Model takes two bimodal pairs as input due to known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.