Related papers: EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models

URL: http://arxiv.org/abs/2503.14162v1
Date: Tue, 18 Mar 2025 11:33:29 GMT
Title: EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models
Authors: Zongyun Zhang, Jiacheng Ruan, Xian Gao, Ting Liu, Yuzhuo Fu
Abstract summary: Industrial Anomaly Detection (IAD) is critical to ensure product quality during manufacturing. We propose a novel approach that introduces a dedicated multi-modal defect localization module to decouple the dialog functionality from the core feature extraction. We also contribute to the first multi-modal industrial anomaly detection training dataset, named Defect Detection Question Answering (DDQA).
Score: 23.898938659720503
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Industrial Anomaly Detection (IAD) is critical to ensure product quality during manufacturing. Although existing zero-shot defect segmentation and detection methods have shown effectiveness, they cannot provide detailed descriptions of the defects. Furthermore, the application of large multi-modal models in IAD remains in its infancy, facing challenges in balancing question-answering (QA) performance and mask-based grounding capabilities, often owing to overfitting during the fine-tuning process. To address these challenges, we propose a novel approach that introduces a dedicated multi-modal defect localization module to decouple the dialog functionality from the core feature extraction. This decoupling is achieved through independent optimization objectives and tailored learning strategies. Additionally, we contribute to the first multi-modal industrial anomaly detection training dataset, named Defect Detection Question Answering (DDQA), encompassing a wide range of defect types and industrial scenarios. Unlike conventional datasets that rely on GPT-generated data, DDQA ensures authenticity and reliability and offers a robust foundation for model training. Experimental results demonstrate that our proposed method, Explainable Industrial Anomaly Detection Assistant (EIAD), achieves outstanding performance in defect detection and localization tasks. It not only significantly enhances accuracy but also improves interpretability. These advancements highlight the potential of EIAD for practical applications in industrial settings.

Related papers

Region-Aware CAM: High-Resolution Weakly-Supervised Defect Segmentation via Salient Region Perception [2.9962030276180758]
This paper proposes a novel weakly supervised semantic segmentation framework. It consists of a region-aware class activation map (CAM) and pseudo-label training. The proposed framework effectively bridges the gap between weakly supervised learning and high-precision defect segmentation.
arXiv Detail & Related papers (2025-06-28T12:24:45Z)
Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
Offline Model-Based Optimization: Comprehensive Review [61.91350077539443]
Offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. Recent advances in model-based optimization have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review.
arXiv Detail & Related papers (2025-03-21T16:35:02Z)
Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift [51.24522135151649]
Anomaly detection plays a crucial role in quality control for industrial applications. Existing methods attempt to address domain shifts by training generalizable models. Our proposed method demonstrates superior results compared with state-of-the-art anomaly detection and domain adaptation methods.
arXiv Detail & Related papers (2025-03-19T05:25:52Z)
Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process [67.99194145865165]
We modify the AnyRes structure of the LLaVA model to provide the potential anomalous areas identified by existing IAD models to the LMMs. Considering that the generation of defects is closely related to the manufacturing process, we propose a manufacturing-driven IAD paradigm. We present Triad, a novel LMM-based method incorporating an expert-guided region-of-interest tokenizer and manufacturing process.
arXiv Detail & Related papers (2025-03-17T13:56:57Z)
ISP-AD: A Large-Scale Real-World Dataset for Advancing Industrial Anomaly Detection with Synthetic and Real Defects [0.0]
The Industrial Screen Printing Anomaly Detection dataset (ISP-AD) is the largest publicly available industrial dataset to date, including both synthetic and real defects collected directly from the factory floor. Experiments on a mixed supervised training approach, incorporating both synthesized and real defects, were conducted. Research findings indicate that supervision by means of both synthetic and accumulated real defects can complement each other, meeting demanded industrial inspection requirements such as low false positive rates and high recall.
arXiv Detail & Related papers (2025-03-06T21:56:31Z)
Can Multimodal Large Language Models be Guided to Improve Industrial Anomaly Detection? [5.979778557940213]
Traditional industrial anomaly detection models often struggle with flexibility and adaptability. Recent advancements in Multimodal Large Language Models (MLLMs) hold promise for overcoming these limitations. We propose Echo, a novel multi-expert framework designed to enhance MLLM performance for IAD.
arXiv Detail & Related papers (2025-01-27T05:41:10Z)
Exploring Large Vision-Language Models for Robust and Efficient Industrial Anomaly Detection [4.691083532629246]
We propose Vision-Language Anomaly Detection via Contrastive Cross-Modal Training (CLAD). CLAD aligns visual and textual features into a shared embedding space using contrastive learning. We demonstrate that CLAD outperforms state-of-the-art methods in both image-level anomaly detection and pixel-level anomaly localization.
arXiv Detail & Related papers (2024-12-01T17:00:43Z)
AAD-LLM: Adaptive Anomaly Detection Using Large Language Models [35.286105732902065]
The research aims to improve the transferability of anomaly detection models by leveraging Large Language Models (LLMs). The research also seeks to enable more collaborative decision-making between the model and plant operators.
arXiv Detail & Related papers (2024-11-01T13:43:28Z)
RADAR: Robust Two-stage Modality-incomplete Industrial Anomaly Detection [61.71770293720491]
We propose a novel two-stage Robust modAlity-imcomplete fusing and Detecting frAmewoRk, abbreviated as RADAR. Our bootstrapping philosophy is to enhance two stages in MIIAD, improving the robustness of the Multimodal Transformer. Our experimental results demonstrate that the proposed RADAR significantly surpasses conventional MIAD methods in terms of effectiveness and robustness.
arXiv Detail & Related papers (2024-10-02T16:47:55Z)
Incomplete Multimodal Industrial Anomaly Detection via Cross-Modal Distillation [0.0]
Multimodal industrial anomaly detection (IAD) based on 3D point clouds and RGB images remains a work in progress. Existing quality control processes combine rapid in-line inspections, such as optical and infrared imaging, with high-resolution but time-consuming near-line characterization techniques. We propose CMDIAD, a Cross-Modal Distillation framework for IAD to demonstrate the feasibility of a Multi-modal Training, Few-modal Inference pipeline.
arXiv Detail & Related papers (2024-05-22T12:08:56Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection [86.24898024621008]
We present a novel large multimodal model applying vision experts for industrial anomaly detection (abbreviated to Myriad). We utilize the anomaly map generated by the vision experts as guidance for LMMs, such that the vision model is guided to pay more attention to anomalous regions. Our proposed method not only performs favorably against state-of-the-art methods, but also inherits the flexibility and instruction-following ability of LMMs in the field of IAD.
arXiv Detail & Related papers (2023-10-29T16:49:45Z)
Anomaly Detection Based on Selection and Weighting in Latent Space [73.01328671569759]
We propose a novel selection-and-weighting-based anomaly detection framework called SWAD. Experiments on both benchmark and real-world datasets have shown the effectiveness and superiority of SWAD.
arXiv Detail & Related papers (2021-03-08T10:56:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.