Related papers: AnomalyMoE: Towards a Language-free Generalist Model for Unified Visual Anomaly Detection

URL: http://arxiv.org/abs/2508.06203v1
Date: Fri, 08 Aug 2025 10:33:18 GMT
Title: AnomalyMoE: Towards a Language-free Generalist Model for Unified Visual Anomaly Detection
Authors: Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Wei Ge, Ming Tang, Jinqiao Wang
Abstract summary: AnomalyMoE is a novel and universal anomaly detection framework based on a Mixture-of-Experts architecture. Our key insight is to decompose the complex anomaly detection problem into three distinct semantic hierarchies. AnomalyMoE employs three dedicated expert networks at the patch, component, and global levels.
Score: 29.06542941993374
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Anomaly detection is a critical task across numerous domains and modalities, yet existing methods are often highly specialized, limiting their generalizability. These specialized models, tailored for specific anomaly types like textural defects or logical errors, typically exhibit limited performance when deployed outside their designated contexts. To overcome this limitation, we propose AnomalyMoE, a novel and universal anomaly detection framework based on a Mixture-of-Experts (MoE) architecture. Our key insight is to decompose the complex anomaly detection problem into three distinct semantic hierarchies: local structural anomalies, component-level semantic anomalies, and global logical anomalies. AnomalyMoE correspondingly employs three dedicated expert networks at the patch, component, and global levels, each specialized in reconstructing features and identifying deviations at its designated semantic level. This hierarchical design allows a single model to concurrently understand and detect a wide spectrum of anomalies. Furthermore, we introduce an Expert Information Repulsion (EIR) module to promote expert diversity and an Expert Selection Balancing (ESB) module to ensure the comprehensive utilization of all experts. Experiments on 8 challenging datasets spanning industrial imaging, 3D point clouds, medical imaging, video surveillance, and logical anomaly detection demonstrate that AnomalyMoE establishes new state-of-the-art performance, significantly outperforming specialized methods in their respective domains.

Related papers

MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection [6.6626674107399495]
MoECLIP is a Mixture-of-Experts architecture for the Zero-Shot Anomaly Detection (ZSAD) task. It achieves patch-level adaptation by dynamically routing each image patch to a specialized Low-Rank Adaptation (LoRA) expert based on its unique characteristics.
arXiv Detail & Related papers (2026-03-03T15:36:55Z)
Learning Discriminative and Generalizable Anomaly Detector for Dynamic Graph with Limited Supervision [31.57563937222115]
Dynamic graph anomaly detection (DGAD) is critical for many real-world applications but remains challenging due to the scarcity of labeled anomalies. We propose an effective, generalizable, and model-agnostic framework with three main components.
arXiv Detail & Related papers (2026-02-23T16:25:35Z)
MAU-GPT: Enhancing Multi-type Industrial Anomaly Understanding via Anomaly-aware and Generalist Experts Adaptation [31.60185302007424]
We introduce MAU-Set, a comprehensive dataset for Multi-type industrial Anomaly Understanding. We then present MAU-GPT, a domain-adapted multimodal large model specifically designed for industrial anomaly understanding. It incorporates a novel AMoE-LoRA mechanism that unifies anomaly-aware and generalist experts adaptation, enhancing both understanding and reasoning across diverse defect classes.
arXiv Detail & Related papers (2026-01-31T05:36:49Z)
CASL: Curvature-Augmented Self-supervised Learning for 3D Anomaly Detection [49.74534277563012]
We propose a Curvature-Augmented Self-supervised Learning (CASL) framework based on a reconstruction paradigm. Our approach introduces multi-scale curvature prompts to guide the decoder in predicting the spatial coordinates of each point. It achieves leading detection performance through straightforward anomaly classification fine-tuning.
arXiv Detail & Related papers (2025-11-17T02:58:09Z)
CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [49.11819337853632]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
Towards Training-free Anomaly Detection with Vision and Language Foundation Models [17.991678161890174]
Anomaly detection is valuable for real-world applications, such as industrial quality inspection. We introduce LogSAD, a novel multi-modal framework that requires no training for both Logical and Structural Anomaly Detection.
arXiv Detail & Related papers (2025-03-24T04:07:59Z)
ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly. Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset. Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection [59.34318192698142]
We introduce a prior-less anomaly generation paradigm and develop an innovative unsupervised anomaly detection framework named GRAD. PatchDiff effectively exposes various types of anomaly patterns. Experiments on both MVTec AD and MVTec LOCO datasets also support the aforementioned observation.
arXiv Detail & Related papers (2023-12-26T07:08:06Z)
Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal. Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos. This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection [86.24898024621008]
We present a novel large multimodal model applying vision experts for industrial anomaly detection (abbreviated to Myriad). We utilize the anomaly map generated by the vision experts as guidance for LMMs, such that the vision model is guided to pay more attention to anomalous regions. Our proposed method not only performs favorably against state-of-the-art methods, but also inherits the flexibility and instruction-following ability of LMMs in the field of IAD.
arXiv Detail & Related papers (2023-10-29T16:49:45Z)
Learning Global-Local Correspondence with Semantic Bottleneck for Logical Anomaly Detection [6.553276620691242]
This paper presents a novel framework, named Global-Local Correspondence Framework (GLCF), for visual anomaly detection with logical constraints. Visual anomaly detection has become an active research area in various real-world applications, such as industrial anomaly detection and medical disease diagnosis.
arXiv Detail & Related papers (2023-03-10T08:09:40Z)
Prototypical Residual Networks for Anomaly Detection and Localization [80.5730594002466]
We propose a framework called Prototypical Residual Network (PRN). PRN learns feature residuals of varying scales and sizes between anomalous and normal patterns to accurately reconstruct the segmentation maps of anomalous regions. We present a variety of anomaly generation strategies that consider both seen and unseen appearance variance to enlarge and diversify anomalies.
arXiv Detail & Related papers (2022-12-05T05:03:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.