Related papers: A SAM-guided Two-stream Lightweight Model for Anomaly Detection

URL: http://arxiv.org/abs/2402.19145v2
Date: Tue, 19 Nov 2024 15:54:14 GMT
Title: A SAM-guided Two-stream Lightweight Model for Anomaly Detection
Authors: Chenghao Li, Lei Qi, Xin Geng
Abstract summary: We propose a SAM-guided Two-stream Lightweight Model for unsupervised anomaly detection (STLM). Experiments on the MVTec AD benchmark show that STLM, with about 16M parameters and an inference time of 20 ms, competes effectively with state-of-the-art methods.
Score: 44.73985145110819
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In industrial anomaly detection, model efficiency and mobile-friendliness become the primary concerns in real-world applications. Simultaneously, the impressive generalization capabilities of Segment Anything (SAM) have garnered broad academic attention, making it an ideal choice for localizing unseen anomalies and diverse real-world patterns. In this paper, considering these two critical factors, we propose a SAM-guided Two-stream Lightweight Model for unsupervised anomaly detection (STLM) that not only aligns with the two practical application requirements but also harnesses the robust generalization capabilities of SAM. We employ two lightweight image encoders, i.e., our two-stream lightweight module, guided by SAM's knowledge. Specifically, one stream is trained to generate discriminative and general feature representations in both normal and anomalous regions, while the other stream reconstructs the same images without anomalies, which effectively enhances the differentiation of the two-stream representations in anomalous regions. Furthermore, we employ a shared mask decoder and a feature aggregation module to generate anomaly maps. Our experiments on the MVTec AD benchmark show that STLM, with about 16M parameters and an inference time of 20 ms, competes effectively with state-of-the-art methods, reaching 98.26% pixel-level AUC and 94.92% PRO. We further experiment on more difficult datasets, e.g., VisA and DAGM, to demonstrate the effectiveness and generalizability of STLM.
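The two-stream idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: STLM uses a shared mask decoder and a feature aggregation module, whereas this sketch substitutes a plain per-pixel cosine distance between the two streams' feature maps. The function names (`cosine_distance`, `anomaly_map`, `pixel_auroc`) and the nested-list feature layout are hypothetical, chosen only for illustration. The last function shows one common way to compute a pixel-level AUC from such a map.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector counts as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(stream_a_feats, stream_b_feats):
    """Per-pixel discrepancy between two H x W x C feature maps (nested lists).

    stream_a_feats: features that keep anomalies visible (hypothetical name).
    stream_b_feats: features of the anomaly-free reconstruction (hypothetical name).
    Where the streams disagree, the score is high.
    """
    return [
        [cosine_distance(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(stream_a_feats, stream_b_feats)
    ]


def pixel_auroc(scores, labels):
    """Chance that a random anomalous pixel outscores a random normal pixel.

    Ties count as half a win. This is the O(n^2) pairwise definition of AUROC,
    fine for a toy example but too slow for full-resolution images.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("labels must contain both normal and anomalous pixels")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy 1 x 2 image with 2-channel features: the streams agree on the first
# pixel and disagree on the second.
stream_a = [[[1.0, 0.0], [0.0, 1.0]]]
stream_b = [[[1.0, 0.0], [1.0, 0.0]]]
amap = anomaly_map(stream_a, stream_b)  # [[0.0, 1.0]]
flat_scores = [s for row in amap for s in row]
auroc = pixel_auroc(flat_scores, [0, 1])  # 1.0
```

In practice the feature maps would come from the two trained encoders and the map would be upsampled to image resolution before scoring. Those steps are omitted here.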

Related papers

Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection [12.964308630328688]
Infrared small target detection (ISTD) is vital for long-range surveillance in military, maritime, and early warning applications. ISTD is challenged by targets occupying less than 0.15% of the image and low distinguishability from complex backgrounds. This paper presents SAMamba, a novel framework integrating SAM2's hierarchical feature learning with Mamba's selective sequence modeling.
arXiv Detail & Related papers (2025-05-29T07:55:23Z)
SuperAD: A Training-free Anomaly Classification and Segmentation Method for CVPR 2025 VAND 3.0 Workshop Challenge Track 1: Adapt & Detect [17.160007050126403]
We propose SuperAD, a fully training-free anomaly detection and segmentation method based on feature extraction with the DINOv2 model. Our method achieves competitive results on both test sets of the MVTec AD 2 dataset.
arXiv Detail & Related papers (2025-05-26T09:29:27Z)
Learning Multi-view Multi-class Anomaly Detection [10.199404082194947]
We introduce a Multi-View Multi-Class Anomaly Detection model (MVMCAD), which integrates information from multiple views to accurately identify anomalies. Specifically, we propose a semi-frozen encoder, where a pre-encoder prior enhancement mechanism is added before the frozen encoder. We also propose an Anomaly Amplification Module (AAM) that models global token interactions and suppresses normal regions, and a Cross-Feature Loss that aligns shallow encoder features with deep decoder features.
arXiv Detail & Related papers (2025-04-30T03:59:58Z)
Real-Time Anomaly Detection with Synthetic Anomaly Monitoring (SAM) [2.055524866851853]
Anomaly detection is essential for identifying rare and significant events across diverse domains such as finance, cybersecurity, and network monitoring. This paper presents Synthetic Anomaly Monitoring (SAM), an innovative approach that applies synthetic control methods from causal inference to improve the accuracy and interpretability of anomaly detection processes.
arXiv Detail & Related papers (2025-01-30T15:15:17Z)
PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model [76.95536611263356]
PolSAR data presents unique challenges due to its rich and complex characteristics. Existing data representations, such as complex-valued data, polarimetric features, and amplitude images, are widely used. Most feature extraction networks for PolSAR are small, limiting their ability to capture features effectively. We propose the Polarimetric Scattering Mechanism-Informed SAM (PolSAM), an enhanced Segment Anything Model (SAM) that integrates domain-specific scattering characteristics and a novel prompt generation strategy.
arXiv Detail & Related papers (2024-12-17T09:59:53Z)
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning [63.55145330447408]
Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability. Existing methods that directly apply SAM through prompting often overlook the domain shift issue. We propose a novel Self-Perception Tuning (SPT) method, aiming to enhance SAM's perception capability for anomaly segmentation.
arXiv Detail & Related papers (2024-11-26T08:33:25Z)
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery [54.866490321241905]
Model merging-based multitask learning (MTL) offers a promising approach for performing MTL by merging multiple expert models. In this paper, we examine the merged model's representation distribution and uncover a critical issue of "representation bias". This bias arises from a significant distribution gap between the representations of the merged and expert models, leading to the suboptimal performance of the merged MTL model.
arXiv Detail & Related papers (2024-10-18T11:49:40Z)
Adapt CLIP as Aggregation Instructor for Image Dehazing [17.29370328189668]
Most dehazing methods suffer from a limited receptive field and do not explore the rich semantic prior encapsulated in vision-language models. We introduce CLIPHaze, a pioneering hybrid framework that synergizes the efficient global modeling of Mamba with the prior knowledge and zero-shot capabilities of CLIP. Our method employs a parallel state space model and window-based self-attention to obtain global contextual dependency and local fine-grained perception.
arXiv Detail & Related papers (2024-08-22T11:51:50Z)
SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention [0.0]
The Segment Anything Model (SAM) has gained notable recognition for its exceptional performance in image segmentation. Camouflaged objects typically blend into the background, making them difficult to distinguish in still images. We propose a new method called the SAM Propagation Module (SAM-PM) to overcome these challenges. Our method effectively incorporates temporal consistency and domain-specific expertise into the segmentation network while adding less than 1% of SAM's parameters.
arXiv Detail & Related papers (2024-06-09T14:33:38Z)
DMAD: Dual Memory Bank for Real-World Anomaly Detection [90.97573828481832]
We propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD). DMAD employs a dual memory bank to calculate feature distance and feature attention between normal and abnormal patterns. We evaluate DMAD on the MVTec-AD and VisA datasets.
arXiv Detail & Related papers (2024-03-19T02:16:32Z)
WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images [8.179859593451285]
We present WSI-SAM, enhancing Segment Anything Model (SAM) with precise object segmentation capabilities for histopathology images. To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters. Our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task, respectively.
arXiv Detail & Related papers (2024-03-14T10:30:43Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection [15.991784541576788]
Existing approaches, both video and segment-level label oriented, mainly focus on extracting representations for anomaly data. We propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data. Our method outperforms the state-of-the-art methods by a sizable margin.
arXiv Detail & Related papers (2023-02-10T10:39:40Z)
Prototypical Residual Networks for Anomaly Detection and Localization [80.5730594002466]
We propose a framework called Prototypical Residual Network (PRN). PRN learns feature residuals of varying scales and sizes between anomalous and normal patterns to accurately reconstruct the segmentation maps of anomalous regions. We present a variety of anomaly generation strategies that consider both seen and unseen appearance variance to enlarge and diversify anomalies.
arXiv Detail & Related papers (2022-12-05T05:03:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.