Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections
- URL: http://arxiv.org/abs/2504.11055v1
- Date: Tue, 15 Apr 2025 10:42:25 GMT
- Title: Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections
- Authors: Alireza Salehi, Mohammadreza Salehi, Reshad Hosseini, Cees G. M. Snoek, Makoto Yamada, Mohammad Sabokrou
- Abstract summary: Anomaly Detection (AD) involves identifying deviations from normal data distributions. We propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder. Our method achieves state-of-the-art performance, improving performance by 2% to 29% across different metrics on 14 datasets.
- Score: 50.343419243749054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly Detection (AD) involves identifying deviations from normal data distributions and is critical in fields such as medical diagnostics and industrial defect detection. Traditional AD methods typically require the availability of normal training samples; however, this assumption is not always feasible, as collecting such data can be impractical. Additionally, these methods often struggle to generalize across different domains. Recent advancements, such as AnomalyCLIP and AdaCLIP, utilize the zero-shot generalization capabilities of CLIP but still face a performance gap between image-level and pixel-level anomaly detection. To address this gap, we propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder. Also, to capture fine-grained variations more effectively, we have modified the CLIP vision encoder and altered the extraction of dense features. These changes ensure that the features retain richer spatial and structural information for both normal and anomalous prompts. Our method achieves state-of-the-art performance, improving performance by 2% to 29% across different metrics on 14 datasets. This demonstrates its effectiveness in both image-level and pixel-level anomaly detection.
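The core idea in the abstract, conditioning the text prompts on image context and then scoring dense visual features against normal and anomalous prompts, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the additive conditioning at the embedding level, the projection matrix `proj`, the temperature value, and the random stand-in features are all assumptions made purely for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def condition_prompts(prompt_embeds, image_context, proj):
    """Shift both prompt embeddings by a projection of the image context.

    prompt_embeds: (2, d) embeddings for the [normal, anomalous] prompts
    image_context: (d,) global image feature (assumed to come from the vision encoder)
    proj:          (d, d) hypothetical projection, learned in a real system
    """
    return prompt_embeds + image_context @ proj

def anomaly_scores(patch_feats, prompt_embeds, temperature=0.07):
    """Return, for each patch, the softmax probability of the anomalous prompt."""
    p = l2_normalize(patch_feats)
    t = l2_normalize(prompt_embeds)
    logits = p @ t.T / temperature                # (n_patches, 2)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, 1]

# Random stand-ins for encoder outputs (assumption: d=16, a 3x3 grid of patches).
rng = np.random.default_rng(0)
d, n_patches = 16, 9
patches = rng.normal(size=(n_patches, d))
prompts = rng.normal(size=(2, d))
context = patches.mean(axis=0)                    # crude global image context
proj = 0.1 * rng.normal(size=(d, d))

conditioned = condition_prompts(prompts, context, proj)
pixel_scores = anomaly_scores(patches, conditioned)   # pixel-level (per-patch) scores
image_score = float(pixel_scores.max())               # one simple image-level score
```

Taking the maximum patch score as the image-level score is only one common choice; the abstract does not say how the two levels are combined.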
Related papers
- Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time [60.341117019125214]
We propose a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in graph anomaly detection (GAD). To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
arXiv Detail & Related papers (2025-11-10T12:10:05Z) - Learning local and global prototypes with optimal transport for unsupervised anomaly detection and localization [0.28647133890966986]
Unsupervised anomaly detection aims to detect defective parts of a sample by having access, during training, to a set of normal, i.e. defect-free, data. We propose a novel UAD method based on prototype learning and introduce a metric to compare a structured set of embeddings.
arXiv Detail & Related papers (2025-08-18T13:51:36Z) - Zero-Shot Anomaly Detection with Dual-Branch Prompt Selection [17.263625932911534]
Zero-shot anomaly detection (ZSAD) enables identifying and localizing defects in unseen categories. Existing ZSAD methods, whether using fixed or learned prompts, struggle under domain shifts because their training data are derived from limited training domains. We introduce PILOT, a framework designed to overcome these challenges through two key innovations.
arXiv Detail & Related papers (2025-08-01T17:00:12Z) - AF-CLIP: Zero-Shot Anomaly Detection via Anomaly-Focused CLIP Adaptation [8.252046294696585]
We propose AF-CLIP (Anomaly-Focused CLIP) by dramatically enhancing its visual representations to focus on local defects. Our approach introduces a lightweight adapter that emphasizes anomaly-relevant patterns in visual features. Our method is also extended to few-shot scenarios by extra memory banks.
arXiv Detail & Related papers (2025-07-26T13:34:38Z) - Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z) - MadCLIP: Few-shot Medical Anomaly Detection with CLIP [14.023527193608142]
An innovative few-shot anomaly detection approach is presented, leveraging the pre-trained CLIP model for medical data. A dual-branch design is proposed to separately capture normal and abnormal features through learnable adapters. To improve semantic alignment, learnable text prompts are employed to link visual features.
arXiv Detail & Related papers (2025-06-30T12:56:17Z) - Domain Adaptation of Attention Heads for Zero-shot Anomaly Detection [2.971139973290946]
We propose HeadCLIP to overcome limitations by effectively adapting both text and image encoders to the domain. HeadCLIP generalizes the concepts of normality and abnormality through learnable prompts in the text encoder. We show that HeadCLIP outperforms existing ZSAD techniques at both pixel and image levels.
arXiv Detail & Related papers (2025-05-28T11:45:51Z) - Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift [51.24522135151649]
Anomaly detection plays a crucial role in quality control for industrial applications. Existing methods attempt to address domain shifts by training generalizable models. Our proposed method demonstrates superior results compared with state-of-the-art anomaly detection and domain adaptation methods.
arXiv Detail & Related papers (2025-03-19T05:25:52Z) - Breaking the Bias: Recalibrating the Attention of Industrial Anomaly Detection [20.651257973799527]
Recalibrating Attention of Industrial Anomaly Detection (RAAD) is a framework that systematically decomposes and recalibrates attention maps. HQS dynamically adjusts bit-widths based on the hierarchical nature of attention maps. We validate the effectiveness of RAAD on 32 datasets using a single 3090ti.
arXiv Detail & Related papers (2024-12-11T08:31:47Z) - Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection [88.34095233600719]
FAPrompt is a novel framework designed to learn Fine-grained Abnormality Prompts for more accurate ZSAD.
It substantially outperforms state-of-the-art methods by at least 3%-5% AUC/AP in both image- and pixel-level ZSAD tasks.
arXiv Detail & Related papers (2024-10-14T08:41:31Z) - AnoPLe: Few-Shot Anomaly Detection via Bi-directional Prompt Learning with Only Normal Samples [6.260747047974035]
AnoPLe is a multi-modal prompt learning method designed for anomaly detection without prior knowledge of anomalies.
The experimental results demonstrate that AnoPLe achieves strong FAD performance, recording 94.1% and 86.2% Image AUROC on MVTec-AD and VisA respectively.
arXiv Detail & Related papers (2024-08-24T08:41:19Z) - GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results on six and on-par performance on the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z) - Global-Regularized Neighborhood Regression for Efficient Zero-Shot Texture Anomaly Detection [7.2414696513135155]
This paper introduces a novel zero-shot texture anomaly detection method named Global-Regularized Neighborhood Regression (GRNR).
GRNR can detect anomalies on arbitrary textured surfaces without any training data or cost.
We validate the effectiveness of GRNR across various industrial scenarios using eight benchmark datasets.
arXiv Detail & Related papers (2024-06-11T15:02:16Z) - Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates comparable performance with current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Shifting Focus: From Global Semantics to Local Prominent Features in Swin-Transformer for Knee Osteoarthritis Severity Assessment [42.09313885494969]
We harness the Swin Transformer's capacity to discern extended spatial dependencies within images through the hierarchical framework.
Our novel contribution lies in refining local feature representations, orienting them specifically toward the final distribution of the classifier.
Our model demonstrates significant robustness and precision, as evidenced by extensive validation on two established benchmarks for Knee OsteoArthritis (KOA) grade classification.
arXiv Detail & Related papers (2024-03-15T01:09:58Z) - Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with other existing state-of-the-art PAs generation and reconstruction based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection [30.679012320439625]
AnomalyCLIP learns object-agnostic text prompts to capture generic normality and abnormality in an image.
It achieves superior zero-shot performance of detecting and segmenting anomalies in datasets of highly diverse class semantics.
arXiv Detail & Related papers (2023-10-29T10:03:49Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder [3.31490164885582]
We propose a vision transformer-based encoder-decoder model, named AnoViT, to reflect normal information by additionally learning the global relationship between image patches.
The proposed model performed better than the convolution-based model on three benchmark datasets.
arXiv Detail & Related papers (2022-03-21T09:01:37Z) - Contrastive Predictive Coding for Anomaly Detection [0.0]
We apply the Contrastive Predictive Coding model (arXiv:1807.03748) to anomaly detection and segmentation.
We show that its patch-wise contrastive loss can directly be interpreted as an anomaly score.
The model achieves promising results for both anomaly detection and segmentation on the MVTec-AD dataset.
arXiv Detail & Related papers (2021-07-16T11:04:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.