AnoRefiner: Anomaly-Aware Group-Wise Refinement for Zero-Shot Industrial Anomaly Detection
- URL: http://arxiv.org/abs/2511.22595v1
- Date: Thu, 27 Nov 2025 16:25:05 GMT
- Title: AnoRefiner: Anomaly-Aware Group-Wise Refinement for Zero-Shot Industrial Anomaly Detection
- Authors: Dayou Huang, Feng Xue, Xurui Li, Yu Zhou
- Abstract summary: An anomaly-aware refiner (AnoRefiner) can be plugged into most ZSAD models and improve patch-level anomaly maps to the pixel level. First, we design an anomaly refinement decoder (ARD) that progressively enhances image features using anomaly score maps. Second, motivated by the mass production paradigm, we propose a progressive group-wise test-time training (PGT) strategy.
- Score: 7.619373121202244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot industrial anomaly detection (ZSAD) methods typically yield coarse anomaly maps as vision transformers (ViTs) extract patch-level features only. To solve this, recent solutions attempt to predict finer anomalies using features from ZSAD, but they still struggle to recover fine-grained anomalies without missed detections, mainly due to the gap between randomly synthesized training anomalies and real ones. We observe that anomaly score maps exactly provide complementary spatial cues that are largely absent from ZSAD's image features, a fact overlooked before. Inspired by this, we propose an anomaly-aware refiner (AnoRefiner) that can be plugged into most ZSAD models and improve patch-level anomaly maps to the pixel level. First, we design an anomaly refinement decoder (ARD) that progressively enhances image features using anomaly score maps, reducing the reliance on synthetic anomaly data. Second, motivated by the mass production paradigm, we propose a progressive group-wise test-time training (PGT) strategy that trains ARD in each product group for the refinement process in the next group, while staying compatible with any ZSAD method. Experiments on the MVTec AD and VisA datasets show that AnoRefiner boosts various ZSAD models by up to a 5.2% gain in pixel-AP metrics, which can also be directly observed in many visualizations. The code will be available at https://github.com/HUST-SLOW/AnoRefiner.
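The group-wise schedule described in the abstract (train on one product group, refine the next) can be illustrated with a toy sketch. This is not the authors' code: `ToyRefiner`, `split_into_groups`, and `progressive_groupwise_refinement` are hypothetical names, the thresholding "refiner" merely stands in for the ARD, and two details are assumptions that the abstract does not state, namely that the first group passes through unrefined and that a single refiner object is reused across groups.

```python
from typing import List, Sequence, Tuple


def split_into_groups(samples: Sequence, group_size: int) -> List[list]:
    """Split a stream of test samples into consecutive fixed-size groups."""
    return [list(samples[i:i + group_size])
            for i in range(0, len(samples), group_size)]


class ToyRefiner:
    """Hypothetical stand-in for the ARD: learns one threshold per group."""

    def __init__(self) -> None:
        self.groups_seen = 0   # how many groups it has been trained on
        self.threshold = 0.5   # toy parameter, not from the paper

    def fit(self, coarse_maps: List[List[float]]) -> None:
        # Toy "test-time training": set the threshold to the group's mean score.
        scores = [s for m in coarse_maps for s in m]
        self.threshold = sum(scores) / len(scores)
        self.groups_seen += 1

    def refine(self, coarse_map: List[float]) -> List[float]:
        # Toy "refinement": binarize the coarse map at the learned threshold.
        return [1.0 if s > self.threshold else 0.0 for s in coarse_map]


def progressive_groupwise_refinement(
    coarse_maps: Sequence[List[float]], group_size: int
) -> List[Tuple[int, List[float]]]:
    """Refine group k with a refiner trained on earlier groups, then train on k.

    Returns (number of groups the refiner had seen, output map) per sample.
    """
    refiner = ToyRefiner()
    outputs: List[Tuple[int, List[float]]] = []
    for group in split_into_groups(coarse_maps, group_size):
        for m in group:
            if refiner.groups_seen == 0:
                # Assumption: no trained refiner yet, keep the coarse map.
                outputs.append((0, list(m)))
            else:
                outputs.append((refiner.groups_seen, refiner.refine(m)))
        # Train on the current group only after it has been refined.
        refiner.fit(group)
    return outputs


maps = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.0, 1.0]]
results = progressive_groupwise_refinement(maps, group_size=2)
```

In this sketch each group is scored before the refiner is updated on it, so a group never benefits from training on its own samples; only the ordering of refinement and training is meant to mirror the abstract.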
Related papers
- Steering and Rectifying Latent Representation Manifolds in Frozen Multi-modal LLMs for Video Anomaly Detection [52.5174167737992]
Video anomaly detection (VAD) aims to identify abnormal events in videos. We propose SteerVAD, which advances MLLM-based VAD by shifting from passively reading to actively steering and rectifying internal representations. Our method achieves state-of-the-art performance among tuning-free approaches requiring only 1% of training data.
arXiv Detail & Related papers (2026-02-27T13:48:50Z)
- Unified Unsupervised Anomaly Detection via Matching Cost Filtering [113.43366521994396]
Unsupervised anomaly detection (UAD) aims to identify image- and pixel-level anomalies using only normal training data. We present Unified Cost Filtering (UCF), a generic post-hoc refinement framework for refining anomaly cost volume of any UAD model.
arXiv Detail & Related papers (2025-10-03T03:28:18Z)
- Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors [58.75916798814376]
We develop a few-shot anomaly detector termed FoundAD. We observe that the anomaly amount in an image directly correlates with the difference in the learnt embeddings. The simple operator acts as an effective tool for anomaly detection to characterize and identify out-of-distribution regions in an image.
arXiv Detail & Related papers (2025-10-02T11:53:20Z)
- AD-DINOv3: Enhancing DINOv3 for Zero-Shot Anomaly Detection with Anomaly-Aware Calibration [12.642531824086639]
Zero-Shot Anomaly Detection (ZSAD) seeks to identify anomalies from arbitrary novel categories. Recent vision foundation models such as DINOv3 have demonstrated strong transferable representation capabilities. We introduce AD-DINOv3, a novel vision-language multimodal framework designed for ZSAD.
arXiv Detail & Related papers (2025-09-17T15:29:25Z)
- Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation [38.76264181764036]
Anomaly detection is a practical and challenging task due to the scarcity of anomaly samples in industrial inspection. We propose a few-shot Anomaly-driven Generation (AnoGen) method, which guides the diffusion model to generate realistic and diverse anomalies. Our method builds upon DRAEM and DesTSeg as the foundation model and conducts experiments on the commonly used industrial anomaly detection dataset, MVTec.
arXiv Detail & Related papers (2025-05-14T10:25:06Z)
- Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection [109.72772150095646]
FAPrompt is a novel framework designed to learn Fine-grained Abnormality Prompts for accurate ZSAD. Experiments on 19 real-world datasets, covering both industrial defects and medical anomalies, demonstrate that FAPrompt substantially outperforms state-of-the-art methods in both image- and pixel-level ZSAD tasks.
arXiv Detail & Related papers (2024-10-14T08:41:31Z)
- View-Invariant Pixelwise Anomaly Detection in Multi-object Scenes with Adaptive View Synthesis [0.0]
We introduce and formalize Scene Anomaly Detection (Scene AD) as the task of unsupervised, pixel-wise anomaly localization. We evaluate progress in Scene AD using ToyCity, the first multi-object, multi-view real-image dataset. Our experiments demonstrate that OmniAD, when used with augmented views, yields a 64.33% increase in pixel-wise F1 score over Reverse Distillation with no augmentation.
arXiv Detail & Related papers (2024-06-26T01:54:10Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with other existing state-of-the-art PAs generation and reconstruction based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Unsupervised Visual Defect Detection with Score-Based Generative Model [17.610722842950555]
We focus on the unsupervised visual defect detection and localization tasks.
We propose a novel framework based on the recent score-based generative models.
We evaluate our method on several datasets to demonstrate its effectiveness.
arXiv Detail & Related papers (2022-11-29T11:06:29Z)
- ADTR: Anomaly Detection Transformer with Feature Reconstruction [40.68590890351697]
Anomaly detection with only prior knowledge from normal samples attracts more attention.
Existing CNN-based pixel reconstruction approaches suffer from two concerns.
We propose Anomaly Detection TRansformer (ADTR) to apply a transformer to reconstruct pre-trained features.
arXiv Detail & Related papers (2022-09-05T08:01:27Z)