Related papers: MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection


URL: http://arxiv.org/abs/2504.12970v1
Date: Thu, 17 Apr 2025 14:22:27 GMT
Title: MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection
Authors: Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang
Abstract summary: Anomaly detection is a crucial task in computer vision, yet collecting real-world defect images is inherently difficult. We introduce a novel pipeline that generates synthetic anomalies through Math-Physics model guidance. By incorporating physical modeling of cracks, corrosion, and deformation, our method produces realistic defect masks.
Score: 30.77558600436759
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Anomaly detection is a crucial task in computer vision, yet collecting real-world defect images is inherently difficult due to the rarity and unpredictability of anomalies. Consequently, researchers have turned to synthetic methods for training data augmentation. However, existing synthetic strategies (e.g., naive cut-and-paste or inpainting) overlook the underlying physical causes of defects, leading to inconsistent, low-fidelity anomalies that hamper model generalization to real-world complexities. In this thesis, we introduce a novel pipeline that generates synthetic anomalies through Math-Physics model guidance, refines them via a Coarse-to-Fine approach, and employs a bi-level optimization strategy with a Synthesis Quality Estimator (SQE). By incorporating physical modeling of cracks, corrosion, and deformation, our method produces realistic defect masks, which are subsequently enhanced in two phases. The first stage (npcF) enforces a PDE-based consistency to achieve a globally coherent anomaly structure, while the second stage (npcF++) further improves local fidelity using wavelet transforms and boundary synergy blocks. Additionally, we leverage SQE-driven weighting, ensuring that high-quality synthetic samples receive greater emphasis during training. To validate our approach, we conducted comprehensive experiments on three widely adopted industrial anomaly detection benchmarks: MVTec AD, VisA, and BTAD. Across these datasets, the proposed pipeline achieves state-of-the-art (SOTA) results in both image-AUROC and pixel-AUROC, confirming the effectiveness of our MaPhC2F and BiSQAD.
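Illustration: the abstract says that SQE-driven weighting gives high-quality synthetic samples greater emphasis during training, but it does not give the formula. The sketch below shows one plausible reading, in which per-sample quality scores are softmax-normalised into weights for a per-sample loss. The function names (sqe_weights, weighted_loss), the temperature parameter, and the example numbers are hypothetical and are not taken from the paper.

```python
import math

def sqe_weights(quality_scores, temperature=1.0):
    """Turn hypothetical quality scores into weights that sum to 1 (softmax)."""
    scaled = [q / temperature for q in quality_scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_loss(per_sample_losses, quality_scores, temperature=1.0):
    """Quality-weighted average of per-sample losses (assumed form)."""
    weights = sqe_weights(quality_scores, temperature)
    return sum(w * l for w, l in zip(weights, per_sample_losses))

# Made-up losses and quality scores for three synthetic samples.
losses = [0.9, 0.4, 0.7]
scores = [0.2, 0.8, 0.5]
weights = sqe_weights(scores)
loss = weighted_loss(losses, scores)
```

With equal quality scores the weights are uniform and the result reduces to the plain mean loss; a lower temperature concentrates the weight on the highest-scored samples.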

Related papers

Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection [30.77558600436759]
ARAS is a language-conditioned, auto-regressive anomaly synthesis approach. It injects local, text-specified defects into normal images via token-anchored latent editing. It significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies.
arXiv Detail & Related papers (2025-08-05T15:07:32Z)
Enhancing Generalization in Data-free Quantization via Mixup-class Prompting [8.107092196905157]
Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images using generative models such as generative adversarial networks (GANs) and text-conditioned latent diffusion models (LDMs). We propose mixup-class prompt, a mixup-based text prompting strategy that fuses multiple class labels at the text prompt level to generate diverse, robust synthetic data.
arXiv Detail & Related papers (2025-07-29T16:00:20Z)
DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning [9.221916791064407]
Data-Free Quantization (DFQ) enables the quantization of Vision Transformers (ViTs) without requiring access to data, allowing for the deployment of ViTs on devices with limited resources. Existing methods fail to fully capture and balance the global and local features within the samples, resulting in limited synthetic data quality. We propose a pipeline for Data-Free Quantization for Vision Transformers (DFQ-ViT).
arXiv Detail & Related papers (2025-07-19T04:32:04Z)
Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
You Only Train Once [11.97836331714694]
You Only Train Once (YOTO) contributes to limiting training to one shot for the latter aspect of losses selection and weighting. We leverage the differentiability of the composite loss formulation which is widely used for optimizing multiple empirical losses simultaneously. We show that YOTO consistently outperforms the best grid-search model on unseen test data.
arXiv Detail & Related papers (2025-06-04T18:04:58Z)
Solving Inverse Problems with FLAIR [59.02385492199431]
Flow-based latent generative models are able to generate images with remarkable quality, even enabling text-to-image generation. We present FLAIR, a novel training-free variational framework that leverages flow-based generative models as a prior for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z)
ACMamba: Fast Unsupervised Anomaly Detection via An Asymmetrical Consensus State Space Model [51.83639270669481]
Unsupervised anomaly detection in hyperspectral images (HSI) aims to detect unknown targets from backgrounds. HSI studies are hindered by steep computational costs due to the high-dimensional property of HSI and dense sampling-based training paradigm. We propose an Asymmetrical Consensus State Space Model (ACMamba) to significantly reduce computational costs without compromising accuracy.
arXiv Detail & Related papers (2025-04-16T05:33:42Z)
Strengthening Anomaly Awareness [0.0]
We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection. Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy.
arXiv Detail & Related papers (2025-04-15T16:52:22Z)
Ultra-Resolution Adaptation with Ease [62.56434979517156]
We propose a set of key guidelines for ultra-resolution adaptation termed URAE. We show that tuning minor components of the weight matrices outperforms widely-used low-rank adapters when synthetic data are unavailable. Experiments validate that URAE achieves comparable 2K-generation performance to state-of-the-art closed-source models like FLUX1.1 [Pro] Ultra with only 3K samples and 2K iterations.
arXiv Detail & Related papers (2025-03-20T16:44:43Z)
Progressive Boundary Guided Anomaly Synthesis for Industrial Anomaly Detection [1.5680795779726031]
Unsupervised anomaly detection methods can identify surface defects in industrial images by leveraging only normal samples for training. We propose a novel Progressive Boundary-guided Anomaly Synthesis (PBAS) strategy, which can directionally synthesize crucial feature-level anomalies without auxiliary textures. Our method achieves state-of-the-art performance and the fastest detection speed on three widely used industrial datasets.
arXiv Detail & Related papers (2024-12-23T10:26:26Z)
Diffusion Prior Interpolation for Flexibility Real-World Face Super-Resolution [48.34173818491552]
Diffusion Prior Interpolation (DPI) can balance consistency and diversity and can be seamlessly integrated into pre-trained models. In extensive experiments conducted on synthetic and real datasets, DPI demonstrates superiority over SOTA FSR methods.
arXiv Detail & Related papers (2024-12-21T09:28:44Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal PAs by inpainting a masked-out region of an image. In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting. Our method performs on par with other existing state-of-the-art PAs generation and reconstruction based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
You Only Train Once: A Unified Framework for Both Full-Reference and No-Reference Image Quality Assessment [45.62136459502005]
We propose a network to perform full reference (FR) and no reference (NR) IQA. We first employ an encoder to extract multi-level features from input images. A Hierarchical Attention (HA) module is proposed as a universal adapter for both FR and NR inputs. A Semantic Distortion Aware (SDA) module is proposed to examine feature correlations between shallow and deep layers of the encoder.
arXiv Detail & Related papers (2023-10-14T11:03:04Z)
A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies. Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance. Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
Achieving state-of-the-art performance in the Medical Out-of-Distribution (MOOD) challenge using plausible synthetic anomalies [0.5677301320664404]
Unsupervised anomaly detection, or Out-of-Distribution detection, aims at identifying anomalous samples. Our method builds upon the self-supervised strategy consisting of training a segmentation network to identify local synthetic anomalies. Our contributions improve the synthetic anomaly generation process, making synthetic anomalies more heterogeneous.
arXiv Detail & Related papers (2023-08-02T20:16:13Z)
Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties. We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate". We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
Dueling Deep Q-Network for Unsupervised Inter-frame Eye Movement Correction in Optical Coherence Tomography Volumes [5.371290280449071]
In optical coherence tomography (OCT) volumes of the retina, the sequential acquisition of the individual slices makes this modality prone to motion artifacts. Speckle noise, which is characteristic of this imaging modality, leads to inaccuracies when traditional registration techniques are employed. In this paper, we tackle these issues by using deep reinforcement learning to correct inter-frame movements in an unsupervised manner.
arXiv Detail & Related papers (2020-07-03T07:14:30Z)
Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and Wild [98.48284827503409]
We develop a unified BIQA model and an approach of training it for both synthetic and realistic distortions. We employ the fidelity loss to optimize a deep neural network for BIQA over a large number of such image pairs. Experiments on six IQA databases show the promise of the learned method in blindly assessing image quality in the laboratory and wild.
arXiv Detail & Related papers (2020-05-28T13:35:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.