Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection
- URL: http://arxiv.org/abs/2508.03539v1
- Date: Tue, 05 Aug 2025 15:07:32 GMT
- Title: Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection
- Authors: Long Qian, Bingke Zhu, Yingying Chen, Ming Tang, Jinqiao Wang
- Abstract summary: ARAS is a language-conditioned, auto-regressive anomaly synthesis approach. It injects local, text-specified defects into normal images via token-anchored latent editing. It significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies.
- Score: 30.77558600436759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite substantial progress in anomaly synthesis methods, existing diffusion-based and coarse inpainting pipelines commonly suffer from structural deficiencies such as micro-structural discontinuities, limited semantic controllability, and inefficient generation. To overcome these limitations, we introduce ARAS, a language-conditioned, auto-regressive anomaly synthesis approach that precisely injects local, text-specified defects into normal images via token-anchored latent editing. Leveraging a hard-gated auto-regressive operator and a training-free, context-preserving masked sampling kernel, ARAS significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies. Integrated within our Quality-Aware Re-weighted Anomaly Detection (QARAD) framework, we further propose a dynamic weighting strategy that emphasizes high-quality synthetic samples by computing an image-text similarity score with a dual-encoder model. Extensive experiments across three benchmark datasets (MVTec AD, VisA, and BTAD) demonstrate that our QARAD outperforms SOTA methods in both image- and pixel-level anomaly detection tasks, achieving improved accuracy, robustness, and a 5x synthesis speedup compared to diffusion-based alternatives. Our complete code and synthesized dataset will be publicly available.
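The abstract describes a dynamic weighting strategy that emphasizes high-quality synthetic samples using an image-text similarity score. The following is a minimal sketch of that idea only, not the authors' implementation: the abstract does not give the weighting function, so the temperature-scaled softmax, the function names (`cosine_similarity`, `quality_weights`, `reweighted_loss`), and the toy two-dimensional embeddings are all assumptions made for illustration. In practice the embeddings would come from a dual-encoder model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def quality_weights(similarities, temperature=0.1):
    """Map similarity scores to per-sample weights (hypothetical scheme).

    A softmax over the scores, rescaled so the weights average to 1.
    Samples whose image matches the text prompt better get larger weights.
    """
    top = max(similarities)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in similarities]
    total = sum(exps)
    n = len(similarities)
    return [n * e / total for e in exps]


def reweighted_loss(losses, weights):
    """Weighted mean of per-sample losses."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy stand-ins for dual-encoder outputs (assumed values, not real features).
image_embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
text_embedding = [1.0, 0.0]

similarities = [cosine_similarity(e, text_embedding) for e in image_embeddings]
weights = quality_weights(similarities)
per_sample_losses = [0.5, 0.5, 0.5]

print("similarities:", similarities)
print("weights:", weights)
print("loss:", reweighted_loss(per_sample_losses, weights))
```

Rescaling the weights to average 1 keeps the loss on the same scale as an unweighted mean, so the weighting changes only the relative emphasis between samples.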
Related papers
- ERGO: Excess-Risk-Guided Optimization for High-Fidelity Monocular 3D Gaussian Splatting [63.138778159026934]
We propose an adaptive optimization framework guided by excess risk decomposition, termed ERGO. ERGO dynamically estimates the view-specific excess risk and adaptively adjusts loss weights during optimization. Experiments on the Google Scanned Objects dataset and the OmniObject3D dataset demonstrate the superiority of ERGO over existing state-of-the-art methods.
arXiv Detail & Related papers (2026-02-10T20:44:43Z) - Towards Syn-to-Real IQA: A Novel Perspective on Reshaping Synthetic Data Distributions [74.00222571094437]
Blind Image Quality Assessment (BIQA) has advanced significantly through deep learning, but the scarcity of large-scale labeled datasets remains a challenge. We make a key observation that representations learned from synthetic datasets often exhibit a discrete and clustered pattern that hinders regression performance. We introduce a novel framework SynDR-IQA, which reshapes synthetic data distribution to enhance BIQA generalization.
arXiv Detail & Related papers (2026-01-01T06:11:16Z) - Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework [27.71603877164877]
Optical and Synthetic Aperture Radar (SAR) fusion-based object detection has attracted significant research interest in remote sensing. We propose a novel Quality-Aware Dynamic Fusion Network (QDFNet) for robust optical-SAR object detection.
arXiv Detail & Related papers (2025-12-27T03:16:48Z) - A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - RareFlow: Physics-Aware Flow-Matching for Cross-Sensor Super-Resolution of Rare-Earth Features [27.505614464585538]
We present RareFlow, a physics-aware SR framework designed for out-of-distribution (OOD) robustness. A Gated ControlNet preserves fine-grained geometric fidelity from the low-resolution input, while textual prompts provide semantic guidance for synthesizing complex features. In blind evaluations, geophysical experts rated our model's outputs as approaching the fidelity of ground truth imagery, significantly outperforming state-of-the-art baselines.
arXiv Detail & Related papers (2025-10-27T19:56:43Z) - $\f{D^3}$QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection [85.9202830503973]
Visual autoregressive (AR) models generate images through discrete token prediction.<n>We propose to leverage Discrete Distribution Discrepancy-aware Quantization Error (D$3$QE) for autoregressive-generated image detection.
arXiv Detail & Related papers (2025-10-07T13:02:27Z) - Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z) - MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection [30.77558600436759]
We introduce a novel and lightweight pipeline that generates synthetic anomalies through Math-Phys model guidance. Our method produces realistic defect masks, which are subsequently enhanced in two phases. To validate our method, we conduct experiments on three anomaly detection benchmarks: MVTec AD, VisA, and BTAD.
arXiv Detail & Related papers (2025-04-17T14:22:27Z) - Component-aware Unsupervised Logical Anomaly Generation for Industrial Anomaly Detection [31.27483219228598]
Anomaly detection is critical in industrial manufacturing for ensuring product quality and improving efficiency in automated processes. Recent generative models often produce unrealistic anomalies, increasing false positives, or require real-world anomaly samples for training. We propose ComGEN, a component-aware and unsupervised framework that addresses the gap in logical anomaly generation.
arXiv Detail & Related papers (2025-02-17T11:54:43Z) - Progressive Boundary Guided Anomaly Synthesis for Industrial Anomaly Detection [1.5680795779726031]
Unsupervised anomaly detection methods can identify surface defects in industrial images by leveraging only normal samples for training. We propose a novel Progressive Boundary-guided Anomaly Synthesis (PBAS) strategy, which can directionally synthesize crucial feature-level anomalies without auxiliary textures. Our method achieves state-of-the-art performance and the fastest detection speed on three widely used industrial datasets.
arXiv Detail & Related papers (2024-12-23T10:26:26Z) - Autoregressive Speech Synthesis without Vector Quantization [135.4776759536272]
We present MELLE, a novel continuous-valued token based language modeling approach for text-to-speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from the text condition. MELLE mitigates robustness issues by avoiding the inherent flaws of sampling vector-quantized codes.
arXiv Detail & Related papers (2024-07-11T14:36:53Z) - Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and rethink how to improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomaly reconstruction.
arXiv Detail & Related papers (2024-07-05T15:44:53Z) - RealNet: A Feature Selection Network with Realistic Synthetic Anomaly
for Anomaly Detection [7.626097310990373]
We introduce RealNet, a feature reconstruction network with realistic synthetic anomaly and adaptive feature selection.
We develop Anomaly-aware Features Selection (AFS) and Reconstruction Residuals Selection (RRS).
Our results demonstrate significant improvements in both Image AUROC and Pixel AUROC compared to the current state-of-the-art methods.
arXiv Detail & Related papers (2024-03-09T12:25:01Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - You Only Train Once: A Unified Framework for Both Full-Reference and No-Reference Image Quality Assessment [45.62136459502005]
We propose a network to perform both full-reference (FR) and no-reference (NR) image quality assessment (IQA).
We first employ an encoder to extract multi-level features from input images.
A Hierarchical Attention (HA) module is proposed as a universal adapter for both FR and NR inputs.
A Semantic Distortion Aware (SDA) module is proposed to examine feature correlations between shallow and deep layers of the encoder.
arXiv Detail & Related papers (2023-10-14T11:03:04Z) - A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally have higher pixel-labeling quality without the effort of manual annotations.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z) - The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss that can achieve trend-level alignment with the SkewIoU loss.
Specifically, we model the objects as Gaussian distributions and adopt a Kalman filter to inherently mimic the mechanism of SkewIoU.
The resulting new loss, called KFIoU, is easier to implement and works better compared with exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
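The KFIoU entry above models rotated boxes as Gaussian distributions and uses a Kalman filter to mimic overlap. A minimal sketch of that idea follows, not the authors' implementation. It assumes a box `(cx, cy, w, h, theta)` maps to a covariance `R diag(w^2/4, h^2/4) R^T`, fuses two covariances with a Kalman-style update, and compares areas recovered as `4 * sqrt(det)`. Center offsets are ignored here (a separate term would have to handle them), and all function names are hypothetical.

```python
import math


def box_to_covariance(cx, cy, w, h, theta):
    """2x2 covariance of the Gaussian assumed for a rotated box."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    off = (a - b) * c * s
    return [[a * c * c + b * s * s, off],
            [off, a * s * s + b * c * c]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def matmul2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def kfiou(box1, box2):
    """Overlap score from covariances only (centers ignored in this sketch)."""
    s1 = box_to_covariance(*box1)
    s2 = box_to_covariance(*box2)
    total = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    gain = matmul2(s1, inv2(total))            # Kalman gain K = S1 (S1 + S2)^-1
    ks1 = matmul2(gain, s1)
    s3 = [[s1[i][j] - ks1[i][j] for j in range(2)] for i in range(2)]
    # Area recovered from a covariance: 4 * sqrt(det) equals w * h for one box.
    v1, v2, v3 = (4.0 * math.sqrt(det2(s)) for s in (s1, s2, s3))
    return v3 / (v1 + v2 - v3)


same = kfiou((0, 0, 4, 2, 0.3), (0, 0, 4, 2, 0.3))
crossed = kfiou((0, 0, 4, 2, 0.0), (0, 0, 4, 2, math.pi / 2))
print(same, crossed)
```

Under these assumptions, two identical boxes give the maximum score of 1/3 rather than 1, so the value tracks the trend of an IoU rather than its exact magnitude.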
This list is automatically generated from the titles and abstracts of the papers in this site.