Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image
Anomaly Detection
- URL: http://arxiv.org/abs/2308.02983v1
- Date: Sun, 6 Aug 2023 01:30:26 GMT
- Title: Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image
Anomaly Detection
- Authors: Xincheng Yao and Ruoqi Li and Zefeng Qian and Yan Luo and Chongyang
Zhang
- Abstract summary: This paper proposes FOcus-the-Discrepancy (FOD), a novel anomaly detection (AD) framework that can simultaneously spot the patch-wise, intra- and inter-discrepancies of anomalies.
- Score: 13.801572236048601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans recognize anomalies through two aspects: larger patch-wise
representation discrepancies and weaker patch-to-normal-patch correlations.
However, previous AD methods did not sufficiently combine these two
complementary aspects when designing AD models. To this end, we find that the
Transformer can ideally satisfy both aspects, owing to its great power in the
unified modeling of patch-wise representations and patch-to-patch correlations.
In this paper, we propose a novel AD framework: FOcus-the-Discrepancy (FOD),
which can simultaneously spot the patch-wise, intra- and inter-discrepancies of
anomalies. The major characteristic of our method is that we renovate the
self-attention maps in transformers to Intra-Inter-Correlation (I2Correlation).
The I2Correlation contains a two-branch structure that first explicitly establishes
intra- and inter-image correlations, and then fuses the features of the two branches
to spotlight the abnormal patterns. To learn the intra- and inter-correlations
adaptively, we propose the RBF-kernel-based target-correlations as learning
targets for self-supervised learning. Besides, we introduce an entropy
constraint strategy to solve the mode collapse issue in optimization and
further amplify the normal-abnormal distinguishability. Extensive experiments
on three unsupervised real-world AD benchmarks show the superior performance of
our approach. Code will be available at https://github.com/xcyao00/FOD.
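The abstract names RBF-kernel-based target correlations and an entropy constraint but gives no formulas, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes the target is a row-normalised RBF kernel over patch grid coordinates, that the learned correlation is an ordinary softmax attention map, and that the two are compared with a KL divergence; the function names, the 4x4 grid, and `sigma` are all hypothetical.

```python
import numpy as np

def rbf_target_correlation(points, sigma=1.0):
    """Row-normalised RBF kernel over pairwise squared distances (assumed form)."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel / kernel.sum(axis=1, keepdims=True)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl_rows(p, q, eps=1e-12):
    """Mean KL(p || q), treating each row as a distribution."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))

def row_entropy(p, eps=1e-12):
    """Shannon entropy of each row; one quantity an entropy constraint could act on."""
    return -np.sum(p * np.log(p + eps), axis=1)

# Hypothetical 4x4 grid of patches, described by their (row, col) coordinates.
ys, xs = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
target = rbf_target_correlation(coords, sigma=1.5)

# Stand-in for a learned attention map: softmax over random query-key scores.
rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))
k = rng.normal(size=(16, 8))
attention = softmax(q @ k.T / np.sqrt(8))

# Discrepancy between the learned correlation and the RBF target.
loss = kl_rows(target, attention)
entropy = row_entropy(attention)
```

In a sketch like this, a patch whose learned correlation row stays far from its target row would receive a larger discrepancy; how FOD actually defines and combines these terms is specified in the paper and repository, not here.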
Related papers
- Relation DETR: Exploring Explicit Position Relation Prior for Object Detection [26.03892270020559]
We present a scheme for enhancing the convergence and performance of DETR (DEtection TRansformer).
Our approach, termed Relation-DETR, introduces an encoder to construct position relation embeddings for progressive attention refinement.
Experiments on both generic and task-specific datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-07-16T13:17:07Z)
- Cross-Modal Learning for Anomaly Detection in Complex Industrial Process: Methodology and Benchmark [19.376814754500625]
Anomaly detection in complex industrial processes plays a pivotal role in ensuring efficient, stable, and secure operation.
This paper proposes a cross-modal Transformer to facilitate anomaly detection by exploring the correlation between visual features (video) and process variables (current) in the context of the fused magnesium smelting process.
We present a pioneering cross-modal benchmark of the fused magnesium smelting process, featuring synchronously acquired video and current data for over 2.2 million samples.
arXiv Detail & Related papers (2024-06-13T11:40:06Z)
- Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion [3.868072865207522]
Image-based rigid 2D/3D registration is a critical technique for fluoroscopic guided surgical interventions.
We propose a novel fully differentiable correlation-driven network using a dual-branch CNN-transformer encoder.
A correlation-driven loss is proposed for low-frequency feature and high-frequency feature decomposition based on embedded information.
arXiv Detail & Related papers (2024-02-04T14:12:51Z)
- Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Our approach, tuned on 4-class ProGAN data, attains an average of 98% accuracy on unseen GANs, and surprisingly generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z)
- Fourier Test-time Adaptation with Multi-level Consistency for Robust Classification [10.291631977766672]
We propose a novel approach called Fourier Test-time Adaptation (FTTA) to integrate input and model tuning.
FTTA builds a reliable multi-level consistency measurement of paired inputs to achieve self-supervision of predictions.
It was extensively validated on three large classification datasets with different modalities and organs.
arXiv Detail & Related papers (2023-06-05T02:29:38Z)
- Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation [51.10389829070684]
Domain gap can cause discrepancies in self-attention.
Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain.
We propose adaptation on attention maps with cross-domain attention layers.
arXiv Detail & Related papers (2022-11-27T02:40:33Z)
- Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection [81.07378219410182]
We propose a new and general framework for domain adaptive object detection (DAOD), named Foreground-aware Graph-based Relational Reasoning (FGRR).
FGRR incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations.
Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art on four DAOD benchmarks.
arXiv Detail & Related papers (2022-06-06T05:12:48Z)
- Short Range Correlation Transformer for Occluded Person Re-Identification [4.339510167603376]
We propose a partial feature transformer-based person re-identification framework named PFT.
The proposed PFT utilizes three modules to enhance the efficiency of vision transformer.
Experimental results over occluded and holistic re-identification datasets demonstrate that the proposed PFT network achieves superior performance consistently.
arXiv Detail & Related papers (2022-01-04T11:12:39Z)
- Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)
- Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel light-weight module, the Attentive WaveBlock (AWB).
AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels.
Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)