Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly
Detection
- URL: http://arxiv.org/abs/2402.00448v1
- Date: Thu, 1 Feb 2024 09:32:39 GMT
- Title: Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly
Detection
- Authors: Liyi Yao, Shaobing Gao
- Abstract summary: Student-teacher networks (S-T) are favored in unsupervised anomaly detection.
However, vanilla S-T networks are not stable.
We propose a novel dual-student knowledge distillation architecture.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the data imbalance and the diversity of defects, student-teacher
networks (S-T) are favored in unsupervised anomaly detection, which explores
the discrepancy in feature representation derived from the knowledge
distillation process to recognize anomalies. However, the vanilla S-T network
is not stable: employing identical structures to construct the S-T network may
weaken the representative discrepancy on anomalies, while using different
structures can increase the likelihood of divergent performance on normal data.
To address this problem, we propose a novel dual-student knowledge distillation
(DSKD) architecture. Unlike other S-T networks, we use two student networks
and a single pre-trained teacher network, where the students have the same
scale but inverted structures. This framework can enhance the distillation
effect to improve the consistency in recognition of normal data, and
simultaneously introduce diversity for anomaly representation. To explore
high-dimensional semantic information to capture anomaly clues, we employ two
strategies. First, a pyramid matching mode is used to perform knowledge
distillation on multi-scale feature maps in the intermediate layers of
networks. Second, an interaction is facilitated between the two student
networks through a deep feature embedding module, which is inspired by
real-world group discussions. In terms of classification, we obtain pixel-wise
anomaly segmentation maps by measuring the discrepancy between the output
feature maps of the teacher and student networks, from which an anomaly score
is computed for sample-wise determination. We evaluate DSKD on three benchmark
datasets and probe the effects of internal modules through ablation
experiments. The results demonstrate that DSKD can achieve exceptional
performance on small models like ResNet18 and effectively improve vanilla S-T
networks.
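The pixel-wise anomaly map described above can be sketched generically: measure the per-pixel discrepancy between teacher and student feature maps at each pyramid scale, upsample, and accumulate. The following NumPy sketch is an illustration of that generic S-T scheme, not the paper's code; the function names and the choice of cosine distance are assumptions.

```python
import numpy as np

def anomaly_map(teacher_feats, student_feats, out_size):
    """Combine per-scale teacher-student discrepancies into one
    pixel-wise anomaly map (a minimal S-T sketch).

    teacher_feats / student_feats: lists of arrays shaped (C, H, W),
    one per pyramid level. out_size must be an integer multiple of
    each level's spatial size.
    """
    total = np.zeros(out_size, dtype=np.float64)
    for t, s in zip(teacher_feats, student_feats):
        # Per-pixel cosine distance between C-dim feature vectors.
        t_n = t / (np.linalg.norm(t, axis=0, keepdims=True) + 1e-8)
        s_n = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-8)
        dist = 1.0 - (t_n * s_n).sum(axis=0)          # (H, W), in [0, 2]
        # Nearest-neighbour upsample to the output resolution.
        rh = out_size[0] // dist.shape[0]
        rw = out_size[1] // dist.shape[1]
        total += np.kron(dist, np.ones((rh, rw)))
    return total

def anomaly_score(amap):
    # Sample-wise score: the strongest pixel response in the map.
    return float(amap.max())
```

When teacher and student agree everywhere, the map is (near) zero; a pixel where the student's features diverge from the teacher's lights up at that location after upsampling. The same per-scale distance, averaged over pixels instead of accumulated, would also serve as a multi-scale distillation objective during training.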
Related papers
- Memoryless Multimodal Anomaly Detection via Student-Teacher Network and Signed Distance Learning
A novel memoryless method MDSS is proposed for multimodal anomaly detection.
It employs a lightweight student-teacher network and a signed distance function to learn from RGB images and 3D point clouds, respectively.
The experimental results indicate that MDSS is comparable to, but more stable than, the SOTA memory-bank-based method Shape-guided.
arXiv Detail & Related papers (2024-09-09T07:18:09Z)
- Dual-Modeling Decouple Distillation for Unsupervised Anomaly Detection
Over-generalization of the student network toward the teacher network may lead to negligible representation differences on anomalies.
Existing methods address the possible over-generalization by using differentiated students and teachers from the structural perspective.
We propose Dual-Modeling Decouple Distillation (DMDD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-08-07T16:39:16Z)
- Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection
Knowledge-distillation-based multi-class anomaly detection promises low latency with reasonably good performance, but with a significant drop compared to the one-class version.
We propose a DCAM (Distributed Convolutional Attention Module) which improves the distillation process between teacher and student networks.
arXiv Detail & Related papers (2024-05-10T13:25:39Z)
- Large Language Model Guided Knowledge Distillation for Time Series Anomaly Detection
AnomalyLLM demonstrates state-of-the-art performance on 15 datasets, improving accuracy by at least 14.5% on the UCR dataset.
arXiv Detail & Related papers (2024-01-26T09:51:07Z)
- ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection
Knowledge Distillation-based Anomaly Detection (KDAD) methods rely on the teacher-student paradigm to detect and segment anomalous regions.
We propose an innovative approach called Asymmetric Distillation Post-Segmentation (ADPS).
Our ADPS employs an asymmetric distillation paradigm that takes distinct forms of the same image as the input of the teacher-student networks.
We show that ADPS significantly improves the Average Precision (AP) metric by 9% and 20% on the MVTec AD and KolektorSDD2 datasets, respectively.
arXiv Detail & Related papers (2022-10-19T12:04:47Z)
- Asymmetric Student-Teacher Networks for Industrial Anomaly Detection
This work discovers previously unknown problems of student-teacher approaches for anomaly detection.
Two neural networks are trained to produce the same output for the defect-free training examples.
Our method produces state-of-the-art results on the two currently most relevant defect detection datasets, MVTec AD and MVTec 3D-AD.
arXiv Detail & Related papers (2022-10-14T13:56:50Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Understanding Self-supervised Learning with Dual Deep Networks
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.