Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types
- URL: http://arxiv.org/abs/2112.11573v1
- Date: Tue, 21 Dec 2021 23:11:33 GMT
- Title: Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types
- Authors: Kihyuk Sohn, Jinsung Yoon, Chun-Liang Li, Chen-Yu Lee, Tomas Pfister
- Abstract summary: We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types.
This is different from anomaly detection, whose goal is to divide anomalies from normal data.
We present a simple yet effective clustering framework using patch-based pretrained deep embeddings and off-the-shelf clustering methods.
- Score: 60.45942774425782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce anomaly clustering, whose goal is to group data into
semantically coherent clusters of anomaly types. This is different from anomaly
detection, whose goal is to divide anomalies from normal data. Unlike
object-centered image clustering applications, anomaly clustering is
particularly challenging as anomalous patterns are subtle and local. We present
a simple yet effective clustering framework using patch-based pretrained deep
embeddings and off-the-shelf clustering methods. We define a distance function
between images, each of which is represented as a bag of embeddings, by the
Euclidean distance between weighted averaged embeddings. The weight defines the
importance of instances (i.e., patch embeddings) in the bag, which may
highlight defective regions. We compute weights in an unsupervised way or in a
semi-supervised way if labeled normal data is available. Extensive experimental
studies show the effectiveness of the proposed clustering framework, along with
a novel distance function, over existing multiple instance or deep clustering
frameworks. Overall, our framework achieves 0.451 and 0.674 normalized mutual
information scores on MVTec object and texture categories and further improves
with a few labeled normal data (0.577, 0.669), far exceeding the baselines
(0.244, 0.273) or state-of-the-art deep clustering methods (0.176, 0.277).
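The distance function described in the abstract can be illustrated with a minimal NumPy sketch. It treats each image as a bag of patch embeddings, takes a weighted average of the bag, and measures the Euclidean distance between the two averages. The function names, the toy 2-D embeddings, and the hand-picked weights are assumptions for illustration only; the abstract does not specify how the weights are computed, so they are supplied directly here.

```python
import numpy as np


def weighted_average_embedding(patches, weights):
    """Collapse a bag of patch embeddings into one vector.

    patches: array of shape (num_patches, dim)
    weights: non-negative array of shape (num_patches,); normalized to sum to 1
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(patches, dtype=float)


def bag_distance(patches_a, weights_a, patches_b, weights_b):
    """Euclidean distance between the weighted averaged embeddings of two bags."""
    za = weighted_average_embedding(patches_a, weights_a)
    zb = weighted_average_embedding(patches_b, weights_b)
    return float(np.linalg.norm(za - zb))


# Toy bags (hypothetical values): the first patch of each image looks "normal",
# the second patch differs between the two images.
patches_a = np.array([[0.0, 0.0], [2.0, 0.0]])
patches_b = np.array([[0.0, 0.0], [0.0, 4.0]])

# Uniform weights: every patch counts equally.
uniform = np.array([1.0, 1.0])
d_uniform = bag_distance(patches_a, uniform, patches_b, uniform)

# Hand-picked weights (an assumption) that put all mass on the second patch.
focused = np.array([0.0, 1.0])
d_focused = bag_distance(patches_a, focused, patches_b, focused)

print(d_uniform, d_focused)
```

In this toy case, concentrating the weight on the patches that differ increases the distance between the two images compared with uniform weighting, which matches the abstract's remark that the weights may highlight defective regions. The resulting pairwise distances could then be passed to an off-the-shelf clustering method.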
Related papers
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A Computational Theory and Semi-Supervised Algorithm for Clustering [0.0]
A semi-supervised clustering algorithm is presented.
The kernel of the clustering method is Mohammad's anomaly detection algorithm.
Results are presented on synthetic and real-world data sets.
arXiv Detail & Related papers (2023-06-12T09:15:58Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Contrastive Hierarchical Clustering [8.068701201341065]
CoHiClust is a Contrastive Hierarchical Clustering model based on deep neural networks.
By employing a self-supervised learning approach, CoHiClust distills the base network into a binary tree without access to any labeled data.
arXiv Detail & Related papers (2023-03-03T07:54:19Z)
- Inv-SENnet: Invariant Self Expression Network for clustering under biased data [17.25929452126843]
We propose a novel framework for jointly removing unwanted attributes (biases) while learning to cluster data points in individual subspaces.
Our experimental results on synthetic and real-world datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-11-13T01:19:06Z)
- On Mitigating Hard Clusters for Face Clustering [48.39472979642971]
Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images.
We introduce two novel modules, Neighborhood-Diffusion-based Density (NDDe) and Transition-Probability-based Distance (TPDi).
Our experiments on multiple benchmarks show that each module contributes to the final performance of our method.
arXiv Detail & Related papers (2022-07-25T03:55:15Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Spectral Clustering with Smooth Tiny Clusters [14.483043753721256]
We propose a novel clustering algorithm, which considers the smoothness of data for the first time.
Our key idea is to cluster tiny clusters, whose centers constitute smooth graphs.
Although this paper focuses solely on multi-scale situations, the idea of data smoothness can certainly be extended to any clustering algorithm.
arXiv Detail & Related papers (2020-09-10T05:21:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.