Related papers: AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning

URL: http://arxiv.org/abs/2505.03509v1
Date: Tue, 06 May 2025 13:19:15 GMT
Title: AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning
Authors: Pablo Gómez, David O'Ryan,
Abstract summary: AnomalyMatch is an anomaly detection framework combining the semi-supervised FixMatch algorithm with active learning. AnomalyMatch is tailored for large-scale applications, efficiently processing predictions for 100 million images within three days on a single GPU.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Anomaly detection in large datasets is essential in fields such as astronomy and computer vision; however, supervised methods typically require extensive anomaly labelling, which is often impractical. We present AnomalyMatch, an anomaly detection framework combining the semi-supervised FixMatch algorithm using EfficientNet classifiers with active learning. By treating anomaly detection as a semi-supervised binary classification problem, we efficiently utilise limited labelled and abundant unlabelled images. We allow iterative model refinement in a user interface for expert verification of high-confidence anomalies and correction of false positives. Built for astronomical data, AnomalyMatch generalises readily to other domains facing similar data challenges. Evaluations on the GalaxyMNIST astronomical dataset and the miniImageNet natural-image benchmark under severe class imbalance (1% anomalies for miniImageNet) display strong performance: starting from five to ten labelled anomalies and after three active learning cycles, we achieve an average AUROC of 0.95 (miniImageNet) and 0.86 (GalaxyMNIST), with respective AUPRC of 0.77 and 0.71. After active learning cycles, anomalies are ranked with 71% (miniImageNet) to 93% precision in the 1% of the highest-ranked images. AnomalyMatch is tailored for large-scale applications, efficiently processing predictions for 100 million images within three days on a single GPU. Integrated into ESA's Datalabs platform, AnomalyMatch facilitates targeted discovery of scientifically valuable anomalies in vast astronomical datasets. Our results underscore the exceptional utility and scalability of this approach for anomaly discovery, highlighting the value of specialised approaches for domains characterised by severe label scarcity.
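
Two figures in the abstract can be made concrete with a little arithmetic: the precision among the top 1% of ranked images, and the sustained throughput implied by 100 million images in three days. The sketch below is illustrative only and is not the authors' evaluation code; the function names `precision_at_top_fraction` and `required_throughput` and the toy scores and labels are assumptions made for this example.

```python
def precision_at_top_fraction(scores, labels, fraction=0.01):
    """Share of true anomalies (label 1) among the highest-scored `fraction` of items."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Number of items in the top fraction, at least one.
    k = max(1, round(len(scores) * fraction))
    # Rank items by anomaly score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def required_throughput(n_images, days):
    """Average images per second needed to process `n_images` within `days`."""
    return n_images / (days * 24 * 60 * 60)


# Toy data (hypothetical): 200 items whose score grows with their index.
scores = [i / 200 for i in range(200)]
labels = [0] * 200
labels[199] = 1  # the top-ranked item is a true anomaly
labels[100] = 1  # a second anomaly ranked outside the top 1%

# The top 1% of 200 items is 2 items, of which one is an anomaly.
toy_precision = precision_at_top_fraction(scores, labels)

# 100 million images in three days is roughly 386 images per second.
throughput = required_throughput(100_000_000, 3)
print(toy_precision, round(throughput, 1))
```

On real data, `scores` would be the classifier's anomaly scores and `labels` the ground-truth annotations; the throughput figure is an average rate and says nothing about how the work is batched.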

Related papers

ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly. Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset. Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
Self-supervised learning for classifying paranasal anomalies in the maxillary sinus [31.45131665942058]
Self-supervised learning can be used to learn representations from unlabelled data. There are no SSL methods designed for the downstream task of classifying paranasal anomalies in the maxillary sinus. Our approach uses a 3D Convolutional Autoencoder trained in an unsupervised anomaly detection framework.
arXiv Detail & Related papers (2024-04-29T11:14:11Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
UniFormaly: Towards Task-Agnostic Unified Framework for Visual Anomaly Detection [6.260747047974035]
We present UniFormaly, a universal and powerful anomaly detection framework. We emphasize the necessity of our off-the-shelf approach by pointing out a suboptimal issue in online encoder-based methods. UniFormaly achieves outstanding results on various tasks and datasets.
arXiv Detail & Related papers (2023-07-24T06:04:12Z)
One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations [59.17685450892182]
We investigate the behavior of deep representations in widely used CNN models under extreme data scarcity for One-Shot periocular recognition. We improved state-of-the-art results that made use of networks trained with biometric datasets with millions of images. Traditional algorithms like SIFT can outperform CNNs in situations with limited data.
arXiv Detail & Related papers (2023-07-11T09:10:16Z)
EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies [1.1602089225841632]
We propose a lightweight feature extractor that processes an image in less than a millisecond on a modern GPU. We then use a student-teacher approach to detect anomalous features. We evaluate our method, called EfficientAD, on 32 datasets from three industrial anomaly detection dataset collections.
arXiv Detail & Related papers (2023-03-25T18:48:33Z)
Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method DeepAstroUDA. It can be applied to datasets with different types of class overlap. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z)
From Unsupervised to Few-shot Graph Anomaly Detection: A Multi-scale Contrastive Learning Approach [26.973056364587766]
Anomaly detection from graph data is an important data mining task in many applications such as social networks, finance, and e-commerce. We propose a novel framework, graph ANomaly dEtection framework with Multi-scale cONtrastive lEarning (ANEMONE in short). By using a graph neural network as a backbone to encode the information from multiple graph scales (views), we learn better representation for nodes in a graph.
arXiv Detail & Related papers (2022-02-11T09:45:11Z)
Multi-Perspective Anomaly Detection [3.3511723893430476]
We build upon the deep support vector data description algorithm and address multi-perspective anomaly detection. We employ different augmentation techniques with a denoising process to deal with scarce one-class data. We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset.
arXiv Detail & Related papers (2021-05-20T17:07:36Z)
TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs). To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics. To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset. Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data. We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)