TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML
scenarios
- URL: http://arxiv.org/abs/2111.15432v1
- Date: Tue, 30 Nov 2021 14:24:27 GMT
- Title: TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML
scenarios
- Authors: Tommaso Barbariol and Gian Antonio Susto
- Abstract summary: Isolation Forest is a popular algorithm able to define an anomaly score by means of an ensemble of peculiar trees called isolation trees.
We show that the standard algorithm might be improved in terms of memory requirements, latency and performance.
We propose TiWS-iForest, an approach that, by leveraging weak supervision, is able to reduce Isolation Forest complexity and to enhance detection performance.
- Score: 2.7285752469525315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unsupervised anomaly detection tackles the problem of finding anomalies
inside datasets when labels are unavailable; since data tagging is
typically hard or expensive to obtain, such approaches have seen huge
applicability in recent years. In this context, Isolation Forest is a popular
algorithm able to define an anomaly score by means of an ensemble of peculiar
trees called isolation trees. These are built using a random partitioning
procedure that is extremely fast and cheap to train. However, we find that the
standard algorithm might be improved in terms of memory requirements, latency
and performance; this is of particular importance in low-resource scenarios
and in TinyML implementations on ultra-constrained microprocessors. Moreover,
Anomaly Detection approaches currently do not take advantage of weak
supervision: since anomaly detectors are typically used within Decision Support Systems, feedback
from the users, even if rare, can be a valuable source of information that is
currently unexplored. Besides showing iForest training limitations, we propose
here TiWS-iForest, an approach that, by leveraging weak supervision, is able to
reduce Isolation Forest complexity and to enhance detection performance. We
show the effectiveness of TiWS-iForest on real-world datasets and we share the
code in a public repository to enhance reproducibility.
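The ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the isolation trees follow the generic random-partitioning scheme, and the weak-supervision step is one assumed, hypothetical way to exploit a few labels (rank each tree by how much deeper it places labeled normal points than labeled anomalies, then keep only the best trees). All function names, the synthetic data, and the parameters `n_trees`, `sample_size` and `keep` are illustrative assumptions.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Random partitioning: pick a random feature, then a random threshold in its range."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    thr = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < thr]
    right = [p for p in points if p[dim] >= thr]
    return {"dim": dim, "thr": thr,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(tree, x, depth=0):
    """Depth at which x lands, plus a correction for unsplit points in the leaf."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["thr"] else tree["right"]
    return path_length(child, x, depth + 1)

def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    """Train each tree on a random subsample of the unlabeled data."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, psi

def score(trees, psi, x):
    """Anomaly score in (0, 1); shorter average paths give higher scores."""
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(psi))

def select_trees(trees, labeled, keep):
    """Hypothetical weak-supervision step: keep the trees that best separate
    the few labeled normal points (label 0) from labeled anomalies (label 1)."""
    def gap(tree):
        normal = [path_length(tree, x) for x, y in labeled if y == 0]
        anomalous = [path_length(tree, x) for x, y in labeled if y == 1]
        return sum(normal) / len(normal) - sum(anomalous) / len(anomalous)
    return sorted(trees, key=gap, reverse=True)[:keep]

# Synthetic unlabeled data: a dense normal cluster plus a few distant anomalies.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(256)]
data += [(data_rng.gauss(6, 0.5), data_rng.gauss(6, 0.5)) for _ in range(8)]

# A handful of user-provided labels (the "weak supervision").
labeled = [((0.0, 0.0), 0), ((0.5, -0.5), 0), ((6.0, 6.0), 1), ((5.0, 7.0), 1)]

forest, psi = fit_forest(data)
pruned = select_trees(forest, labeled, keep=20)

for name, model in (("full", forest), ("pruned", pruned)):
    print(name, len(model), "trees:",
          "anomaly", round(score(model, psi, (6.5, 5.5)), 3),
          "normal", round(score(model, psi, (0.1, 0.2)), 3))
```

In this sketch the pruned forest stores and traverses 20 trees instead of 100, which is the kind of memory and latency reduction the abstract refers to; whether detection quality improves depends on the data and on the selection criterion, which is assumed here.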
Related papers
- CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset [14.246172794156987]
CableInspect-AD is a high-quality dataset created and annotated by domain experts from Hydro-Québec, a Canadian public utility.
This dataset includes high-resolution images with challenging real-world anomalies, covering defects with varying severity levels.
We present a comprehensive evaluation protocol based on cross-validation to assess models' performances.
arXiv Detail & Related papers (2024-09-30T14:50:13Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
Motivated by increasing privacy concerns, we propose a parameter-efficient Federated Anomaly Detection framework named PeFAD.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- SoftPatch: Unsupervised Anomaly Detection with Noisy Data [67.38948127630644]
This paper considers label-level noise in image sensory anomaly detection for the first time.
We propose a memory-based unsupervised AD method, SoftPatch, which efficiently denoises the data at the patch level.
Compared with existing methods, SoftPatch maintains a strong modeling ability of normal data and alleviates the overconfidence problem in coreset.
arXiv Detail & Related papers (2024-03-21T08:49:34Z)
- Hard Nominal Example-aware Template Mutual Matching for Industrial Anomaly Detection [74.9262846410559]
Hard Nominal Example-aware Template Mutual Matching (HETMM)
HETMM aims to construct a robust prototype-based decision boundary, which can precisely distinguish between hard-nominal examples and anomalies.
arXiv Detail & Related papers (2023-03-28T17:54:56Z)
- Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to neglect of utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z)
- Active Learning-based Isolation Forest (ALIF): Enhancing Anomaly Detection in Decision Support Systems [2.922007656878633]
ALIF is a lightweight modification of the popular Isolation Forest that showed superior performance compared with other state-of-the-art algorithms.
The proposed approach is particularly appealing in the presence of a Decision Support System (DSS), a case that is increasingly popular in real-world scenarios.
arXiv Detail & Related papers (2022-07-08T14:36:38Z)
- Deep Isolation Forest for Anomaly Detection [16.581154394513025]
Isolation forest (iForest) has been emerging as arguably the most popular anomaly detector in recent years.
Our model achieves significant improvement over state-of-the-art isolation-based methods and deep detectors on datasets.
arXiv Detail & Related papers (2022-06-14T05:47:07Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams [6.177270420667713]
Anomaly detection at scale is an extremely challenging problem of great practicality.
Recent work has coalesced on variations of (random) $k$d-trees to summarise data for anomaly detection.
These methods rely on ad-hoc score functions that are not easy to interpret.
We contextualise these methods in a probabilistic framework which we call the Mondrian Pólya Forest.
arXiv Detail & Related papers (2020-08-04T13:19:07Z)
- Interpretable Anomaly Detection with DIFFI: Depth-based Isolation Forest Feature Importance [4.769747792846005]
Anomaly Detection is an unsupervised learning task aimed at detecting anomalous behaviours with respect to historical data.
The Isolation Forest is one of the most commonly adopted algorithms in the field of Anomaly Detection.
This paper proposes methods to define feature importance scores at both global and local level for the Isolation Forest.
arXiv Detail & Related papers (2020-07-21T22:19:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.