Interpretable Anomaly Detection with DIFFI: Depth-based Isolation Forest Feature Importance
- URL: http://arxiv.org/abs/2007.11117v2
- Date: Tue, 13 Jul 2021 13:15:08 GMT
- Title: Interpretable Anomaly Detection with DIFFI: Depth-based Isolation Forest Feature Importance
- Authors: Mattia Carletti, Matteo Terzi, Gian Antonio Susto
- Abstract summary: Anomaly Detection is an unsupervised learning task aimed at detecting anomalous behaviours with respect to historical data.
The Isolation Forest is one of the most commonly adopted algorithms in the field of Anomaly Detection.
This paper proposes methods to define feature importance scores at both global and local level for the Isolation Forest.
- Score: 4.769747792846005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly Detection is an unsupervised learning task aimed at detecting
anomalous behaviours with respect to historical data. In particular,
multivariate Anomaly Detection has an important role in many applications
thanks to the capability of summarizing the status of a complex system or
observed phenomenon with a single indicator (typically called 'Anomaly Score')
and thanks to the unsupervised nature of the task that does not require human
tagging. The Isolation Forest is one of the most commonly adopted algorithms in
the field of Anomaly Detection, due to its proven effectiveness and low
computational complexity. A major problem affecting Isolation Forest is
represented by the lack of interpretability, an effect of the inherent
randomness governing the splits performed by the Isolation Trees, the building
blocks of the Isolation Forest. In this paper we propose effective, yet
computationally inexpensive, methods to define feature importance scores at
both global and local level for the Isolation Forest. Moreover, we define a
procedure to perform unsupervised feature selection for Anomaly Detection
problems based on our interpretability method; such procedure also serves the
purpose of tackling the challenging task of feature importance evaluation in
unsupervised anomaly detection. We assess the performance on several synthetic
and real-world datasets, including comparisons against state-of-the-art
interpretability techniques, and make the code publicly available to enhance
reproducibility and foster research in the field.
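
For readers unfamiliar with the Isolation Forest mentioned in the abstract, the following is a minimal sketch of the standard isolation-tree construction and anomaly score, using only the Python standard library. It is not the DIFFI feature importance method and not the authors' released code. All function names (`c_factor`, `build_tree`, `path_length`, `build_forest`, `anomaly_score`), the defaults of 100 trees and a subsample of 256, and the toy data are assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    over n points; used to normalise path lengths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    lo, hi = min(values), max(values)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    threshold = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Number of splits x passes through, plus a correction for the
    unsplit points remaining in the leaf it reaches."""
    depth = 0
    while "feature" in tree:
        go_left = x[tree["feature"]] < tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
        depth += 1
    return depth + c_factor(tree["size"])


def build_forest(data, n_trees=100, sample_size=256, seed=0):
    """Grow n_trees isolation trees, each on a random subsample."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(trees, sample_size, x):
    """Score in (0, 1): shorter average paths give scores closer to 1."""
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Toy data (hypothetical): 200 Gaussian inliers plus one distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

trees, n = build_forest(data)
outlier_score = anomaly_score(trees, n, (8.0, 8.0))
inlier_score = anomaly_score(trees, n, (0.0, 0.0))
print("outlier:", outlier_score, "inlier:", inlier_score)
```

Because every split picks its feature and threshold at random, the score alone does not say which features made a point easy to isolate. That is the interpretability gap the abstract describes.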