Distribution and volume based scoring for Isolation Forests
- URL: http://arxiv.org/abs/2309.11450v1
- Date: Wed, 20 Sep 2023 16:27:10 GMT
- Title: Distribution and volume based scoring for Isolation Forests
- Authors: Hichem Dhouib, Alissa Wilms, Paul Boes
- Abstract summary: We make two contributions to the Isolation Forest method for anomaly and outlier detection.
The first is an information-theoretically motivated generalisation of the score function that is used to aggregate the scores across random tree estimators.
The second is an alternative scoring function at the level of the individual tree estimator, in which we replace the depth-based scoring of the Isolation Forest with one based on hyper-volumes associated to an isolation tree's leaf nodes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We make two contributions to the Isolation Forest method for anomaly and
outlier detection. The first contribution is an information-theoretically
motivated generalisation of the score function that is used to aggregate the
scores across random tree estimators. This generalisation allows one to take
into account not just the ensemble average across trees but instead the whole
distribution. The second contribution is an alternative scoring function at the
level of the individual tree estimator, in which we replace the depth-based
scoring of the Isolation Forest with one based on hyper-volumes associated to
an isolation tree's leaf nodes.
We motivate the use of both of these methods on generated data and also
evaluate them on 34 datasets from the recent and exhaustive "ADBench"
benchmark, finding significant improvement over the standard isolation forest
for both variants on some datasets and improvement on average across all
datasets for one of the two variants. The code to reproduce our results is made
available as part of the submission.
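For intuition, the leaf-volume idea can be sketched as a toy implementation (illustrative only; the tree construction, fixed bounds, depth cap, and mean aggregation below are simplifying assumptions, not the authors' code). A point that lands in a large, sparsely populated cell receives a large volume and hence a high anomaly score:

```python
import random

def build_tree(points, bounds, depth, max_depth):
    """Recursively split a cell with random axis-aligned cuts."""
    if depth >= max_depth or len(points) <= 1:
        vol = 1.0
        for lo, hi in bounds:  # leaf hyper-volume = product of side lengths
            vol *= hi - lo
        return {"leaf": True, "volume": vol}
    d = random.randrange(len(bounds))
    lo, hi = bounds[d]
    split = random.uniform(lo, hi)
    lb, rb = list(bounds), list(bounds)
    lb[d], rb[d] = (lo, split), (split, hi)
    return {
        "leaf": False, "dim": d, "split": split,
        "left": build_tree([p for p in points if p[d] < split], lb, depth + 1, max_depth),
        "right": build_tree([p for p in points if p[d] >= split], rb, depth + 1, max_depth),
    }

def leaf_volume(tree, x):
    """Volume of the leaf cell that x falls into."""
    while not tree["leaf"]:
        tree = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return tree["volume"]

def volume_score(forest, x):
    # Aggregation by the ensemble mean; the paper's first contribution
    # generalises this step to use the whole distribution of per-tree scores.
    return sum(leaf_volume(t, x) for t in forest) / len(forest)

random.seed(0)
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
bounds = [(-10.0, 10.0), (-10.0, 10.0)]
forest = [build_tree(data, bounds, 0, 10) for _ in range(100)]
s_inlier = volume_score(forest, (0.0, 0.0))
s_outlier = volume_score(forest, (8.0, 8.0))
```

Because the empty space around the Gaussian cluster is carved off early, the outlier's leaf cell is typically much larger than an inlier's, so on average across trees its volume score is higher.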
Related papers
- Learning Order Forest for Qualitative-Attribute Data Clustering [52.612779710298526]
This paper discovers a tree-like distance structure to flexibly represent the local order relationship among intra-attribute qualitative values.
A joint learning mechanism is proposed to iteratively obtain more appropriate tree structures and clusters.
Experiments demonstrate that the joint learning adapts the forest to the clustering task to yield accurate results.
arXiv Detail & Related papers (2026-03-03T07:49:50Z) - TreeGrad-Ranker: Feature Ranking via $O(L)$-Time Gradients for Decision Trees [73.0940890296463]
Probabilistic values are used to rank features for explaining local predicted values of decision trees.
TreeGrad computes the gradients of the multilinear extension of the joint objective in $O(L)$ time for decision trees with $L$ leaves.
TreeGrad-Ranker aggregates the gradients while optimizing the joint objective to produce feature rankings.
TreeGrad-Shap is a numerically stable algorithm for computing Beta Shapley values with integral parameters.
arXiv Detail & Related papers (2026-02-12T06:17:12Z) - Partition Trees: Conditional Density Estimation over General Outcome Spaces [46.1988967916659]
We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces.
Our approach models conditional distributions as piecewise-constant densities on data adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood.
arXiv Detail & Related papers (2026-02-03T22:12:30Z) - Jacobian Aligned Random Forests [0.0]
We present Jacobian-Aligned Random Forests (JARF) as an alternative to oblique forests for axis-aligned decision trees.
We show that JARF consistently improves axis-aligned forests and often matches or surpasses baselines while improving training time.
Our experimental results and theoretical analysis together indicate that supervised preconditioning can recover much of the accuracy of oblique forests while retaining the simplicity and robustness of axis-aligned trees.
arXiv Detail & Related papers (2025-12-09T07:08:04Z) - TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling [65.46347858249295]
TreePO is a self-guided rollout algorithm that views sequence generation as a tree-structured searching process.
TreePO essentially reduces the per-update compute burden while preserving or enhancing exploration diversity.
arXiv Detail & Related papers (2025-08-24T16:52:37Z) - Diversity Conscious Refined Random Forest [0.0]
Random Forest (RF) is a widely used ensemble learning technique.
RF often relies on hundreds of trees and all input features, leading to high cost and model redundancy.
We propose a Refined Random Forest that grows trees only on informative features and enforces maximal diversity by clustering and retaining uncorrelated trees.
arXiv Detail & Related papers (2025-07-01T06:28:15Z) - TreeRPO: Tree Relative Policy Optimization [55.97385410074841]
TreeRPO is a novel method that estimates the mathematical expectations of rewards at various reasoning steps using tree sampling.
Building on the group-relative reward training mechanism of GRPO, TreeRPO innovatively computes rewards based on step-level groups generated during tree sampling.
arXiv Detail & Related papers (2025-06-05T15:56:38Z) - SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation [75.56845750400116]
Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems.
We develop SureMap, which attains high estimation accuracy for both multi-task and single-task disaggregated evaluations of black-box models.
Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior with cross-validation-free tuning via Stein's unbiased risk estimate (SURE).
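The shrinkage principle behind this kind of estimator can be illustrated with a minimal sketch (not the SureMap algorithm itself; the function name, example numbers, and Gaussian-model weights below are hypothetical). Subgroup means with few samples are pulled harder toward the pooled mean:

```python
import statistics

def shrink_toward_pooled(group_means, group_sizes, noise_var):
    """Shrink per-subgroup mean estimates toward the pooled mean.

    Small subgroups (high-variance estimates) receive more shrinkage;
    the weights come from a simple Gaussian hierarchical model.
    """
    pooled = statistics.fmean(group_means)
    between = statistics.variance(group_means)   # spread across subgroups
    shrunk = []
    for m, n in zip(group_means, group_sizes):
        within = noise_var / n                   # variance of this subgroup's mean
        w = between / (between + within)         # weight on the raw subgroup mean
        shrunk.append(w * m + (1 - w) * pooled)
    return shrunk

# Hypothetical per-subgroup accuracies and sample sizes:
est = shrink_toward_pooled([0.90, 0.70, 0.95], [500, 10, 400], noise_var=0.25)
```

The tiny 10-sample subgroup is shrunk substantially toward the pooled mean, while the large subgroups barely move; SureMap additionally tunes such weights via SURE rather than fixing them by hand.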
arXiv Detail & Related papers (2024-11-14T17:53:35Z) - Heterogeneous Random Forest [2.0646127669654835]
Heterogeneous Random Forest (HRF) is designed to enhance tree diversity in a meaningful way.
HRF consistently outperformed other ensemble methods in terms of accuracy across the majority of datasets.
arXiv Detail & Related papers (2024-10-24T09:18:55Z) - Ensembles of Probabilistic Regression Trees [46.53457774230618]
Tree-based ensemble methods have been successfully used for regression problems in many applications and research studies.
We study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect to a probability distribution.
arXiv Detail & Related papers (2024-06-20T06:51:51Z) - Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z) - Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Individualized and Global Feature Attributions for Gradient Boosted Trees in the Presence of $\ell_2$ Regularization [0.0]
We propose Prediction Decomposition (PreDecomp), a novel individualized feature attribution for boosted trees when they are trained with $\ell_2$ regularization.
We also propose TreeInner, a family of debiased global feature attributions defined in terms of the inner product between any individualized feature attribution and labels on out-sample data for each tree.
arXiv Detail & Related papers (2022-11-08T17:56:22Z) - Data-driven advice for interpreting local and global model predictions in bioinformatics problems [17.685881417954782]
Conditional feature contributions (CFCs) provide local, case-by-case explanations of a prediction.
We compare the explanations computed by both methods on a set of 164 publicly available classification problems.
For random forests, we find extremely high similarities and correlations of both local and global SHAP values and CFC scores.
arXiv Detail & Related papers (2021-08-13T12:41:39Z) - Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation [58.80806716024701]
We study the global structure of attention scores computed using dot-product based self-attention.
We find that most of the variation among attention scores lie in a low-dimensional eigenspace.
We propose to compute scores only for a partial subset of token pairs, and use them to estimate scores for the remaining pairs.
arXiv Detail & Related papers (2021-06-16T14:38:42Z) - Optimal trees selection for classification via out-of-bag assessment and sub-bagging [0.0]
The predictive performance of tree based machine learning methods, in general, improves with a decreasing rate as the size of training data increases.
We investigate this in optimal trees ensemble (OTE) where the method fails to learn from some of the training observations due to internal validation.
Modified tree selection methods are thus proposed for OTE to cater for the loss of training observations in internal validation.
arXiv Detail & Related papers (2020-12-30T19:44:11Z) - JSRT: James-Stein Regression Tree [55.2059664267247]
Regression tree (RT) has been widely used in machine learning and data mining community.
In practice, the performance of RT relies heavily on the local mean of samples from an individual node during the tree construction/prediction stage.
We propose a novel regression tree, named James-Stein Regression Tree (JSRT) by considering global information from different nodes.
arXiv Detail & Related papers (2020-10-18T16:28:49Z) - A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust computationally-efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z) - Guided Random Forest and its application to data approximation [2.6149031653868993]
We extend the idea of building oblique decision trees with localized partitioning to obtain a global partitioning.
We empirically demonstrate that global partitioning reduces the generalization error bound.
Results on 115 benchmark datasets show that GRAF yields comparable or better results on a majority of datasets.
arXiv Detail & Related papers (2019-09-02T10:50:14Z) - Fréchet random forests for metric space valued regression with non-Euclidean predictors [0.0]
We introduce Fréchet trees and Fréchet random forests, which make it possible to handle data whose input and output variables take values in general metric spaces.
A consistency theorem for the Fréchet regressogram predictor using data-driven partitions is given and applied to Fréchet purely uniformly random trees.
arXiv Detail & Related papers (2019-06-04T22:07:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.