Attention and Self-Attention in Random Forests
- URL: http://arxiv.org/abs/2207.04293v1
- Date: Sat, 9 Jul 2022 16:15:53 GMT
- Title: Attention and Self-Attention in Random Forests
- Authors: Lev V. Utkin and Andrei V. Konstantinov
- Abstract summary: New models of random forests jointly using the attention and self-attention mechanisms are proposed.
The self-attention aims to capture dependencies of the tree predictions and to remove noise or anomalous predictions in the random forest.
- Score: 5.482532589225552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: New models of random forests jointly using the attention and self-attention
mechanisms are proposed for solving the regression problem. The models can be
regarded as extensions of the attention-based random forest whose idea stems
from applying a combination of Nadaraya-Watson kernel regression and
Huber's contamination model to random forests. The self-attention aims to
capture dependencies of the tree predictions and to remove noise or anomalous
predictions in the random forest. The self-attention module is trained jointly
with the attention module for computing weights. It is shown that the training
process of attention weights is reduced to solving a single quadratic or linear
optimization problem. Three modifications of the general approach are proposed
and compared. A specific multi-head self-attention for the random forest is
also considered. Heads of the self-attention are obtained by changing its
tuning parameters including the kernel parameters and the contamination
parameter of models. Numerical experiments with various datasets illustrate the
proposed models and show that supplementing the attention mechanism with
self-attention improves the model performance on many datasets.
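The abstract describes two mechanisms: Nadaraya-Watson attention over the trees, mixed with trainable weights through Huber's contamination model, and self-attention over the individual tree predictions. The following numpy sketch illustrates that structure under explicit assumptions (leaf means as tree keys, a Gaussian kernel, a fixed contamination parameter eps, and untrained mixing weights w); the paper instead learns the weights by solving a single quadratic or linear optimization problem, so this should be read as an illustration, not the authors' implementation.

```python
# A minimal sketch, assuming leaf means as tree keys, a Gaussian kernel, a fixed
# contamination parameter eps and untrained mixing weights w. It is NOT the
# authors' implementation: the paper learns the weights via quadratic or linear
# optimization and uses a specific self-attention parameterization.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
trees = forest.estimators_
train_leaves = forest.apply(X)                 # leaf id of every training point per tree

def tree_keys_for(x):
    """Key of tree t for instance x: mean training vector of the leaf x falls into."""
    leaf_ids = forest.apply(x.reshape(1, -1))[0]
    return np.stack([X[train_leaves[:, t] == leaf_ids[t]].mean(axis=0)
                     for t in range(len(trees))])

def predict(x, eps=0.2, tau=1.0, w=None):
    x = x.reshape(1, -1)
    keys = tree_keys_for(x[0])                               # (n_trees, d)
    preds = np.array([t.predict(x)[0] for t in trees])       # per-tree predictions
    # Self-attention across the tree predictions: each tree attends to similar trees,
    # which can smooth out noisy or anomalous individual predictions.
    s = keys @ keys.T / np.sqrt(keys.shape[1])
    A = np.exp(s - s.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    smoothed = A @ preds
    # Attention over the trees: Nadaraya-Watson weights contaminated with trainable w.
    if w is None:
        w = np.full(len(trees), 1.0 / len(trees))            # trainable in the paper
    dists = ((keys - x) ** 2).sum(axis=1)
    k = np.exp(-(dists - dists.min()) / tau)                 # Gaussian kernel scores
    alpha = (1.0 - eps) * k / k.sum() + eps * w              # Huber-style mixture
    return float(alpha @ smoothed)

print(predict(X[0]), forest.predict(X[:1])[0])
```

Both modules in the sketch act only on per-tree quantities (keys and predictions), so the fitted forest itself is left untouched; only the aggregation on top of it changes, which is consistent with the abstract's claim that training reduces to a single optimization problem over the weights.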
Related papers
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
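A small numpy sketch of that idea, with the dimensions, the rank r, and the zero initialization of the B factors chosen purely for illustration: one shared matrix stands in for the frozen pre-trained attention projection, and each ensemble member owns only its low-rank update.

```python
# A toy sketch, not the LoRA-Ensemble code: shared frozen projection weights plus
# member-specific low-rank updates A_m @ B_m; all sizes here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, r, n_members = 64, 4, 5

W_shared = rng.normal(scale=0.02, size=(d, d))            # frozen, shared by all members
lora_A = [rng.normal(scale=0.02, size=(d, r)) for _ in range(n_members)]
lora_B = [np.zeros((r, d)) for _ in range(n_members)]     # trained per member (zero init)

def member_projection(x, m):
    """Attention projection of ensemble member m on token matrix x."""
    return x @ (W_shared + lora_A[m] @ lora_B[m])

x = rng.normal(size=(10, d))                               # 10 tokens of width d
members = np.stack([member_projection(x, m) for m in range(n_members)])
print(members.mean(axis=0).shape)                          # ensemble average of the members
```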
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO dataset.
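For context, a generic binary focal loss of the kind mentioned above; the value of gamma and the binary formulation are illustrative assumptions rather than the exact variant the paper tailors to the MECCANO dataset.

```python
# A generic binary focal loss sketch (gamma and the binary setting are assumptions).
import numpy as np

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Cross-entropy down-weighted for well-classified (easy) examples."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)        # predicted probability of the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

p = np.array([0.9, 0.2, 0.7])   # predicted probabilities of the positive class
y = np.array([1, 1, 0])         # labels
print(focal_loss(p, y))
```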
arXiv Detail & Related papers (2023-08-10T08:43:20Z) - Neural Attention Forests: Transformer-Based Forest Improvement [4.129225533930966]
The main idea behind the proposed NAF model is to introduce the attention mechanism into the random forest.
In contrast to the available models like the attention-based random forest, the attention weights and the Nadaraya-Watson regression are represented in the form of neural networks.
The combination of the random forest and neural networks implementing the attention mechanism forms a transformer for enhancing the forest predictions.
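A minimal sketch of the stated idea of expressing the attention weights as a neural network: a tiny MLP scores each (instance, tree-key) pair and a softmax turns the scores into weights over the tree predictions. The key vectors, network size, and random untrained parameters are assumptions, not the NAF architecture itself.

```python
# A sketch with assumed shapes and untrained parameters; not the NAF model itself.
import numpy as np

rng = np.random.default_rng(0)
d, n_trees, h = 5, 10, 16

W1 = rng.normal(scale=0.1, size=(2 * d, h))   # scoring MLP, hidden layer
W2 = rng.normal(scale=0.1, size=(h, 1))       # scoring MLP, output layer

def neural_attention(x, tree_keys, tree_preds):
    """Weight tree predictions with scores from a learnable network instead of a fixed kernel."""
    pairs = np.concatenate([np.tile(x, (n_trees, 1)), tree_keys], axis=1)
    scores = np.tanh(pairs @ W1) @ W2                      # one score per tree
    weights = np.exp(scores - scores.max())
    weights = (weights / weights.sum()).ravel()
    return float(weights @ tree_preds)

x = rng.normal(size=d)                        # query instance
tree_keys = rng.normal(size=(n_trees, d))     # e.g. leaf-mean vectors of each tree
tree_preds = rng.normal(size=n_trees)         # individual tree predictions
print(neural_attention(x, tree_keys, tree_preds))
```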
arXiv Detail & Related papers (2023-04-12T17:01:38Z) - Improved Anomaly Detection by Using the Attention-Based Isolation Forest [4.640835690336653]
Attention-Based Isolation Forest (ABIForest) is proposed for solving the anomaly detection problem.
The main idea is to assign attention weights to each path of trees with learnable parameters depending on instances and trees themselves.
ABIForest can be viewed as the first modification of Isolation Forest that incorporates the attention mechanism in a simple way, without applying gradient-based algorithms.
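A sketch, under stated assumptions, of what attention-weighted path aggregation in an isolation forest could look like: per-tree path lengths are combined with contaminated kernel weights instead of a plain average. The keys, kernel, and normalizing constant are placeholders, and the weights are left untrained, so this illustrates the idea rather than ABIForest itself.

```python
# Illustrative only: keys, kernel, eps and the untrained weights w are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_trees, d = 25, 4

x = rng.normal(size=d)                               # instance to score
path_lengths = rng.uniform(2, 10, size=n_trees)      # h_t(x): path length of x in tree t
tree_keys = rng.normal(size=(n_trees, d))            # per-tree representation of x's path

def attention_score(x, tree_keys, path_lengths, tau=1.0, eps=0.3, w=None):
    """Anomaly score from an attention-weighted mean path length."""
    if w is None:
        w = np.full(n_trees, 1.0 / n_trees)          # learnable weights (untrained here)
    dists = ((tree_keys - x) ** 2).sum(axis=1)
    kernel = np.exp(-(dists - dists.min()) / tau)
    alpha = (1 - eps) * kernel / kernel.sum() + eps * w   # contaminated attention weights
    mean_path = alpha @ path_lengths
    c = 2.0 * (np.log(255) + 0.5772) - 2.0           # approx. normalizer for 256-sample trees
    return 2.0 ** (-mean_path / c)                   # standard isolation-forest score

print(attention_score(x, tree_keys, path_lengths))
```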
arXiv Detail & Related papers (2022-10-05T20:58:57Z) - AGBoost: Attention-based Modification of Gradient Boosting Machine [0.0]
A new attention-based model for the gradient boosting machine (GBM) called AGBoost is proposed for solving regression problems.
The main idea behind the proposed AGBoost model is to assign attention weights with trainable parameters to iterations of GBM.
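A rough sketch of weighting the iterations of a gradient boosting machine, assuming squared-error loss so that the prediction decomposes into per-iteration tree contributions. The softmax over the magnitude of each iteration's correction is purely illustrative; AGBoost attaches trainable attention weights to the iterations instead.

```python
# Illustration only: the softmax weighting and the rescaling by the number of
# iterations are assumptions, not the trainable weights AGBoost actually learns.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)
gbm = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1,
                                random_state=0).fit(X, y)

def attention_gbm_predict(x, gbm, tau=1.0):
    """Reweight per-iteration tree contributions with instance-dependent weights."""
    x = x.reshape(1, -1)
    base = gbm.init_.predict(x)[0]                            # initial constant prediction
    contribs = np.array([gbm.learning_rate * est[0].predict(x)[0]
                         for est in gbm.estimators_])         # one tree per iteration
    scores = np.abs(contribs) / tau                           # illustrative "attention" scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    # Rescale by the number of iterations so the weighted sum stays comparable
    # to the ordinary (unweighted) boosting sum.
    return base + len(contribs) * float(alpha @ contribs)

print(attention_gbm_predict(X[0], gbm), gbm.predict(X[:1])[0])
```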
arXiv Detail & Related papers (2022-07-12T17:42:20Z) - An Approximation Method for Fitted Random Forests [0.0]
We study methods that approximate each fitted tree in the Random Forests model using the multinomial allocation of the data points to the leaves.
Specifically, we begin by studying whether fitting a multinomial logistic regression helps reduce the size while preserving the prediction quality.
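One way to read the summary above, sketched under assumptions: replace a fitted tree's routing logic with a multinomial logistic regression that predicts which leaf an instance lands in, then answer with that leaf's mean target. Whether this matches the paper's exact construction cannot be verified from the abstract alone.

```python
# A sketch of one possible reading of the abstract, not the paper's exact method.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=0)
tree = DecisionTreeRegressor(max_leaf_nodes=8, random_state=0).fit(X, y)

leaf_ids = tree.apply(X)                                      # leaf index of each point
leaf_values = {leaf: y[leaf_ids == leaf].mean() for leaf in np.unique(leaf_ids)}

# Multinomial logistic regression approximating the tree's allocation of points to leaves
allocator = LogisticRegression(max_iter=1000).fit(X, leaf_ids)

def approx_predict(Xq):
    """Predict with the leaf mean of the leaf the logistic model assigns."""
    return np.array([leaf_values[leaf] for leaf in allocator.predict(Xq)])

print(approx_predict(X[:3]), tree.predict(X[:3]))
```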
arXiv Detail & Related papers (2022-07-05T17:28:52Z) - Attention-based Random Forest and Contamination Model [5.482532589225552]
The main idea behind the proposed ABRF models is to assign attention weights with trainable parameters to decision trees in a specific way.
The weights depend on the distance between an instance that falls into a corresponding leaf of a tree and the instances that fall into the same leaf.
arXiv Detail & Related papers (2022-01-08T19:35:57Z) - SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
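A generic sketch of a differentiable attention mask in this spirit; the sigmoid relaxation and the sparsity term are assumptions rather than the DAM algorithm itself: learnable logits gate each attention position before the softmax.

```python
# Generic sketch: the sigmoid gate and mean-gate penalty are assumptions, not DAM.
import numpy as np

rng = np.random.default_rng(0)
n = 6                                         # sequence length
scores = rng.normal(size=(n, n))              # raw attention scores, e.g. QK^T / sqrt(d)
mask_logits = rng.normal(size=(n, n))         # learnable logits, one per attention position

def masked_attention(scores, mask_logits):
    """Apply a soft, differentiable mask to the scores before the softmax."""
    gate = 1.0 / (1.0 + np.exp(-mask_logits))             # sigmoid relaxation of a 0/1 mask
    masked = scores + np.log(gate + 1e-9)                 # gate near 0 suppresses a position
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)

gate = 1.0 / (1.0 + np.exp(-mask_logits))
sparsity_penalty = gate.mean()                # an L1-style term that pushes gates toward zero
print(masked_attention(scores, mask_logits).round(2), round(float(sparsity_penalty), 3))
```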
arXiv Detail & Related papers (2021-02-25T14:13:44Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
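A toy sketch of the two ingredients named above, a per-time-stamp Gaussian whose mean and variance come from a (here trivially linear) network, plus a smoothness penalty on consecutive means; it omits the sequential VAE machinery entirely, so it only illustrates the shape of the loss, not the SISVAE model.

```python
# Toy illustration: the linear "network" and the penalty weight lam are assumptions.
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 1
x = np.sin(np.linspace(0, 6, T)).reshape(T, d) + 0.1 * rng.normal(size=(T, d))

W_mu = rng.normal(scale=0.1, size=(d, d))        # stand-in for the mean network
W_logvar = rng.normal(scale=0.1, size=(d, d))    # stand-in for the variance network

mu = x @ W_mu                                    # per-time-stamp mean
log_var = x @ W_logvar                           # per-time-stamp log-variance

def loss(x, mu, log_var, lam=1.0):
    """Gaussian negative log-likelihood plus a smoothness-inducing penalty."""
    nll = 0.5 * (log_var + (x - mu) ** 2 / np.exp(log_var)).sum()
    smooth = ((mu[1:] - mu[:-1]) ** 2).sum()     # penalize jumps between consecutive means
    return float(nll + lam * smooth)

print(loss(x, mu, log_var))
```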
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
arXiv Detail & Related papers (2020-10-20T20:30:55Z) - Censored Quantile Regression Forest [81.9098291337097]
We develop a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring.
The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of the time-to-event without any parametric modeling assumption.
arXiv Detail & Related papers (2020-01-08T23:20:23Z)